PDF Java Toolkit

Release Notes, PDF Java Toolkit

PDF Java Toolkit version 8.15.1 (June 7, 2019)
Problems fixed:

  • Resolved an issue where Unicode text with an embedded font did not extract correctly.
  • Resolved inconsistency of rendered page between PDF Java Toolkit and Adobe Acrobat.
  • Resolved an issue where text in alternate resources did not extract correctly.

PDF Java Toolkit version 8.15 (March 6, 2019)
Problems fixed:

  • In the ReadingOrderTextExtractor, the threshold for spaces between words was reduced from 0.5 of a space width to 0.4 of a space width, and the start of a new paragraph or block no longer has an extra space.
  • When converting images in ARGBImage, the system now honors the Decode array in the image. This ensures that certain black-and-white documents are created with the correct polarity (not negative).
  • In the ReadingOrderTextExtractor, the detection of overstrike characters was improved when the overstriking text has a slightly different baseline than the overstruck text. This can be disabled by setting the TextExtractionOptions variable enhancedOverstrikeProcessing to false.
  • In the ReadingOrderTextExtractor, the software now properly ignores overstruck text that is marked /Artifact in a marked content block.
  • Text rendering now accounts for the horizontal scaling (Tz). This fixes a problem where each character was flipped if the text state was set up with something like /TT0 -12 Tf -100 Tz 1 0 0 -1 36 720 Tm. Note the negative font size and negative horizontal scaling.
  • The balancePages(int maxNodes) feature can now take an optional number of children per node. The topmost node may have a few more or fewer entries than that value, but interior nodes will be limited to maxNodes. This can be used to alter the balancing of the page tree. balancePages() works as before, using a value of 5 for maxNodes.
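
As a rough illustration of the ReadingOrderTextExtractor spacing change at the top of this list, the sketch below inserts a space between two adjacent text runs only when the horizontal gap between them exceeds a fraction of the space width. The class WordGapSketch, its join method, the sample coordinates, and the strictly-greater-than comparison are assumptions made for illustration; they are not part of the PDF Java Toolkit API.

```java
final class WordGapSketch {
    // Joins text runs from left to right. A space is inserted between two
    // runs only when the gap between them is larger than
    // threshold * spaceWidth. No space is added before the first run.
    static String join(String[] texts, double[] xStart, double[] xEnd,
                       double spaceWidth, double threshold) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < texts.length; i++) {
            if (i > 0) {
                double gap = xStart[i] - xEnd[i - 1];
                if (gap > threshold * spaceWidth) {
                    out.append(' ');
                }
            }
            out.append(texts[i]);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] texts = {"PDF", "Java"};
        double[] xStart = {0.0, 31.1};
        double[] xEnd = {30.0, 55.0};
        // The gap is about 1.1 units and the space width is 2.5 units,
        // so the gap is roughly 0.44 of a space.
        System.out.println(join(texts, xStart, xEnd, 2.5, 0.5)); // PDFJava
        System.out.println(join(texts, xStart, xEnd, 2.5, 0.4)); // PDF Java
    }
}
```

With these sample values, a gap of about 0.44 of a space width is treated as a word break under the 0.4 threshold but not under the earlier 0.5 threshold.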

PDF Java Toolkit version 8.14.1 (September 5, 2018)
Problems fixed:

  • The NeedsRendering flag is now removed from documents where static XFA content has been flattened. See Table 28 of section 7.7 of the ISO 32000 reference, page 75.

PDF Java Toolkit version 8.14.0 (April 5, 2018)
Problems fixed:

  • Updated the PDF Java Toolkit word position retrieval algorithms to better match output from Adobe Acrobat.

PDF Java Toolkit version 8.13.1 (February 6, 2018)
Problems fixed:

  • Corrected a problem where a NullPointerException was thrown when using the PMMService OverlayPages with a source page that has a non-zero rotation.
  • Corrected an issue that caused the content of certain PDF documents to be shifted when rendered.

PDF Java Toolkit version 8.13.0 (October 16, 2017)
Problems fixed:

  • Resolved an issue with writing JPEG (DCT) compressed images, with color spaces based on ICC profiles, to PDF files.
  • Corrected a problem with rearranging pages in a PDF document using the movepage method in the PageTree, as it caused exceptions to be raised in certain circumstances.
  • Improved the internal structure of output PDF documents generated by PDF Java Toolkit to allow for faster content extraction by PDFJT and by other systems.
  • Updated the RenderPDF sample program to allow it to support writing PDFs with JPEG (DCT) compressed images properly.

PDF Java Toolkit version 8.10.1 (August 17, 2017)
Problems fixed:

  • ContentTextItem now has a pair of Unicodes properties. The parameter lists the Unicode characters in the text.
  • If an embeddable version of a Base 14 font is supplied in the PDFFontSet, such as a client providing a Helvetica or Times font, PDF Java Toolkit will use that font for embedding and subsetting. The metrics will be taken from the internal version of the Base 14 font for consistency, but the subsetted glyphs will come from the supplied font.
  • XFDFService now accepts XFDF files that have attributes on the field element that are not listed in the specification. Such attributes are ignored and allow PDF Java Toolkit to import a wider selection of XFDF files. XFDF files are an XML version of the Forms Document Format (FDF), a text export format from PDF documents. XFDF files are associated with Adobe Acrobat.
  • When generating the appearances for a form field, document-level JavaScript scripts are now run during formatting only if the field actually has a format script.
  • When reporting unsupported JavaScript APIs, the original exception now states what function, on which class, is unsupported.
  • Fixed a bug where word breaks are not found in some cases for subsetted fonts that do not contain a space character.
  • Added the removeMetatdata() method to the PDFDocument class. This method takes the existing metadata on a document and resets it as if the document had just been created. It is recommended to follow up this call with a full save to prevent forensic discovery.

PDF Java Toolkit version 8.9.0 (July 27, 2017)
Problems fixed:

  • If one of the standard 14 fonts (see PDF 32000-1:2008), such as Helvetica, exists in a font set, it won’t be replaced by the built-in version of the font. This means that if you supply a font that can be embedded, annotation appearances will subset it as a Type 0 font.
  • Introduced API in SignatureAppearanceOptions for customizing signature labels. The default label is “Digitally signed by {0}”, where {0} represents a name stored in UserInfo.
  • Corrected a problem with NullPointerExceptions where no InteriorColor was supplied for a PDFAnnotationRedaction.
  • AppearanceService now generates rollover appearances for redaction annotations containing overlay text.
  • Added a method call to PDFFontTagRegistry to ensure that a font has a unique subset tag.
  • Some low-level APIs in PDFFontUtils were deprecated. If your code used these APIs, please follow the deprecation warnings to find the correct calls on PDFFontTagRegistry.
  • PDFOpenOptions has a new flag called DisableFontTagRepair. This flag allows for the enabling and disabling of repairs done to subset font tags.
  • Added PDFResourceIterator, which finds all the resources in the document.
  • Corrected problems converting some numbers just less than a power of 10 to strings. This fixes some issues with creating ASNumber objects as well as creation of content streams.
  • The setTabOrder() method under the PDFPage class now adds the Tabs entry if one is not already present.
  • Fixed a crash when rendering certain kinds of inline images.
  • Fixed a problem where images rotated 90 degrees were skewed more than 90 degrees.
  • Ensured that very thin images, which would cover less than a device pixel, nevertheless are visible in the output rendered image.
  • Introduced InflaterCountingStreamWrapper as a superclass of TIFFCountingStreamWrapper.
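
The {0} placeholder in the default signature label above follows the same syntax as a java.text.MessageFormat pattern. The sketch below shows how such a pattern expands with a signer name. Whether PDF Java Toolkit uses MessageFormat internally is an assumption, and SignatureLabelSketch and its render method are hypothetical names, not part of SignatureAppearanceOptions.

```java
import java.text.MessageFormat;

final class SignatureLabelSketch {
    // Default label pattern quoted in the release note above.
    static final String DEFAULT_LABEL = "Digitally signed by {0}";

    // Substitutes the signer name for {0} in the given label pattern.
    static String render(String labelPattern, String signerName) {
        return MessageFormat.format(labelPattern, signerName);
    }

    public static void main(String[] args) {
        System.out.println(render(DEFAULT_LABEL, "Jane Doe"));     // Digitally signed by Jane Doe
        System.out.println(render("Approved by {0}", "Jane Doe")); // Approved by Jane Doe
    }
}
```

If a custom label is processed as a MessageFormat pattern, a literal apostrophe in the label needs to be doubled, because a single apostrophe starts a quoted section in that syntax.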

PDF Java Toolkit version 8.7.0 (May 22, 2017)
Problems fixed:

  • Added the getFormattedValue() method under the PDFField class, which runs the Format script on the field and returns the result as a string.
  • The runFormatScripts() method under the FormFieldManager interface is now deprecated.
  • The JavaScriptHandler.execute() methods are now deprecated. Please see JavaScriptHandler to learn more.

PDF Java Toolkit version 8.5.0 (March 13, 2017)
Problems fixed:

  • Calculation scripts are now run automatically when importing data with the FDFService or the XFDFService.
  • PDFAuditor no longer causes a NullPointer exception for a page that has no contents.
  • Appearances for the standard set of rubber stamp annotations described in the ISO 32000 specification are now automatically generated by calling generateAppearances() under the AppearanceService class.
  • PDFCIDFontWidths now works correctly if the font came from a MacOS resource file.
  • The disableFontOperations method of the PDFFontListener class now sends a FLUSH_FONTS message to all DocumentListener objects so that they know that the fonts are being flushed out.
  • Pages that contain many axial shadings now render more quickly.
  • When rendering, conversions from complicated color spaces to RGB may be faster as well.
  • The getDirtyCount() method of the CosObject class now returns the number of times that a CosObject has been modified; this can be used to test the validity of associated cached data.
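
A modification counter such as the one returned by getDirtyCount() is commonly used to invalidate cached data: the cache remembers the counter value it saw when it last computed its result and recomputes only when the counter has moved. The sketch below shows that pattern with hypothetical stand-in classes (Tracked and CachedLength); it does not use CosObject or any other PDF Java Toolkit class.

```java
final class DirtyCountCacheSketch {
    /** Hypothetical stand-in for an object that counts its own modifications. */
    static final class Tracked {
        private String value;
        private int dirtyCount;

        Tracked(String value) { this.value = value; }

        void setValue(String value) {
            this.value = value;
            dirtyCount++; // every modification bumps the counter
        }

        String getValue() { return value; }
        int getDirtyCount() { return dirtyCount; }
    }

    /** Caches a derived value and recomputes it only after the source changes. */
    static final class CachedLength {
        private final Tracked source;
        private int seenDirtyCount = -1; // no result cached yet
        private int cachedLength;
        int recomputations;

        CachedLength(Tracked source) { this.source = source; }

        int get() {
            if (seenDirtyCount != source.getDirtyCount()) {
                cachedLength = source.getValue().length();
                seenDirtyCount = source.getDirtyCount();
                recomputations++;
            }
            return cachedLength;
        }
    }

    public static void main(String[] args) {
        Tracked tracked = new Tracked("abc");
        CachedLength cache = new CachedLength(tracked);
        cache.get();
        cache.get();                              // served from the cache
        tracked.setValue("hello");
        System.out.println(cache.get());          // 5
        System.out.println(cache.recomputations); // 2
    }
}
```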

PDF Java Toolkit version 8.3.0 (February 6, 2017)

  • Added the ability to execute JavaScript when importing data into forms using the FDFService or the XFDFService. A PDFJavaScriptException will be thrown if running the JavaScript causes an error.
  • Improved rasterization of PDFs. Rendering of pages with a large number of image XObjects is now considerably faster, due to improved caching of color conversions.

PDF Java Toolkit version 8.2.0 (December 5, 2016)

  • In the Word class, getUarray() is deprecated in favor of getCharacters().
  • In the PDFCharacter class, there is a new getBoundingQuad() method to efficiently get the bounding quad of an individual character.
  • Updated manifest to ensure that PDF Java Toolkit can be loaded as an OSGi bundle.
  • Fixed a condition where page resources would be lost if they were added to an existing direct CosDictionary while adding new content to an existing page, using Talkeetna.

PDF Java Toolkit version 8.1.0 (October 17, 2016)

PDF Java Toolkit now requires Java 7 or later.

Datalogics merged in changes Adobe Systems provided for PDF Java Toolkit through May of 2016.

New features:

Errors corrected:

  • When executing calculate scripts, the event.value is not set into the field itself until after the script completes and only if event.rc is true.
  • Format scripts now more reliably receive the field value as a string, even if the value could have been parsed as a number.
  • Fixed an internal library configuration problem where certain Charset operations would fail on Mac OS X/macOS.
  • Fixed a problem where, when rasterized, rotated text sometimes had glyphs at an incorrect orientation.
  • Fixed a problem in rasterization where the stroke width of paths could be incorrect if the transformation matrix specified a rotation.
  • Fixed a problem where mixing simple and composite fonts in the same page caused an exception during conversion to PDF/A-2.
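
The calculate-script behavior described at the top of this list can be pictured as a two-step commit: the script works on an event object, and the field is updated only after the script has finished and only if event.rc is still true. The sketch below models that sequence in plain Java. The Event and Field classes and the runCalculate method are hypothetical stand-ins, not PDF Java Toolkit classes, and the script is represented by a Java lambda rather than real JavaScript.

```java
import java.util.function.Consumer;

final class CalculateCommitSketch {
    /** Hypothetical stand-in for the event object seen by a calculate script. */
    static final class Event {
        String value;
        boolean rc = true;
    }

    /** Hypothetical stand-in for a form field. */
    static final class Field {
        String value;

        Field(String value) { this.value = value; }
    }

    // Runs the script against a fresh event. The field keeps its old value
    // while the script runs and is updated afterwards only if rc is true.
    static void runCalculate(Field field, Consumer<Event> script) {
        Event event = new Event();
        event.value = field.value;
        script.accept(event);
        if (event.rc) {
            field.value = event.value;
        }
    }

    public static void main(String[] args) {
        Field total = new Field("1");
        runCalculate(total, e -> e.value = "42");
        System.out.println(total.value); // 42
        runCalculate(total, e -> { e.value = "rejected"; e.rc = false; });
        System.out.println(total.value); // 42
    }
}
```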

PDF Java Toolkit version 6.3.6 (August 29, 2016)

  • PDFAuditor now handles Form XObjects that lack a Resources dictionary. This had provoked a NullPointer exception.
  • PDFAuditor now handles Form XObjects that contain recursive references to themselves. Such references were produced by some versions of Microsoft Word and had provoked a StackOverflow exception.
  • PDFToRasterConverter no longer inappropriately crops the page when it is drawn at a smaller size (scaled down). This also improves the output of PageRasterizer.
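
A common way to walk self-referencing structures, such as the recursive Form XObjects mentioned above, without overflowing the stack is to keep a set of already visited nodes and to use an explicit work list instead of recursion. The sketch below shows that general technique on a plain map of names. It is not the PDFAuditor implementation, and the class name, method name, and the map-based model of Form XObject references are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RecursiveXObjectSketch {
    // forms maps a form name to the names of the forms it references.
    // Returns every form reachable from start, visiting each form once.
    static Set<String> reachable(Map<String, List<String>> forms, String start) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(start);
        while (!pending.isEmpty()) {
            String name = pending.pop();
            if (!visited.add(name)) {
                continue; // already processed, so a cycle cannot loop forever
            }
            List<String> children = forms.get(name);
            if (children == null) {
                children = Collections.<String>emptyList();
            }
            for (String child : children) {
                pending.push(child);
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> forms = new HashMap<>();
        forms.put("Fm0", Arrays.asList("Fm1", "Fm0")); // Fm0 references itself
        forms.put("Fm1", Arrays.asList("Fm0"));        // and Fm1 points back
        System.out.println(reachable(forms, "Fm0").size()); // 2
    }
}
```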

PDF Java Toolkit version 6.3.4 (July 11, 2016)

  • When trying to perform restricted operations on a password-protected PDF document without providing a password, the error now includes a more descriptive message.
  • The License4j dependency has been removed from the non-license-managed version of PDFJT.
  • PDFDocumentType now has documentation explaining each possible type.
  • The event.value setting is now a string when calling JavaScript format events to reflect the way field formatting works in Adobe Acrobat.
  • Flattening static XFA forms now removes the document JavaScript code that checks the version of Acrobat being used to open the PDF document. This eliminates a misleading warning message that the PDF is not supported when the document is opened in Adobe Acrobat.

PDF Java Toolkit version 6.3.3 (June 6, 2016)
Problems fixed:

  • During sanitization, images that replace content are now at least 144 DPI. This is the resolution of the images created by Acrobat Pro XI when it sanitizes documents.
  • Corrected errors in the message printed by the “–help” option in RELite.
  • RELite is now packaged as part of the customer deliverable rather than being provided separately. The evaluation certificate for RELite is no longer shipped as part of the customer deliverable. The documentation describes where to get the certificate.
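
To give a sense of what the 144 DPI minimum in the sanitization note above means for image size, the sketch below converts a page dimension in points to pixels, assuming the default PDF user space of 72 units per inch. SanitizeDpiSketch and pixelsFor are hypothetical names, and rounding up is an assumption; the toolkit may size replacement images differently.

```java
final class SanitizeDpiSketch {
    static final int MIN_DPI = 144;              // minimum resolution from the note above
    static final double POINTS_PER_INCH = 72.0;  // default PDF user space unit

    // Converts a length in points to a pixel count at the given resolution,
    // rounding up so that the image never falls below the requested DPI.
    static int pixelsFor(double lengthInPoints, int dpi) {
        return (int) Math.ceil(lengthInPoints * dpi / POINTS_PER_INCH);
    }

    public static void main(String[] args) {
        // A US Letter page is 612 x 792 points.
        System.out.println(pixelsFor(612, MIN_DPI)); // 1224
        System.out.println(pixelsFor(792, MIN_DPI)); // 1584
    }
}
```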

PDF Java Toolkit version 6.3.1 (May 16, 2016)

With this release from Datalogics, PDF Java Toolkit provides a new set of sample programs to replace the samples distributed with earlier versions of the product. The original samples for Talkeetna have been folded into these new samples.

  • Exports for com.adobe.agl.*, com.adobe.fontengine.*, com.adobe.internal.*, and com.adobe.xfa.* are now included with the OSGi package.
  • An exception will now be thrown when trying to convert a document with transparencies to PDF/A archive format.
  • The PDFDocument.hasTransparency() method can be used to detect transparencies in documents.
  • CosDocument can now retrieve the linearization dictionary with the getLinearizationDictionary() method.
  • Better descriptions are provided for:
    • DocumentListener
    • DocumentListenerProperties
    • DocumentListenerRegistery
    • DocumentMessage
    • DocumentMessage.MessageType
  • Improved documentation is provided for PDFXObject, LicenseManager, and PDFPageTreeNode.
  • Improved package overview for com/datalogics/pdf/text, under docs/talkeetna-docs. The content explains how the objects in the package relate to one another.
  • Added documentation for the Div.getDivs() method for Talkeetna.
  • Fixed a bug in license-managed RELite, which reported that the input file was not found when no license was present.
  • RELite interactive prompting now uses System.out, to make console messages clearer.

PDF Java Toolkit version 6.1.1 (April 11, 2016)

  • The Core PDF Java Toolkit is now OSGi compliant. The pdfjt.jar file provided with the software installation package is now an OSGi bundle.
  • The pdfjt-support jar file was replaced by Maven dependencies. Some of the contents for pdfjt-support.jar were added to the pdfjt jar file.
  • The applyRedaction method for the RedactionService class now throws a PDFUnsupportedFeature exception when attempting to redact a dynamic XFA document.
  • PDF content streams containing an unusually long numeric object no longer cause an ArrayIndexOutOfBounds exception.
  • When resources are included multiple times they will now be properly removed when freeing duplicate resources.
  • The ReadingOrderTextExtractor and the LayoutModeTextExtractor classes, and TextExtractor newInstance method, now throw a PDFUnsupportedFeature exception when attempting to work with a dynamic XFA document.
  • The setInteriorColor(double red, double green, double blue) method of the PDFAnnotationRedaction class is now deprecated and has been replaced with versions more consistent with similar methods in PDF Java Toolkit.
  • An exception will now be thrown when trying to rasterize a shell XFA file.
  • Fixed ParseOps.skipWhitespace() to return a single space rather than the last character of the stream.
  • Added ASRectangle.swapDims() to switch height and width dimensions of a given ASRectangle.
  • A more meaningful exception is now provided for garbled color spaces in content streams.
  • An exception will now be thrown when there is a problem extracting text. Previously, the text was lost without comment from the system.
  • Excluded org.apache.commons.collections4.iterators from OSGi imports.
  • The impl classes and packages were added to the Javadoc API Reference content.
  • Improved ability to generate PDF/A-1b output files with the product. The new PDFAConversionOptionsFactory class has a method that returns a PDFAConversionOptions object that conforms to the PDF/A-1b standard. Also, the software now provides a PDF/A-1b conversion handler with default document conversion and validation processing settings. You no longer need to provide your own handler when converting a PDF document to PDF/A-1b.
  • ARGBImage#ConvertToRGB() will now throw a more descriptive exception when it can’t properly read the color space data of an image.
  • The NeedAppearances key is removed from the form only once.
  • A signature field now displays the user name from the signatureOptions’ user info, if it is set.
  • A PDFUnsupportedFeatureException is now thrown when trying to rasterize a dynamic XFA document, rather than silently generating blank output.
  • A PDFUnsupportedFeatureException is now thrown when trying to flatten a dynamic XFA document, instead of a generic error message appearing in the output PDF.
  • Fixed an issue where PDFPage#nextPage would skip by two pages at a time.
  • Removed PDF user documentation files from installation package.
  • When creating a new instance of a PDFFontDescriptor, instead of estimating the ascent and descent lines based on the font’s bounding box, those values are now drawn from the font’s FontData.
  • A DCTDecode output filter has been added to the product. PDF Java Toolkit can now be used to save images with the Discrete Cosine Transform compression format, used for rendering photographs as JPG images.
  • Resampled images with DeviceCMYK color space are no longer inverted after resampling.
  • Added saveLinearAndClose() methods to DocumentHelper for Talkeetna.

PDF Java Toolkit version 4.7.0 (December 9, 2015)

  • Generating appearances with the AppearanceService class now supports generating bar codes with data drawn from XFDF form files for the PDF417 symbology, a standard bar code symbol format.
  • Internal improvements to the PMMService class to speed up inserting pages into a PDF.
  • Added a “runFormatScripts” method to the FormFieldManager class that executes the formatting scripts for an AcroForm PDF form document and returns the formatted data as a java.util.Map.
  • The Talkeetna Element class and its subclasses now have a constructor that takes a Style object.