PDF Java Toolkit

Release Notes, PDF Java Toolkit

Version 8.16.0 (August 7, 2019)
Problem Corrections:

  • Corrects an issue in which TrueType fonts with an encoding that has /Differences were not embedded and converted correctly during PDF/A-2b conversion.
  • Resolves an issue in which conversion to PDF/A-2b would fail because it incorrectly skipped routines that fix up documents containing TrueType fonts with an Encoding and the symbolic bit set on the Flags entry.
  • Resolves an issue where PDF/A-2b conversion would result in a NullPointerException if the document contained a non-symbolic TrueType font with no Encoding entry.
  • Resolves an issue where .notdef glyphs were not recognized in TrueType/OpenType fonts. They are now recognized if the glyph name is .notdef.
  • Adds a new option to PDFA2ConversionOptions. The setReplaceNotdefWithSpace(boolean) method can be used to cause the conversion to replace .notdef characters with spaces, similar to what Acrobat and Preflight do. Defaults to false.
  • Adds a new conversion handler hook ‘notdefGlyphSubstitutedWithSpace’ that reports when substitution happens and gives an opportunity to discontinue processing.
  • Resolves an issue where PDF/A-2b conversion would throw a NullPointerException if a TrueType font didn’t have a PostScript font name. It now falls back to the name of the font in the PDF file.
  • PDF/A-2b now validates Type 0 TrueType fonts (CIDFontType2).
  • When embedding a TrueType font as a Type 0 font during PDF/A-2b conversion, the emitted ToUnicode table is now compatible with Acrobat and results in correct text extraction.
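
The .notdef replacement and its reporting hook described above can be pictured with a small, self-contained sketch. Everything here is an assumption made for illustration: the class, the Handler interface, and the use of plain glyph-name strings are hypothetical stand-ins, not the toolkit's own types. Only the hook name and the idea of a boolean option come from the notes above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of these types belong to PDF Java Toolkit.
final class NotdefSubstitutionSketch {

    /** Hypothetical stand-in for a conversion handler hook. */
    interface Handler {
        /** Called after each substitution; return false to stop processing. */
        boolean notdefGlyphSubstitutedWithSpace(int glyphIndex);
    }

    /**
     * Copies glyph names, replacing ".notdef" with "space" when the option is on.
     * Stops early if the handler asks to discontinue.
     */
    static List<String> replaceNotdef(List<String> glyphNames,
                                      boolean replaceWithSpace,
                                      Handler handler) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < glyphNames.size(); i++) {
            String name = glyphNames.get(i);
            if (replaceWithSpace && ".notdef".equals(name)) {
                out.add("space");
                if (!handler.notdefGlyphSubstitutedWithSpace(i)) {
                    break; // handler chose to discontinue processing
                }
            } else {
                out.add(name); // option off, or an ordinary glyph
            }
        }
        return out;
    }
}
```

With the option off (the default noted above), the input passes through unchanged.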

Version 8.15.1 (June 7, 2019)
Problem Corrections:

  • Resolves an issue where Unicode text with an embedded font did not extract correctly.
  • Resolves an inconsistency in rendered pages between PDF Java Toolkit and Adobe Acrobat.
  • Resolves an issue where text in alternate resources did not extract correctly.

Version 8.15 (March 6, 2019)
Enhancements:

  • Reduces the threshold for spaces between words in the ReadingOrderTextExtractor from 0.5 of a space width to 0.4 of a space width. The start of a new paragraph or block no longer has an extra space.
  • When converting images in ARGBImage, the system now honors the Decode array in the image. This ensures that certain black-and-white documents are created with the correct polarity (not negative).
  • Improves the detection of overstrike characters in the ReadingOrderTextExtractor when the overstriking text has a slightly different baseline than the overstruck text. This can be disabled by setting the TextExtractionOptions variable enhancedOverstrikeProcessing to false.
  • Updates the ReadingOrderTextExtractor to properly ignore overstruck text that is marked /Artifact in a marked content block.
  • Updates to account for the horizontal scaling (Tz) when rendering text. This fixes a problem with content having each character flipped if the text state is set up something like /TT0 -12 Tf -100 Tz 1 0 0 -1 36 720 Tm. Note the negative font size and negative horizontal scaling.
  • The balancePages(int maxNodes) feature can now take an optional number of children per node. The topmost node may have a few more or less entries than that value, but interior nodes will be limited to maxNodes. This can be used to alter the balancing of the page tree. balancePages() works as before, using a value of 5 for maxNodes.
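
The effect of the horizontal scaling (Tz) item above can be checked by hand with the text rendering matrix from ISO 32000-1, section 9.4.4: [Tfs x Th, 0, 0, Tfs, 0, Trise] x Tm x CTM, where Th is Tz divided by 100. The sketch below is a standalone calculation, not toolkit code, and the class and method names are made up for illustration. For the text state quoted above, the two negative values cancel and the glyphs come out upright. If Tz is ignored, the horizontal component stays negative and each character is mirrored.

```java
// Illustrative sketch only; a hand calculation of the PDF text rendering matrix.
final class TextRenderingMatrixSketch {

    /** Multiplies two PDF matrices written as [a b c d e f]; returns m1 x m2. */
    static double[] multiply(double[] m1, double[] m2) {
        return new double[] {
            m1[0] * m2[0] + m1[1] * m2[2],
            m1[0] * m2[1] + m1[1] * m2[3],
            m1[2] * m2[0] + m1[3] * m2[2],
            m1[2] * m2[1] + m1[3] * m2[3],
            m1[4] * m2[0] + m1[5] * m2[2] + m2[4],
            m1[4] * m2[1] + m1[5] * m2[3] + m2[5]
        };
    }

    /** Builds the text rendering matrix from font size, Tz (percent), rise, Tm, and CTM. */
    static double[] textRenderingMatrix(double fontSize, double tz, double rise,
                                        double[] tm, double[] ctm) {
        double th = tz / 100.0; // horizontal scaling as a fraction
        double[] textState = { fontSize * th, 0, 0, fontSize, 0, rise };
        return multiply(multiply(textState, tm), ctm);
    }

    public static void main(String[] args) {
        double[] identity = { 1, 0, 0, 1, 0, 0 };
        double[] tm = { 1, 0, 0, -1, 36, 720 };
        // /TT0 -12 Tf -100 Tz 1 0 0 -1 36 720 Tm
        double[] m = textRenderingMatrix(-12, -100, 0, tm, identity);
        System.out.println(java.util.Arrays.toString(m));
    }
}
```

With Tz honored, the horizontal and vertical scale factors are both 12 and the origin is (36, 720). Passing 100 for Tz instead gives a horizontal factor of -12, which is the flipped result the note describes.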

Version 8.14.1 (September 5, 2018)
Enhancements:

  • Removes the NeedsRendering flag from documents where static XFA content has been flattened. See Table 28 of section 7.7 of the ISO 32000 reference, page 75.

Version 8.14.0 (April 5, 2018)
Enhancements:

  • Updates the PDF Java Toolkit word position retrieval algorithms to better match output from Adobe Acrobat.

Version 8.13.1 (February 6, 2018)
Problem Corrections:

  • Corrects a problem where a NullPointerException was thrown when using the PMMService OverlayPages with a source page with non-zero rotation.
  • Corrects an issue that caused the content of certain PDF documents to be shifted when rendered.

Version 8.13.0 (October 16, 2017)
Problem Corrections:

  • Resolves an issue with writing JPEG (DCT) compressed images, with color spaces based on ICC profiles, to PDF files.
  • Corrects a problem where rearranging pages in a PDF document using the movepage method in the PageTree raised exceptions in certain circumstances.
  • Improves the internal structure of output PDF documents generated by PDF Java Toolkit to allow for faster content extraction by PDFJT and by other systems.
  • Updates the RenderPDF sample program to allow it to support writing PDFs with JPEG (DCT) compressed images properly.

Version 8.10.1 (August 17, 2017)
Enhancements:

  • ContentTextItem now has a pair of Unicodes properties. These properties list the Unicode characters in the text.
  • If an embeddable version of a Base 14 font is supplied in the PDFFontSet, such as a client providing a Helvetica or Times font, PDF Java Toolkit now uses that font for embedding and subsetting. The metrics are taken from the internal version of the Base 14 font for consistency, but the subsetted glyphs come from the supplied font.
  • XFDFService now accepts XFDF files that have attributes on the field element that are not listed in the specification. Such attributes are ignored, which allows PDF Java Toolkit to import a wider selection of XFDF files. XFDF is an XML version of the Forms Data Format (FDF), a text export format for PDF form data. XFDF files are associated with Adobe Acrobat.
  • When generating the appearances for a form field, document-level JavaScript scripts are now run during formatting only if the field actually has a format script.
  • When reporting unsupported JavaScript APIs, the original exception now states what function, on which class, is unsupported.
  • Adds the removeMetatdata() method to the PDFDocument class. This method takes the existing metadata on a document and resets it as if the document had just been created. It is recommended to follow up this call with a full save to prevent forensic discovery.

Problem Corrections:

  • Fixes a bug where word breaks are not found in some cases for subsetted fonts that do not contain a space character.

Version 8.9.0 (July 27, 2017)

Enhancements:

  • If one of the standard 14 fonts (PDF 32000-1:2008 section 9.6.2.2), such as Helvetica, exists in a font set, it will no longer be replaced by the built-in version of the font. This means that if you supply a font that can be embedded, annotation appearances will subset it as a Type 0 font.
  • Introduces an API in SignatureAppearanceOptions for customizing signature labels. The default label is “Digitally signed by {0}”, where {0} represents a name stored in UserInfo.
  • Introduces InflaterCountingStreamWrapper as a superclass of TIFFCountingStreamWrapper.
  • AppearanceService now generates rollover appearances for redaction annotations containing overlay text.
  • Adds a method call to PDFFontTagRegistry to ensure that a font has a unique subset tag.
  • Some low-level APIs in PDFFontUtils are deprecated. If your code used these APIs, please follow the deprecation warnings to find the correct calls on PDFFontTagRegistry.
  • PDFOpenOptions has a new flag called DisableFontTagRepair. This flag allows for the enabling and disabling of repairs done to subset font tags.
  • Adds PDFResourceIterator, which finds all the resources in the document.
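
The {0} placeholder in the signature label item above looks like a java.text.MessageFormat pattern. Assuming that convention applies (the notes do not say how the toolkit performs the substitution), a custom label can be previewed with the standard library alone. The class name and the sample signer name are hypothetical.

```java
import java.text.MessageFormat;

// Illustrative sketch only; previews a label pattern with the JDK, not the toolkit.
final class SignatureLabelSketch {

    /** Fills the {0} placeholder in a label pattern with the signer's name. */
    static String label(String pattern, String signerName) {
        return MessageFormat.format(pattern, signerName);
    }

    public static void main(String[] args) {
        // Default label from the release note, then a customized one.
        System.out.println(label("Digitally signed by {0}", "Jane Doe")); // Digitally signed by Jane Doe
        System.out.println(label("Approved by {0}", "Jane Doe"));         // Approved by Jane Doe
    }
}
```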

Problem Corrections:

  • Corrects a problem with NullPointerExceptions where no InteriorColor was supplied for a PDFAnnotationRedaction.
  • Corrects problems converting some numbers just less than a power of 10 to strings. This fixes some issues with creating ASNumber objects as well as creation of content streams.
  • The setTabOrder() method under the PDFPage class now adds the Tabs entry if one is not already present.
  • Fixes a crash when rendering certain kinds of inline images.
  • Fixes a problem where images rotated 90 degrees were skewed more than 90 degrees.
  • Ensures that very thin images, which would cover less than a device pixel, nevertheless are visible in the output rendered image.

Version 8.7.0 (May 22, 2017)
Enhancements:

  • Adds the getFormattedValue() method under the PDFField class, to run the Format script on the field and return the result as a string.
  • Deprecates the runFormatScripts() method under the FormFieldManager interface.
  • Deprecates the JavaScriptHandler.execute() methods. Please see JavaScriptHandler to learn more.

Version 8.5.0 (March 13, 2017)

Enhancements:

  • Calculation scripts are now run automatically when importing data with the FDFService or the XFDFService.
  • Appearances for the standard set of rubber stamp annotations described in section 12.5.6.12 of the ISO 32000 specification are now automatically generated by calling generateAppearances() under the AppearanceService class.
  • Pages that contain many axial shadings now render more quickly.
  • When rendering, conversions from complicated color spaces to RGB may be faster as well.

Problem Corrections:

  • PDFAuditor no longer causes a NullPointerException for a page that has no content.
  • PDFCIDFontWidths now works correctly if the font came from a MacOS resource file.
  • The disableFontOperations method of the PDFFontListener class now sends a FLUSH_FONTS message to all DocumentListener objects so that they know that the fonts are being flushed out.
  • The getDirtyCount() method of the CosObject class now returns the number of times that a CosObject has been modified; this can be used to test the validity of associated cached data.
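
The getDirtyCount() item above mentions testing the validity of cached data. One way to use a modification counter for that is sketched below. TrackedObject is a hypothetical stand-in for an object that counts its own modifications. It is not CosObject, and the toolkit's counter may behave differently in detail.

```java
// Illustrative sketch only; TrackedObject is a made-up stand-in, not CosObject.
final class DirtyCountCacheSketch {

    /** An object that increments a counter every time it is modified. */
    static final class TrackedObject {
        private String value;
        private int dirtyCount;

        TrackedObject(String value) { this.value = value; }

        void setValue(String newValue) { value = newValue; dirtyCount++; }
        String getValue() { return value; }
        int getDirtyCount() { return dirtyCount; }
    }

    /** Caches a derived value and recomputes it only when the source has changed. */
    static final class CachedLength {
        private final TrackedObject source;
        private int cachedAtCount = -1; // no cached value yet
        private int cachedLength;
        int recomputations;

        CachedLength(TrackedObject source) { this.source = source; }

        int get() {
            if (cachedAtCount != source.getDirtyCount()) {
                cachedLength = source.getValue().length();
                cachedAtCount = source.getDirtyCount();
                recomputations++;
            }
            return cachedLength;
        }
    }
}
```

Comparing the stored count with the current one avoids recomputing derived data on every access while still noticing any modification made in between.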

Version 8.3.0 (February 6, 2017)

Enhancements:

  • Adds the ability to execute JavaScript when importing data into forms using the FDFService or the XFDFService. A PDFJavaScriptException will be thrown if running the JavaScript causes an error.
  • Improves rasterization of PDFs. Rendering of pages with a large number of image XObjects is now considerably faster, due to improved caching of color conversions.

Version 8.2.0 (December 5, 2016)

Enhancements:

  • Deprecates getUarrary() in the Word class in favor of getCharacters().
  • In the PDFCharacter class, there is a new getBoundingQuad() method to efficiently get the bounding quad of an individual character.
  • Updates manifest to ensure that PDF Java Toolkit can be loaded as an OSGi bundle.

Problem Corrections:

  • Fixes a condition where page resources would be lost if they were added to an existing direct CosDictionary while adding new content to an existing page, using Talkeetna.

Version 8.1.0 (October 17, 2016)

PDF Java Toolkit now requires Java 7 or later.

Datalogics merged in changes Adobe Systems provided for PDF Java Toolkit through May of 2016.

Problem Corrections:

  • When executing calculate scripts, the event.value is now not set into the field itself until after the script completes, and only if event.rc is true.
  • Format scripts now more reliably receive the field value as a string, even if the value could have been parsed as a number.
  • Fixes an internal library configuration problem where certain Charset operations would fail on Mac OS X/macOS.
  • Fixes a problem where, when rasterized, rotated text sometimes had glyphs at an incorrect orientation.
  • Fixes a problem in rasterization where the stroke width of paths could be incorrect if the transformation matrix specified a rotation.
  • Fixes a problem where mixing simple and composite fonts in the same page caused an exception during conversion to PDF/A-2.
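
The calculate-script behavior in the first item of this list amounts to a deferred commit: the script works on event.value, and the field only receives that value after the script finishes with event.rc still true. A minimal sketch of that pattern follows. Field, Event, and CalculateScript are hypothetical types written for this example and do not mirror the toolkit's JavaScript support classes.

```java
// Illustrative sketch only; models the deferred commit with made-up types.
final class CalculateCommitSketch {

    static final class Field {
        String value;
        Field(String value) { this.value = value; }
    }

    static final class Event {
        String value;
        boolean rc = true; // the script sets this to false to reject the result
    }

    interface CalculateScript {
        void run(Event event, Field field);
    }

    /** Runs the script, then commits event.value to the field only if rc is true. */
    static void runCalculate(Field field, CalculateScript script) {
        Event event = new Event();
        event.value = field.value;
        script.run(event, field); // field.value is untouched while the script runs
        if (event.rc) {
            field.value = event.value;
        }
    }
}
```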

Version 6.3.6 (August 29, 2016)

Problem Corrections:

  • PDFAuditor now handles Form XObjects that lack a Resources dictionary. This had provoked a NullPointerException.
  • PDFAuditor now handles Form XObjects that contain recursive references to themselves. Such documents were produced by some versions of Microsoft Word, and they had provoked a StackOverflowError.
  • PDFToRasterConverter no longer inappropriately crops the page when it is drawn at a smaller size (scaled down). This also improves the output of PageRasterizer.
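
A common way to survive self-referencing Form XObjects like those in the second item above is to remember which forms have already been visited and refuse to descend into them again. The sketch below shows that technique on a plain map of names. It is an assumption about the general approach, not a description of how PDFAuditor is implemented, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; walks a reference graph that may contain cycles.
final class FormXObjectWalkSketch {

    /** Visits every form reachable from root exactly once and returns the visit order. */
    static List<String> walk(String root, Map<String, List<String>> references) {
        List<String> order = new ArrayList<>();
        visit(root, references, new HashSet<>(), order);
        return order;
    }

    private static void visit(String name, Map<String, List<String>> references,
                              Set<String> seen, List<String> order) {
        if (!seen.add(name)) {
            return; // already visited (shared or recursive reference), so stop here
        }
        order.add(name);
        // A form with no entry in the map is treated as having no nested forms.
        for (String child : references.getOrDefault(name, Collections.emptyList())) {
            visit(child, references, seen, order);
        }
    }
}
```

Without the visited set, a form that references itself would recurse until the stack overflows.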

Version 6.3.4 (July 11, 2016)

Enhancements:

  • When trying to perform restricted operations on a password-protected PDF document without providing a password, the error now includes a more descriptive message.
  • The License4j dependency has been removed from the non-license-managed version of PDFJT.
  • PDFDocumentType now has documentation explaining each possible type.
  • The event.value setting is now a string when calling JavaScript format events to reflect the way field formatting works in Adobe Acrobat.
  • Flattening static XFA forms now removes the document JavaScript code that checks the version of Acrobat being used to open the PDF document. This eliminates a misleading warning message that the PDF is not supported when the document is opened in Adobe Acrobat.

Version 6.3.3 (June 6, 2016)
Problem Corrections:

  • During sanitization, images that replace content are now at least 144 DPI. This is the resolution of the images created by Acrobat Pro XI when it sanitizes documents.
  • Corrects errors in the message printed by the “–help” option in RELite.
  • Product now includes RELite as part of the customer deliverable rather than providing it separately. The evaluation certificate for RELite is no longer shipped as part of the customer deliverable. The documentation describes where to get the certificate.
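
For a sense of scale, the 144 DPI floor in the sanitization item above can be turned into pixel dimensions with the usual conversion of 72 PDF points per inch. The sketch below is plain arithmetic under that assumption. The class, the method names, and the choice to round up are illustrative and say nothing about how the toolkit sizes its replacement images internally.

```java
// Illustrative sketch only; converts page sizes in points to pixels at a minimum DPI.
final class SanitizeResolutionSketch {

    static final int MIN_DPI = 144;
    static final double POINTS_PER_INCH = 72.0;

    /** Raises a requested resolution to the 144 DPI floor if it is lower. */
    static int effectiveDpi(int requestedDpi) {
        return Math.max(requestedDpi, MIN_DPI);
    }

    /** Converts a length in points to whole pixels at the given resolution, rounding up. */
    static int pixels(double points, int dpi) {
        return (int) Math.ceil(points * dpi / POINTS_PER_INCH);
    }

    public static void main(String[] args) {
        int dpi = effectiveDpi(72);
        // US Letter is 612 x 792 points.
        System.out.println(pixels(612, dpi) + " x " + pixels(792, dpi)); // 1224 x 1584
    }
}
```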

Version 6.3.1 (May 16, 2016)

With this release from Datalogics, PDF Java Toolkit provides a new set of sample programs to replace the samples distributed with earlier versions of the product. The original samples for Talkeetna have been folded into these new samples.

Enhancements:

  • Exports of com.adobe.agl.*, com.adobe.fontengine.*, com.adobe.internal.*, and com.adobe.xfa.* are now included with the OSGi package.
  • An exception is now thrown when trying to convert a document with transparencies to PDF/A archive format.
  • The PDFDocument.hasTransparency() method can now be used to detect transparencies in documents.
  • CosDocument can now retrieve the linearization dictionary with the getLinearizationDictionary() method.
  • Better descriptions are provided for:
    • DocumentListener
    • DocumentListenerProperties
    • DocumentListenerRegistery
    • DocumentMessage
    • DocumentMessage.MessageType
  • Improves documentation to describe PDFXObject, LicenseManager, and PDFPageTreeNode.
  • Improves package overview for com/datalogics/pdf/text, under docs/talkeetna-docs. The content explains how the objects in the package relate to one another.
  • Adds documentation for the Div.getDivs() method for Talkeetna.

Problem Corrections:

  • Fixes a bug in license-managed RELite that reported that the input file was not found when no license was present.
  • Updates interactive prompting to use System.out for console message clarification for RELite.

Version 6.1.1 (April 11, 2016)

Enhancements:

  • The Core PDF Java Toolkit is now OSGi compliant. The pdfjt.jar file provided with the software installation package is now an OSGi bundle.
  • Replaces the pdfjt-support jar file with Maven dependencies. Some of the contents for pdfjt-support.jar were added to the pdfjt jar file.
  • Adds the impl classes and packages to the Javadoc API Reference content.
  • Improves ability to generate PDF/A-1b output files with the product. The new PDFAConversionOptionsFactory class has a method that returns a PDFAConversionOptions object that conforms to the PDF/A-1b standard. Also, the software now provides a PDF/A-1b conversion handler with default document conversion and validation processing settings. You no longer need to provide your own handler when converting a PDF document to PDF/A-1b.
  • Signature field now displays the user name from the signatureOptions’ user info if it is set.
  • Adds saveLinearAndClose() methods to DocumentHelper for Talkeetna.
  • When creating a new instance of a PDFFontDescriptor, instead of estimating the ascent and descent lines based on the font’s bounding box, those values are now drawn from the font’s FontData.
  • Adds a DCTDecode output filter to the product. PDF Java Toolkit can now be used to save images with the Discrete Cosine Transform compression format, used for rendering photographs as JPG images.

Problem Corrections:

  • The applyRedaction method for the RedactionService class now throws a PDFUnsupportedFeatureException when attempting to redact a dynamic XFA document.
  • PDF content streams containing an unusually long numeric object no longer cause an ArrayIndexOutOfBoundsException.
  • When resources are included multiple times they will now be properly removed when freeing duplicate resources.
  • The ReadingOrderTextExtractor and the LayoutModeTextExtractor classes, and the TextExtractor newInstance method, now throw a PDFUnsupportedFeatureException when attempting to work with a dynamic XFA document.
  • The setInteriorColor(double red, double green, double blue) method of the PDFAnnotationRedaction class is now deprecated and has been replaced with versions more consistent with similar methods in PDF Java Toolkit.
  • An exception will now be thrown when trying to rasterize a shell XFA file.
  • Fixes ParseOps.skipWhitespace() to return a single space rather than the last character of the stream.
  • Adds ASRectangle.swapDims() to switch height and width dimensions of a given ASRectangle.
  • A more meaningful exception is now provided for garbled color spaces in content streams.
  • An exception will now be thrown when there is a problem extracting text. Previously, the text was lost without comment from the system.
  • Excludes org.apache.commons.collections4.iterators from OSGi imports.
  • ARGBImage#ConvertToRGB() now throws a more descriptive exception when it can’t properly read the color space data of an image.
  • The NeedAppearances key is now removed from the form only once.
  • Throws PDFUnsupportedFeatureException when trying to rasterize a dynamic XFA rather than silently generating blank output.
  • Throws PDFUnsupportedFeatureException when trying to flatten Dynamic XFA instead of a generic error message in the output PDF.
  • Fixes an issue where PDFPage#nextPage would skip by two pages at a time.
  • Removes PDF user documentation files from installation package.
  • Fixes an issue where resampled images with a DeviceCMYK color space were inverted after resampling.

Version 4.7.0 (December 9, 2015)

Enhancements:

  • Generating appearances with the AppearanceService class now supports generating bar codes with data drawn from XFDF form files for the PDF417 symbology, a standard bar code symbol format.
  • Internal improvements to the PMMService class to speed up inserting pages into a PDF.
  • Adds a “runFormatScripts” method to the FormFieldManager class that executes the formatting scripts for an AcroForm PDF form document and returns the formatted data as a java.util.Map.
  • The Talkeetna Element class (and its subclasses) now has a constructor that takes a Style object.