PDF Java Toolkit 9.2.0 (September 16, 2022)
Problem Corrections:
- SF#44747 - PDFJT no longer throws a NullPointerException when embedding a Type 1 font during conversion to PDF/A-2b
- SF#45088 - PDFJT no longer throws a NullPointerException for a font containing an invalid subtype value of OpenType
PDF Java Toolkit 9.1.1 (January 26, 2022)
Enhancements:
- Replace use of log4j2 with logback to mitigate CVE-2021-45105
PDF Java Toolkit 9.0.2 (October 13, 2021)
Enhancements:
- Update BouncyCastle to 1.69 to fix a vulnerability.
- Update Tika to version 1.27 to fix a vulnerability.
Problem Corrections:
- SF#44331 – PDFJT can now read badly escaped Cos name objects.
- SF#44037 – PDFJT no longer crashes when trying to get JavaScript actions from a form field that has a Widget style additional actions dictionary.
- SF#43982 – PDFJT can now clone the Properties dictionary from the source document when the PMMService.appendPages() method is used.
- SF#43948 – Text extraction no longer causes a NullPointerException when Optional Content information is missing. If the content stream contains a BDC operator for optional content and the Optional Content Configuration Dictionary is missing, PDFJT treats the operator as a no-op and does not alter the visibility of the text enclosed in BDC..EMC.
- SF#43910 – PDFJT no longer throws an IllegalPathStateException if a PDF file attempts to paint a path without having created a path.
- SF#43873 – PDFJT no longer fails to convert a document with an unembedded subset font to PDF/A-2b.
- SF#44412 – PDFJT no longer throws an IndexOutOfBoundsException in the Wordafier constructor when enhancedOverstrikeProcessing is set to true.
- SF#44416 – PDFJT no longer throws a NullPointerException when reading PDF files that have DCT-encoded (JPEG) images with unusual progressive settings.
- Talkeetna internal code was changed to be more reliable, based on improved code quality checking.
- GraphicState methods setFillColor() and setStrokeColor() now copy the color array for internal storage. This prevents unintended side effects if the caller later changes the array.
- Other OSS dependencies, which are transitive dependencies of pdfjt and the super POM, were upgraded.
- RELite internal code was changed to be more reliable, based on improved code quality checking.
- Update commons-io to 2.7 to fix vulnerabilities.
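The defensive copy described above for setFillColor() and setStrokeColor() can be pictured with a minimal sketch. The ColorState class below is a hypothetical stand-in written for illustration, not the PDFJT GraphicState class; it only shows why copying the caller's array on the way in keeps later caller-side changes out of the stored state.

```java
final class DefensiveCopySketch {
    /** Hypothetical stand-in for a graphics state; not the PDFJT class. */
    static final class ColorState {
        private double[] fillColor = new double[0];

        void setFillColor(double[] components) {
            // Copy on the way in, so later changes to the caller's array are not seen here.
            this.fillColor = components.clone();
        }

        double[] getFillColor() {
            // Copy on the way out as well, so callers cannot mutate internal state.
            return fillColor.clone();
        }
    }

    public static void main(String[] args) {
        double[] red = {1.0, 0.0, 0.0};
        ColorState state = new ColorState();
        state.setFillColor(red);
        red[0] = 0.0; // the caller reuses its array for another color
        System.out.println(state.getFillColor()[0]); // still the original red component
    }
}
```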
PDF Java Toolkit 9.0.1 (November 4, 2020)
Enhancements:
- PDF Java Toolkit updated to use OpenJDK 8.
- Updates to Open Source Software dependencies applied to mitigate identified vulnerabilities.
- Updates the XMP packages inside PDFJT to xmpcore 5.1.3 from Maven Central. Classes that used to reside in adobe.internal.xmp will now be found in com.adobe.xmp.
Problem Corrections:
- SF#43104 – Corrects an issue where a stack overflow occurred while converting a PDF to PDF/A-2b.
- SF#42388 – Corrects an issue where security authentication on documents using 256-bit encryption keys was not supported.
- SF#43266 – Corrects an issue where authenticating PDF documents that are password protected using RC4_40Bit encryption was not supported.
- SF#43397 – Corrects an issue related to vulnerabilities in Open source dependencies.
- SF#43484 – Corrects an issue where an exception would be thrown when incrementally saving a PDF larger than 2GB.
- SF#43506 – Corrects an issue where an exception would be thrown when the patch level of Java JRE is greater than 255.
- SF#43581 – Corrects an issue with creating a 3D view, by taking the U3D model out of one PDF and copying it into another PDF, when the matrix is null.
Version 8.16.3 (March 20, 2020)
Enhancements:
- Enhances PDFJT to handle CMap file with bfchar mapping data in a single line to avoid returning 0xFFFD value for all characters.
- Enhances PDFJT to handle CMap file with bfchar mapping data in a single line to avoid PDFInvalidDocumentException.
- Enhances the behavior of the appendPages() and overlayPages() methods of the PMMService class. The result is now created with the PDF version of the source document if it is higher than that of the target document.
- Enhances PDFJT to handle CMap file with an extra line break between beginbfchar syntax.
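The CMap items above all concern bfchar data whose line layout differs from the usual one pair per line. As a minimal sketch, assuming only that a bfchar section is a run of `<source> <destination>` hex pairs between beginbfchar and endbfchar, a parser can tokenize on the angle brackets instead of on line breaks. The BfCharSketch class and its methods are hypothetical and are not part of PDFJT.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BfCharSketch {
    // DOTALL lets a section span any number of lines, including none.
    private static final Pattern SECTION =
            Pattern.compile("beginbfchar(.*?)endbfchar", Pattern.DOTALL);
    // A pair is two hex strings in angle brackets separated by any whitespace.
    private static final Pattern PAIR =
            Pattern.compile("<([0-9A-Fa-f]+)>\\s*<([0-9A-Fa-f]+)>");

    /** Maps each source character code to its Unicode string. */
    static Map<Integer, String> parse(String cmap) {
        Map<Integer, String> mapping = new LinkedHashMap<>();
        Matcher section = SECTION.matcher(cmap);
        while (section.find()) {
            Matcher pair = PAIR.matcher(section.group(1));
            while (pair.find()) {
                int code = Integer.parseUnsignedInt(pair.group(1), 16);
                mapping.put(code, decodeUtf16Be(pair.group(2)));
            }
        }
        return mapping;
    }

    /** Decodes a hex string of big-endian UTF-16 code units, four hex digits each. */
    static String decodeUtf16Be(String hex) {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i + 4 <= hex.length(); i += 4) {
            text.append((char) Integer.parseInt(hex.substring(i, i + 4), 16));
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String singleLine = "2 beginbfchar <01> <0041> <02> <0042> endbfchar";
        System.out.println(parse(singleLine));
    }
}
```

Because the sketch never splits on newlines, the single-line form and a form with extra blank lines after beginbfchar produce the same mapping.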
Problem Corrections:
- Corrects an issue where a NullPointerException would be thrown when calling the appendPages() method of the PMMService class for certain documents.
- Corrects an issue where a ClassCastException would be thrown when calling the overlayPages() method of the PMMService class in succession.
- Corrects an issue where extracting text would result in an empty string due to parsing CMap files with codespaceranges or cidranges defined on a single line.
Version 8.16.1 (February 11, 2020)
Problem Corrections:
- Updates text extraction using fonts with missing ToUnicode table.
- Corrects an issue with finding valid code space ranges for a given char code.
- Corrects an issue where a NullPointerException would be thrown by a missing WritingMode entry. Horizontal is now the default writing mode.
- Corrects an issue where ArrayIndexOutOfBoundsException would be thrown if a PDF document contains more indirect objects than an array can hold.
- Corrects an issue with the insertion of a custom PDF/X schema during PDF/A conversion.
Version 8.16.0 (August 7, 2019)
Problem Corrections:
- Corrects an issue in which TrueType fonts with an encoding that has /Differences were not embedded and converted correctly during PDF/A-2b conversion.
- Resolves an issue in which converting to PDF/A-2b would fail due to incorrectly skipping routines that would fix up documents containing TrueType fonts with an Encoding and the symbolic bit set on the Flags entry.
- Resolves an issue where PDF/A-2b conversion would result in a NullPointerException if the document contained a non-symbolic TrueType font with no Encoding entry.
- Resolves an issue where .notdef glyphs were not recognized in TrueType/OpenType fonts. They are now recognized if the glyph name is .notdef.
- Adds a new option to PDFA2ConversionOptions. The setReplaceNotdefWithSpace(boolean) method can be used to cause the conversion to replace .notdef characters with spaces, similar to what Acrobat and Preflight do. Defaults to false.
- Adds a new conversion handler hook 'notdefGlyphSubstitutedWithSpace' to report when substitution happens, and give an opportunity to discontinue processing.
- Resolves an issue where PDF/A-2b conversion would get a NullPointerException if a TrueType font didn't have a PostScript font name. It now falls back to the name of the font in the PDF file. PDF/A-2b now validates Type 0 TrueType fonts (CIDFontType2).
- When embedding a TrueType font as a Type 0 font during PDF/A-2b conversion, the emitted ToUnicode table is now compatible with Acrobat and results in correct text extraction.
Version 8.15.1 (June 7, 2019)
Problem Corrections:
- Resolves an issue where Unicode text with an embedded font did not extract correctly.
- Resolves an inconsistency in page rendering between PDF Java Toolkit and Adobe Acrobat.
- Resolves an issue where text in alternate resources did not extract correctly.
Version 8.15 (March 6, 2019)
Enhancements:
- Reduces the threshold for spaces between words in the ReadingOrderTextExtractor from 0.5 of a space width to 0.4 of a space width. The start of a new paragraph or block does not have an extra space.
- When converting images in ARGBImage, the system now honors the Decode array in the image. This ensures that certain black-and-white documents are created with the correct polarity (not negative).
- Improves the detection of overstrike characters in the ReadingOrderTextExtractor when the overstriking text has a slightly different baseline than the overstruck text. This can be disabled by setting the TextExtractionOptions variable enhancedOverstrikeProcessing to false.
- Updates the ReadingOrderTextExtractor to properly ignore overstruck text that is marked /Artifact in a marked content block.
- Updates to account for the horizontal scaling (Tz) when rendering text. This fixes a problem with content having each character flipped if the text state is set up something like /TT0 -12 Tf -100 Tz 1 0 0 -1 36 720 Tm. Note the negative font size and negative horizontal scaling.
- The balancePages(int maxNodes) feature can now take an optional number of children per node. The topmost node may have a few more or less entries than that value, but interior nodes will be limited to maxNodes. This can be used to alter the balancing of the page tree. balancePages() works as before, using a value of 5 for maxNodes.
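The word-spacing threshold change in the first item of this list can be illustrated with a small sketch. It assumes, for illustration only, that a word break is declared when the horizontal gap between two glyph runs is at least a given fraction of the width of a space; the release note does not state the exact comparison, and the WordGapSketch class is hypothetical rather than part of ReadingOrderTextExtractor.

```java
final class WordGapSketch {
    static final double OLD_FRACTION = 0.5; // threshold before version 8.15
    static final double NEW_FRACTION = 0.4; // threshold from version 8.15 on

    /**
     * Returns true when the gap is wide enough to count as a word break.
     * The "at least" comparison is an assumption made for this sketch.
     */
    static boolean isWordBreak(double gap, double spaceWidth, double fraction) {
        return gap >= fraction * spaceWidth;
    }

    public static void main(String[] args) {
        double spaceWidth = 250.0; // width of the space glyph, in glyph-space units
        double gap = 112.5;        // 0.45 of a space width
        System.out.println(isWordBreak(gap, spaceWidth, OLD_FRACTION)); // no break under the old threshold
        System.out.println(isWordBreak(gap, spaceWidth, NEW_FRACTION)); // break under the new threshold
    }
}
```

A gap of 0.45 of a space width is the kind of case that changes: it was joined into one word before and is now split into two.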
Version 8.14.1 (September 5, 2018)
Enhancements:
- Removes the NeedsRendering flag from documents where static XFA content has been flattened. See Table 28 of section 7.7 of the ISO 32000 reference, page 75.
Version 8.14.0 (April 5, 2018)
Enhancements:
- Updates the PDF Java Toolkit word position retrieval algorithms to better match output from Adobe Acrobat.
Version 8.13.1 (February 6, 2018)
Problem Corrections:
- Corrects a problem where a NullPointerException would be thrown when using the overlayPages() method of the PMMService class with a source page that has non-zero rotation.
- Corrects an issue rendering certain PDF documents that caused the document content to be shifted when rendering.
Version 8.13.0 (October 16, 2017)
Problem Corrections:
- Resolves an issue with writing JPEG (DCT) compressed images, with color spaces based on ICC profiles, to PDF files.
- Corrects a problem with rearranging pages in a PDF document using the movepage method in the PageTree, which caused exceptions to be raised in certain circumstances.
- Improves the internal structure of output PDF documents generated by PDF Java Toolkit to allow for faster content extraction by PDFJT and by other systems.
- Updates the RenderPDF sample program to allow it to support writing PDFs with JPEG (DCT) compressed images properly.
Version 8.10.1 (August 17, 2017)
Enhancements:
- ContentTextItem now has a pair of Unicodes properties. The parameter lists the Unicode characters in the text.
- If an embeddable version of a Base 14 font is supplied in the PDFFontSet, such as a client providing a Helvetica or Times font, PDF Java Toolkit now uses that font for embedding and subsetting. The metrics are taken from the internal version of the Base 14 font for consistency, but the subsetted glyphs come from the supplied font.
- XFDFService now accepts XFDF files that have attributes on the field element that are not listed in the specification. Such attributes are ignored, which allows PDF Java Toolkit to import a wider selection of XFDF files. XFDF files are an XML version of the Forms Data Format (FDF), a text export format from PDF documents. XFDF files are associated with Adobe Acrobat.
- When generating the appearances for a form field, document-level JavaScript scripts are now run during formatting only if the field actually has a format script.
- When reporting unsupported JavaScript APIs, the original exception now states what function, on which class, is unsupported.
- Adds the removeMetatdata() method to the PDFDocument class. This method takes the existing metadata on a document and resets it as if the document had just been created. It is recommended to follow up this call with a full save to prevent forensic discovery.
Problem Corrections:
- Fixes a bug where word breaks are not found in some cases for subsetted fonts that do not contain a space character.
Version 8.9.0 (July 27, 2017)
Enhancements:
- If one of the standard 14 fonts (PDF 32000-1:2008 section 9.6.2.2), such as Helvetica, exists in a font set, it will no longer be replaced by the built-in version of the font. This means that if you supply a font that can be embedded, annotation appearances will subset it as a Type 0 font.
- Introduces API in SignatureAppearanceOptions for customizing signature labels. The default label is “Digitally signed by {0}”, where {0} represents a name stored in UserInfo.
- Introduces InflaterCountingStreamWrapper as a superclass of TIFFCountingStreamWrapper.
- AppearanceService now generates rollover appearances for redaction annotations containing overlay text.
- Adds a method call to PDFFontTagRegistry to ensure that a font has a unique subset tag.
- Some low-level APIs in PDFFontUtils are deprecated. If your code used these APIs, please follow the deprecation warnings to find the correct calls on PDFFontTagRegistry.
- PDFOpenOptions has a new flag called DisableFontTagRepair. This flag allows for the enabling and disabling of repairs done to subset font tags.
- Adds PDFResourceIterator, which finds all the resources in the document.
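The default signature label quoted above, "Digitally signed by {0}", uses a numbered placeholder in the style of java.text.MessageFormat. The sketch below assumes that the label is expanded with MessageFormat semantics, which the release note does not confirm, and it does not call SignatureAppearanceOptions at all. The SignatureLabelSketch class is hypothetical.

```java
import java.text.MessageFormat;

final class SignatureLabelSketch {
    /** Expands a label pattern, substituting the signer name for {0}. */
    static String label(String pattern, String signerName) {
        return MessageFormat.format(pattern, signerName);
    }

    public static void main(String[] args) {
        // The default pattern from the release note, then a customized one.
        System.out.println(label("Digitally signed by {0}", "Jane Doe"));
        System.out.println(label("Approved by {0}", "Jane Doe"));
    }
}
```

If MessageFormat semantics do apply, a literal apostrophe in a custom label would need to be doubled, since a single apostrophe starts a quoted section in a MessageFormat pattern.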
Problem Corrections:
- Corrects a NullPointerException that occurred when no InteriorColor was supplied for a PDFAnnotationRedaction.
- Corrects problems converting some numbers just less than a power of 10 to strings. This fixes some issues with creating ASNumber objects as well as creation of content streams.
- The setTabOrder() method under the PDFPage class now adds the Tabs entry if one is not already present.
- Fixes a crash when rendering certain kinds of inline images.
- Fixes a problem where images rotated 90 degrees were skewed more than 90 degrees.
- Ensures that very thin images, which would cover less than a device pixel, nevertheless are visible in the output rendered image.
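The number-to-string correction above concerns values just below a power of 10, where rounding to a limited number of decimals must carry into a new leading digit (for example, 9.999999 rounded to five decimals becomes 10). The release note does not describe how PDFJT performs the conversion; the sketch below shows one carry-safe approach using BigDecimal, and the NumberToStringSketch class is hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class NumberToStringSketch {
    /**
     * Formats a finite double with at most maxDecimals fractional digits,
     * without exponent notation and without trailing zeros.
     */
    static String format(double value, int maxDecimals) {
        BigDecimal rounded =
                BigDecimal.valueOf(value).setScale(maxDecimals, RoundingMode.HALF_UP);
        if (rounded.signum() == 0) {
            return "0"; // avoids forms such as "0.00000" for tiny values
        }
        // Rounding has already carried into the integer part where needed.
        return rounded.stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(format(9.999999, 5));  // rounds up across the power of 10
        System.out.println(format(99.999999, 5)); // same carry, one digit higher
        System.out.println(format(0.5, 5));       // ordinary value, unchanged
    }
}
```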
Version 8.7.0 (May 22, 2017)
Enhancements:
- Adds the getFormattedValue() method under the PDFField class, to run the Format script on the field and return the result as a string.
- Deprecates the runFormatScripts() method under the FormFieldManager interface.
- Deprecates the JavaScriptHandler.execute() methods. Please see JavaScriptHandler to learn more.
Version 8.5.0 (March 13, 2017)
Enhancements:
- Calculation scripts are now run automatically when importing data with the FDFService or the XFDFService.
- Appearances for the standard set of rubber stamp annotations described in section 12.5.6.12 of the ISO 32000 specification are now automatically generated by calling generateAppearances() under the AppearanceService class.
- Pages that contain many axial shadings now render more quickly.
- When rendering, conversions from complicated color spaces to RGB may be faster as well.
Problem Corrections:
- PDFAuditor no longer causes a NullPointerException for a page that has no content.
- PDFCIDFontWidths now works correctly if the font came from a MacOS resource file.
- The disableFontOperations method of the PDFFontListener class now sends a FLUSH_FONTS message to all DocumentListener objects so that they know that the fonts are being flushed out.
- The getDirtyCount() method of the CosObject class now returns the number of times that a CosObject has been modified; this can be used to test the validity of associated cached data.
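The use of a modification counter to validate cached data, as mentioned for getDirtyCount(), can be sketched as follows. The CountedText and CachedLength classes are hypothetical stand-ins and do not use the PDFJT CosObject class; they only show the pattern of remembering the counter value at compute time and recomputing when it has moved.

```java
final class DirtyCountCacheSketch {
    /** Hypothetical stand-in for an object that counts its own modifications. */
    static final class CountedText {
        private String text;
        private int dirtyCount;

        CountedText(String text) {
            this.text = text;
        }

        void setText(String text) {
            this.text = text;
            dirtyCount++; // every modification bumps the counter
        }

        String getText() {
            return text;
        }

        int getDirtyCount() {
            return dirtyCount;
        }
    }

    /** Caches a derived value and recomputes it only when the counter has moved. */
    static final class CachedLength {
        private final CountedText source;
        private int countAtCompute = -1; // -1 means "never computed"
        private int cachedLength;
        private int recomputations;

        CachedLength(CountedText source) {
            this.source = source;
        }

        int length() {
            int current = source.getDirtyCount();
            if (current != countAtCompute) {
                cachedLength = source.getText().length();
                countAtCompute = current;
                recomputations++;
            }
            return cachedLength;
        }

        int getRecomputations() {
            return recomputations;
        }
    }
}
```

Comparing two integers is much cheaper than recomputing the derived data, so the check can be made on every access.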
Version 8.3.0 (February 6, 2017)
Enhancements:
- Adds the ability to execute JavaScript when importing data into forms using the FDFService or the XFDFService. A PDFJavaScriptException will be thrown if running the JavaScript causes an error.
- Improves rasterization of PDFs. Rendering of pages with a large number of image XObjects is now considerably faster, due to improved caching of color conversions.
Version 8.2.0 (December 5, 2016)
Enhancements:
- Deprecates getUarrary() in the Word class in favor of getCharacters().
- In the PDFCharacter class, there is a new getBoundingQuad() method to efficiently get the bounding quad of an individual character.
- Updates manifest to ensure that PDF Java Toolkit can be loaded as an OSGi bundle.
Problem Corrections:
- Fixes a condition where page resources would be lost if they were added to an existing direct CosDictionary while adding new content to an existing page, using Talkeetna.
Version 8.1.0 (October 17, 2016)
PDF Java Toolkit now requires Java 7 or later.
Datalogics merged in changes Adobe Systems provided for PDF Java Toolkit through May of 2016.
Enhancements:
- Datalogics now provides support for the draft version of PDF 2.0 by adding the following packages:
- Adds the StampAnnotApGenerator class. Creates stamp annotations as found in Acrobat.
- Adds the OptimizerService class. Contains the ability to optimize the use of embedded fonts in a document.
- Adds the PDFA3Service class. Contains the ability to convert to and validate PDF/A-3 archive documents.
Problem Corrections:
- When executing calculate scripts, the event.value is now not set into the field itself until after the script completes, and only if event.rc is true.
- Format scripts now more reliably receive the field value as a string, even if the value could have been parsed as a number.
- Fixes an internal library configuration problem where certain Charset operations would fail on Mac OS X/macOS.
- Fixes a problem where, when rasterized, rotated text sometimes had glyphs at an incorrect orientation.
- Fixes a problem in rasterization where the stroke width of paths could be incorrect if the transformation matrix specified a rotation.
- Fixes a problem where mixing simple and composite fonts in the same page caused an exception during conversion to PDF/A-2.
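The calculate-script behavior in the first correction of this list, where event.value reaches the field only after the script has completed and only if event.rc is true, can be sketched as a commit step. All types below (Event, Field, CalculateScript) are hypothetical stand-ins modeled on the Acrobat JavaScript event object; they are not PDFJT classes.

```java
final class CalculateCommitSketch {
    /** Hypothetical event object with the two properties named in the note. */
    static final class Event {
        String value;
        boolean rc = true;
    }

    /** Hypothetical form field holding a single string value. */
    static final class Field {
        String value;

        Field(String value) {
            this.value = value;
        }
    }

    /** Hypothetical calculate script; it may change event.value and event.rc. */
    interface CalculateScript {
        void run(Event event);
    }

    static void runCalculate(Field field, CalculateScript script) {
        Event event = new Event();
        event.value = field.value;
        script.run(event); // the field is left untouched while the script runs
        if (event.rc) {
            field.value = event.value; // commit only after completion, and only on success
        }
    }
}
```

A script that sets event.rc to false therefore leaves the field with its previous value, whatever it wrote to event.value along the way.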
Version 6.3.6 (August 29, 2016)
Problem Corrections:
- PDFAuditor now handles Form XObjects that lack a Resources dictionary. This had provoked a NullPointerException.
- PDFAuditor now handles Form XObjects that contain recursive references to themselves. Such objects were produced by some versions of Microsoft Word, and they had provoked a StackOverflowError.
- PDFToRasterConverter no longer inappropriately crops the page when it is drawn at a smaller size (scaled down). This also improves the output of PageRasterizer.
Version 6.3.4 (July 11, 2016)
Enhancements:
- When trying to perform restricted operations on a password-protected PDF document without providing a password, the error now includes a more descriptive message.
- The License4j dependency has been removed from the non-license-managed version of PDFJT.
- PDFDocumentType now has documentation explaining each possible type.
- The event.value setting is now a string when calling JavaScript format events to reflect the way field formatting works in Adobe Acrobat.
- Flattening static XFA forms now removes the document JavaScript code that checks the version of Acrobat being used to open the PDF document. This eliminates a misleading warning message that the PDF is not supported when the document is opened in Adobe Acrobat.
Version 6.3.3 (June 6, 2016)
Problem Corrections:
- During sanitization, images that replace content are now at least 144 DPI. This is the resolution of the images created by Acrobat Pro XI when it sanitizes documents.
- Corrects errors in the message printed by the "--help" option in RELite.
- The product now includes RELite as part of the customer deliverable rather than providing it separately. The evaluation certificate for RELite is no longer shipped as part of the customer deliverable. The documentation describes where to get the certificate.
Version 6.3.1 (May 16, 2016)
With this release from Datalogics, PDF Java Toolkit provides a new set of sample programs to replace the samples distributed with earlier versions of the product. The original samples for Talkeetna have been folded into these new samples.
Enhancements:
- Exports of com.adobe.agl.*, com.adobe.fontengine.*, com.adobe.internal.*, and com.adobe.xfa.* are now included with the OSGi package.
- An exception is now thrown when trying to convert a document with transparencies to PDF/A archive format.
- The PDFDocument.hasTransparency() method can now be used to detect transparencies in documents.
- CosDocument can now retrieve the linearization dictionary with the getLinearizationDictionary() method.
- Better descriptions are provided for:
  - DocumentListener
  - DocumentListenerProperties
  - DocumentListenerRegistery
  - DocumentMessage
  - DocumentMessage.MessageType
- Improves documentation to describe PDFXObject, LicenseManager, and PDFPageTreeNode.
- Improves package overview for com/datalogics/pdf/text, under docs/talkeetna-docs. The content explains how the objects in the package relate to one another.
- Adds documentation for the Div.getDivs() method for Talkeetna.
Problem Corrections:
- Fixes a bug in which license-managed RELite reported that the input file was not found when no license was present.
- Updates interactive prompting in RELite to use System.out, which makes console messages clearer.
Version 6.1.1 (April 11, 2016)
Enhancements:
- The Core PDF Java Toolkit is now OSGi compliant. The pdfjt.jar file provided with the software installation package is now an OSGi bundle.
- Replaces the pdfjt-support jar file with Maven dependencies. Some of the contents for pdfjt-support.jar were added to the pdfjt jar file.
- Adds the impl classes and packages to the Javadoc API Reference content.
- Improves the ability to generate PDF/A-1b output files with the product. The new PDFAConversionOptionsFactory class has a method that returns a PDFAConversionOptions object that conforms to the PDF/A-1b standard. Also, the software now provides a PDF/A-1b conversion handler with default document conversion and validation processing settings. You no longer need to provide your own handler when converting a PDF document to PDF/A-1b.
- Signature field now displays the user name from the signatureOptions’ user info if it is set.
- Adds saveLinearAndClose() methods to DocumentHelper for Talkeetna.
- When creating a new instance of a PDFFontDescriptor, instead of estimating the ascent and descent lines based on the font’s bounding box, those values are now drawn from the font’s FontData.
- Adds a DCTDecode output filter to the product. PDF Java Toolkit can now be used to save images with the Discrete Cosine Transform compression format, used for rendering photographs as JPG images.
Problem Corrections:
- The applyRedaction method for the RedactionService class now throws a PDFUnsupportedFeatureException when attempting to redact a dynamic XFA document.
- PDF content streams containing an unusually long numeric object no longer cause an ArrayIndexOutOfBoundsException.
- When resources are included multiple times they will now be properly removed when freeing duplicate resources.
- The ReadingOrderTextExtractor and the LayoutModeTextExtractor classes, and the TextExtractor newInstance method, now throw a PDFUnsupportedFeatureException when attempting to work with a dynamic XFA document.
- The setInteriorColor(double red, double green, double blue) method of the PDFAnnotationRedaction class is now deprecated and has been replaced with versions more consistent with similar methods in PDF Java Toolkit.
- An exception will now be thrown when trying to rasterize a shell XFA file.
- Fixes ParseOps.skipWhitespace() to return a single space rather than the last character of the stream.
- Adds ASRectangle.swapDims() to switch height and width dimensions of a given ASRectangle.
- A more meaningful exception is now provided for garbled color spaces in content streams.
- An exception will now be thrown when there is a problem extracting text. Previously, the text was lost without comment from the system.
- Excludes org.apache.commons.collections4.iterators from OSGi imports.
- ARGBImage#ConvertToRGB() now throws a more descriptive exception when it can’t properly read the color space data of an image.
- The NeedAppearances key is now removed from the form only once.
- Throws PDFUnsupportedFeatureException when trying to rasterize a dynamic XFA rather than silently generating blank output.
- Throws PDFUnsupportedFeatureException when trying to flatten Dynamic XFA instead of a generic error message in the output PDF.
- Fixes an issue where PDFPage#nextPage would skip by two pages at a time.
- Removes PDF user documentation files from installation package.
- Fixes an issue where resampled images with a DeviceCMYK color space were inverted after resampling.
Version 4.7.0 (December 9, 2015)
Enhancements:
- Generating appearances with the AppearanceService class now supports generating bar codes with data drawn from XFDF form files for the PDF417 symbology, a standard bar code symbol format.
- Internal improvements to the PMMService class to speed up inserting pages into a PDF.
- Adds a "runFormatScripts" method to the FormFieldManager class that executes the formatting scripts for an AcroForm PDF form document and returns the formatted data as a java.util.Map.
- The Talkeetna Element class (and its subclasses) now has a constructor that takes a Style object.