PDF Checker

Description of JSON profile parameters

general: unable-to-open This is not a valid PDF document, or it has been corrupted to the point that it cannot be displayed in a browser or viewing tool.
general: password-protected The file cannot be opened without a password. To open the file, provide the password in a command line argument.
general: contains-owner-password When you create a PDF document you can restrict the ability of others to work with that document, and add a PDF Owner Password to the file to secure those settings. Other users will be able to open and read your PDF document, but without this password they can’t change your restrictions. With the Owner Password you can allow others to open and read your document but stop them from printing the file, copying content, adding changes or comments, adding or extracting pages or graphics, signing the document, or making other changes.
general: xfa-type XFA, or XML Forms Architecture, is a set of proprietary XML specifications for use with web forms. XFA forms are saved internally in PDF files, and the standard is owned by Adobe Systems. The original PDF forms technology, Acrobat Forms or AcroForms, was introduced by Adobe Systems with PDF 1.2 in 1996. Most PDF documents use AcroForms rather than XFA, because AcroForms are compatible with a much wider range of software applications, as well as with Acrobat itself.
general: pdf-v2 The PDF format was published as an open file format by the International Organization for Standardization (ISO) in 2008. Version 2.0 of PDF was released in July of 2017. PDF Checker can identify whether a PDF document was saved as a PDF 2.0 document.
general: contains-signature A PDF document can contain one or more digital signatures, and these signatures can be verified by a vendor known as a Certificate Authority. If a PDF document has a certified digital signature it can be used as a legal document. It is also possible to lock a signed PDF document against any further changes.
general: claims-pdfa-conformance A PDF/A, or PDF Archive document, is a type of PDF file that is designed to be stored so that it can be accessed for many years to come. PDF/A documents must be able to be opened and read using viewing tools available in the future, so they are designed to be self-contained. For example, all of the fonts used in a PDF/A document must be embedded within the PDF document itself for the file to be considered PDF/A compliant. PDF Checker can identify whether a PDF document claims PDF/A compliance. Note that the software does not verify that the PDF is compliant, only that it has been saved as such. It is possible for a PDF document to be labeled as compliant when in fact the file has been later altered, making the PDF/A compliance no longer valid.
general: damaged PDF Checker found that the PDF document was corrupted, and on its first attempt the system was not able to open it. In response PDF Checker sought to repair the file so that it could be opened.

After repairing the file PDF Checker seeks to open the file again. If PDF Checker still can’t open the PDF document, no further processing is possible. The report will simply indicate that the document is damaged.

If PDF Checker can repair and open a damaged document, the system will process as many review checks as possible and report on any review steps that complete properly. If some checks cannot be completed, the system will stop processing those checks and report the affected review steps as aborted. The failed steps might be due to the document being damaged.
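Two of the general checks above, pdf-v2 and claims-pdfa-conformance, concern values that a file declares about itself. The following is a minimal sketch of how such declarations can be read, not a description of how PDF Checker works internally. The function names are hypothetical. It assumes the version is taken from the `%PDF-x.y` header (a /Version entry in the document catalog can override the header, which this sketch ignores) and that the PDF/A claim is stored in the XMP metadata as `pdfaid:part` and `pdfaid:conformance`.

```python
import re


def header_version(data: bytes):
    """Return (major, minor) from the %PDF-x.y header, or None if absent."""
    # Some files carry a few bytes before the header, so search the
    # first kilobyte instead of requiring the header at offset 0.
    match = re.search(rb"%PDF-(\d)\.(\d)", data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def claimed_pdfa(xmp: str):
    """Return the claimed PDF/A level such as '1b', or None if no claim."""
    # XMP may serialize the properties as elements or as attributes.
    part = re.search(r"pdfaid:part(?:>|=[\"'])\s*(\d)", xmp)
    if part is None:
        return None
    conf = re.search(r"pdfaid:conformance(?:>|=[\"'])\s*([A-Za-z])", xmp)
    return part.group(1) + (conf.group(1).lower() if conf else "")
```

As the claims-pdfa-conformance description notes, a value found this way is only a claim. It says nothing about whether the file actually meets the PDF/A requirements.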

cleanup: suboptimal-compression A data stream in a PDF document contains text, an image, or an object, with instructions on how the content will be rendered on the page. These data streams can be compressed in a document to make the PDF smaller and more portable. This check looks for data streams in the input document that are not compressed, or that use only a simple and less efficient method, such as ASCII encoding (which does not compress at all), Run Length, or LZW.

PDF Optimizer can be used to improve the compression used in a PDF document to reduce the size of the file.
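The suboptimal-compression check can be pictured as a test on the list of filter names attached to each stream. The sketch below is an illustration only. The filter names are the standard PDF ones, but the rule itself (flag a stream with no filter, or with any ASCII, Run Length, or LZW filter) is an assumption about how the description above might be applied, and `is_suboptimal` is a hypothetical name.

```python
# Standard PDF filter names for the methods named in the description.
WEAK_FILTERS = {
    "ASCIIHexDecode",   # ASCII encoding, expands the data
    "ASCII85Decode",    # ASCII encoding, expands the data
    "RunLengthDecode",  # Run Length
    "LZWDecode",        # LZW
}


def is_suboptimal(filters):
    """Return True if a stream's filter list suggests weak compression.

    `filters` is the list of filter names on the stream, in order.
    An empty list means the stream is stored uncompressed.
    """
    if not filters:
        return True
    return any(name in WEAK_FILTERS for name in filters)
```

For example, `is_suboptimal(["FlateDecode"])` is false, while `is_suboptimal([])` and `is_suboptimal(["LZWDecode"])` are both true.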

fonts: uses-fonts-not-embedded

fonts: uses-base14fonts-not-embedded

fonts: uses-fonts-fully-embedded

It is a best practice when working with PDF documents to embed, or save, every font used in a PDF document in the document itself. That way, a viewing tool (like Acrobat) does not have to look for a font stored on the local system or choose a substitute font. Use this setting to find out if a PDF is using fonts that are not embedded. Removing an embedded font file from a PDF document can make the document smaller, but this practice can also make the PDF load more slowly. It might also change the appearance of the file if the viewing software can find neither the font it needs on the local machine nor a suitable substitute.

PDF Checker looks in the font dictionaries within the PDF document, and in the /FontDescriptor dictionary each one refers to, to identify the fonts that are in use in that document. Then, it looks to see if those same font files are embedded in that document, or if the viewer will need to access a font from the host machine.

Fonts that are not embedded in the PDF document, either Base 14 fonts or otherwise, and fonts that are embedded, can be listed in the results.

PDF Optimizer can be used to subset fonts that are fully embedded in PDF files. Subsetting fonts can reduce the size of PDF files by reducing the fonts that are in those PDFs to only those characters that are actually used in the PDF. However, subsetting can impact the ability to later edit the text within a PDF file. Subsetting should be used when a PDF is intended for final-form distribution rather than for editing.
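The distinction between the three font checks above can be sketched as follows. This is an illustration under stated assumptions, not PDF Checker's implementation. It assumes a font counts as embedded when its font descriptor has a FontFile, FontFile2, or FontFile3 entry, and that a subset font carries the conventional six-letter tag and plus sign before its name (for example `ABCDEF+Helvetica`). The function name and the returned labels are hypothetical.

```python
import re

# The 14 standard fonts that viewers are expected to supply themselves.
BASE14 = {
    "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
    "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
    "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
    "Symbol", "ZapfDingbats",
}

# Font descriptor entries that hold an embedded font program.
FONT_FILE_KEYS = ("FontFile", "FontFile2", "FontFile3")

# Subset fonts are conventionally named with a tag such as "ABCDEF+".
SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")


def classify_font(base_font, descriptor):
    """Classify one font from its name and font descriptor (dict or None)."""
    embedded = descriptor is not None and any(
        key in descriptor for key in FONT_FILE_KEYS
    )
    if embedded:
        if SUBSET_TAG.match(base_font):
            return "subset-embedded"
        return "fully-embedded"
    if base_font in BASE14:
        return "base14-not-embedded"
    return "not-embedded"
```

In this sketch, only fonts classified as "fully-embedded" would be candidates for the subsetting described above.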

fonts: fontdescriptor-missing-fields

fonts: fontdescriptor-missing-capheight

A font descriptor describes the characteristics of a font, as opposed to the widths and characteristics of individual glyphs (characters) within that font set. Font descriptor values include the name of the font, the angle in degrees used for creating italic characters, the maximum height above the baseline that a glyph can reach in this font (ascent) and the maximum depth below that it can reach (descent), and a variety of other values.

If one of the required font descriptor settings is not included in the font descriptor dictionary for a PDF document, PDF Checker can determine that and list the missing values. If a font descriptor value is not provided the PDF document may not work properly in some applications.

The CapHeight is the coordinate showing the placement of the tops of flat capital letters, such as T or R, as measured from the baseline. It is required for all fonts that have Latin characters except for Type 3 fonts (whose glyphs are defined by PDF graphics operators rather than by a font program). Because it is hard to determine whether a given font has Latin characters, a standard search for missing font descriptor values can overlook a missing CapHeight. To avoid that, PDF Checker searches for CapHeight separately.
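The two font descriptor checks can be sketched as simple key lookups. The list of required entries below reflects a reading of the PDF specification for fonts other than Type 3. The exact list PDF Checker uses is not stated here, so treat it as an assumption. Both function names are hypothetical.

```python
# Entries assumed to be required in a font descriptor for non-Type 3 fonts.
# CapHeight is left out on purpose because it is checked separately.
REQUIRED_FIELDS = (
    "Type", "FontName", "Flags", "FontBBox",
    "ItalicAngle", "Ascent", "Descent", "StemV",
)


def missing_fields(descriptor):
    """Return the required entries that are absent from the descriptor."""
    return [key for key in REQUIRED_FIELDS if key not in descriptor]


def missing_capheight(descriptor):
    """Report a missing CapHeight on its own, as described above."""
    return "CapHeight" not in descriptor
```

Keeping CapHeight out of the general list mirrors the reasoning in the text: whether it is required depends on whether the font has Latin characters, which a plain key lookup cannot decide.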

objects: contains-javascript-actions This PDF document contains JavaScript actions, blocks of JavaScript code that may alter the appearance of the document. Common JavaScript actions in PDF documents include submitting a form (a Submit button), accessing a web site from a web address, or sending the document to a printer.

PDF Optimizer can be used to remove JavaScript actions from PDF documents. This can reduce the size of the PDF file and increase compatibility across PDF viewers and processors.

objects: contains-thumbnails Thumbnail images are used to preview pages in a PDF document and appear in a panel on the left side of the viewer window. A user could scroll through a series of thumbnails to find a page he or she is looking for.

PDF Optimizer can be used to remove thumbnails from PDF documents. This can reduce the size of the PDF file.

userdata: contains-annots Annotations are changes you can add to a PDF document, such as notes, highlighted text, file attachments, crossed out text, and text callout boxes.
userdata: contains-annots-not-for-viewing Annotations can be added to a PDF document and hidden, so that they do not appear when the document is opened in a viewing tool.
userdata: contains-annots-not-for-printing Annotations can be added to a PDF document so that they appear on the page in a viewing tool but are not included when the document is sent to a printer.
userdata: contains-annots-without-normal-appearances Every annotation in a PDF document can carry an optional entry that describes what the annotation will look like when the document is rendered in a viewer. This is the normal annotation appearance. Often this value is not provided, so if the PDF is opened in Adobe Acrobat, Acrobat will fill in the appearance based on what the value should be. When you open a PDF document in Adobe Acrobat or Adobe Reader and the viewer fills in the appearance value, you will be prompted to save the file when you close it. If you do save the file, the annotation appearance is made a part of the updated PDF document.
userdata: contains-optional-content Optional content in a PDF document consists of layers of graphics objects or annotations that can be made to appear or disappear when the PDF is opened in Adobe Acrobat or Reader, using the Layers pane.

PDF Optimizer can be used to remove these extra layers from PDF documents. This can reduce the size of PDF files and increase compatibility across PDF viewers and processors.
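The three annotation checks listed earlier (not for viewing, not for printing, and without a normal appearance) map naturally onto entries in the annotation dictionary. The sketch below assumes the annotation is available as a plain dictionary with the standard /F (flags) and /AP (appearance) entries, and uses the flag bits defined in the PDF specification. How PDF Checker itself makes these decisions is not stated in this document, so the rules and the name `annotation_checks` are assumptions.

```python
# Annotation flag bits from the PDF specification (bit 1 is the lowest bit).
HIDDEN = 1 << 1    # bit 2: not displayed and not printed
PRINT = 1 << 2     # bit 3: printed only when this bit is set
NO_VIEW = 1 << 5   # bit 6: not displayed on screen, may still print


def annotation_checks(annot):
    """Evaluate the three annotation checks for one annotation dictionary."""
    flags = annot.get("F", 0)
    appearance = annot.get("AP") or {}
    return {
        "not-for-viewing": bool(flags & (HIDDEN | NO_VIEW)),
        "not-for-printing": not (flags & PRINT) or bool(flags & HIDDEN),
        # The normal appearance is the /N entry of the /AP dictionary.
        "without-normal-appearance": "N" not in appearance,
    }
```

For example, an annotation with `F` equal to 36 (NoView plus Print) is reported as not for viewing but still printable.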

userdata: contains-transparency It is possible to stack objects, such as graphics, images, text boxes, and form fields, on top of each other in a PDF document. These objects can be partially or fully transparent, and thus can interact in various ways with objects behind them. If a set of transparent objects is stacked in a PDF file, each one contributes to the final result that appears on the page, such as the colors blending together into a final color.

PDF Optimizer can be used to flatten the transparency in a PDF document. Flattening replaces the transparent content in that document with an opaque representation that merges and blends together the appearance of the overlapping objects. This can be used to improve the compatibility of PDFs across PDF viewers and processors, and to decrease the amount of time required to render and print PDF pages using transparency.

userdata: contains-private-data Some applications, like Adobe Illustrator, add their own unique values to a PDF document when generating that document. These values are useful to the original software product if the PDF is opened and edited in that product again.

PDF Optimizer may be used to remove private data from PDF files. Removing private data can reduce the size of a PDF document, though it will remove the ability for some applications to edit these PDF files. This should be used for PDF documents that are intended for final-form distribution, rather than for PDFs that are intended for editing.

userdata: contains-metadata Most PDF documents store information that describes that document, such as the author, creation date, and the software used to generate the file.

PDF Optimizer may be used to remove much of the document information and metadata from PDF files, to reduce the size of a PDF document. A minimal set of information will be maintained to ensure maximum compatibility with PDF viewers and processors.

userdata: contains-embedded-files PDF documents can hold other files that are embedded or attached in that document, including other PDF documents, email messages, spreadsheets, graphics files, and the like.

PDF Optimizer may be used to remove embedded files and attachments from PDF documents. This can reduce the size of PDF files significantly, for PDF files where these attachments are not desired or are not intended to be used.

images:color PDF Optimizer provides a variety of different methods for optimizing images that can reduce the size of color images. These optimizations include the ability to downsample (reduce the resolution of) images, the ability to change the compression of color images, and the ability to reduce the color depth of images. These optimizations can be controlled specifically for color images.

Also, PDF Optimizer can be used to perform color conversion on color images in PDF files. This can be used to normalize the color representations used in a PDF document for faster processing, improved compatibility, and decreased file sizes. And color conversion may also be used to transform PDFs into grayscale renditions. This can dramatically speed up printing of many PDF files when sent to black and white printers.

resolution-too-low Number of low-resolution color images present in the document. PDF Checker determines the resolution of each image in the PDF document, and any color image with a resolution below this Trigger DPI value is counted as a low-resolution image. This Trigger value parameter defaults to 150 DPI for color images.
resolution-too-high Number of high-resolution color images present in the document. Any color image PDF Checker finds with a resolution greater than this Trigger DPI value is counted as a high-resolution image. The Trigger value parameter defaults to 600 DPI for color images.
uses-jpeg2000-compression Number of color images in the document using JPEG2000 compression. JPEG2000 is a wavelet-based compression format used for rendering photographs as image files. It is distinct from the older JPEG format, which is also known as DCT (Discrete Cosine Transform).
image-depth PDF Checker lists the number of 16-bit color images found in the PDF document. Image depth here refers to the number of bits used to store each color component of a pixel in a graphic. Color graphics are often 8-bit, but higher-quality images can be 16-bit. A color graphic with 16-bit image depth will usually render better on a screen or when printing, but the image is also much larger, making the PDF document much larger as well.
images:grayscale PDF Optimizer provides a variety of different methods for optimizing images that can reduce the size of grayscale images. These optimizations include the ability to downsample (reduce the resolution of) images and the ability to change the compression of grayscale images. These optimizations can be controlled specifically for grayscale images.
resolution-too-low Number of low-resolution grayscale images present in the document. PDF Checker determines the resolution of each image in the PDF document, and any grayscale image with a resolution below this Trigger DPI value is counted as a low-resolution image. This Trigger value parameter defaults to 150 DPI for grayscale images.
resolution-too-high Number of high-resolution grayscale images present in the document. Any grayscale image PDF Checker finds with a resolution greater than this Trigger DPI value is counted as a high-resolution image. This Trigger value parameter defaults to 600 DPI for grayscale images.
uses-jpeg2000-compression Number of grayscale images in the document using JPEG2000 compression. JPEG2000 is a wavelet-based compression format used for rendering photographs as image files. It is distinct from the older JPEG format, which is also known as DCT (Discrete Cosine Transform).
images:monochrome PDF Optimizer provides a variety of different methods for optimizing images that can reduce the size of monochrome images. These optimizations include the ability to downsample (reduce the resolution of) images and the ability to change the compression of monochrome images. These optimizations can be controlled specifically for monochrome images.
resolution-too-low Number of low-resolution monochrome images present in the document. PDF Checker determines the resolution of each image in the PDF document, and any monochrome image with a resolution below this Trigger DPI value is counted as a low-resolution image. This Trigger value parameter defaults to 200 DPI for monochrome images.
resolution-too-high Number of high-resolution monochrome images present in the document. Any monochrome image PDF Checker finds with a resolution greater than this Trigger DPI value is counted as a high-resolution image. This Trigger value parameter defaults to 1200 DPI for monochrome images.
uses-jbig2-compression Number of monochrome images in the document using JBIG2 compression. JBIG2 is a compression algorithm designed for binary images, or images where each pixel can only have one of two possible colors. PDF Checker reports JBIG2 use for black and white images.
images: alternate-images A PDF document can be set up to specify alternate images, or multiple versions of one image within the same document. These images can be used to meet different needs. For example, a PDF could present one image with a lower resolution for display on a monitor, and an alternate image with a higher resolution to use when the PDF document is sent to a printer. These images can make a PDF document very large. Today alternate images are rarely used.

PDF Optimizer may be used to remove alternate images from PDF files. Removing alternate images can dramatically reduce the size of PDF files that contain these.
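The resolution-too-low and resolution-too-high checks described earlier for color, grayscale, and monochrome images all compare an image's resolution with a Trigger DPI value. A minimal sketch of that comparison follows, using the default Trigger values given in this document. The way the resolution is derived here (pixel dimensions divided by the size at which the image is placed on the page, taking the smaller of the two axes) is an assumption, as are the function names.

```python
# Default Trigger DPI values from this document: (too-low, too-high).
DEFAULT_TRIGGERS = {
    "color": (150, 600),
    "grayscale": (150, 600),
    "monochrome": (200, 1200),
}


def effective_dpi(pixel_width, pixel_height, width_pt, height_pt):
    """Resolution of an image as placed on the page.

    Page sizes are in points, and there are 72 points per inch.
    """
    horizontal = pixel_width / (width_pt / 72.0)
    vertical = pixel_height / (height_pt / 72.0)
    return min(horizontal, vertical)


def classify_resolution(kind, dpi, triggers=DEFAULT_TRIGGERS):
    """Return the check an image would be counted under, or None."""
    low, high = triggers[kind]
    if dpi < low:
        return "resolution-too-low"
    if dpi > high:
        return "resolution-too-high"
    return None
```

An image of 300 by 300 pixels placed in a 2 inch (144 point) square has a resolution of 150 DPI. With the defaults it would not be counted for a color image, but it would be counted as low resolution for a monochrome image.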