Optimizing Content

Comments

Datalogics::PDFL::PDFOptimizerDiscardComments

Adobe Reader and Adobe Acrobat allow users to add annotations to PDF document pages, including content such as notes, highlighted text, images, and callout boxes. You can also add annotations in the form of crossed out text and attached files. These annotations may or may not appear when the document is rendered.

The DiscardComments option removes Markup annotations, which are used to place notes on PDF pages as comments on the content. Once this content is removed from a PDF document, it cannot be recovered.

Default Value PDFOptimizerDiscardComments OFF

Annotations

Datalogics::PDFL::PDFOptimizerDiscardAnnotations

Use DiscardAnnotations to remove all annotations from the pages of a PDF document, including comments, cross-outs, highlighting, and other edits. This option also removes links, AcroForm fields that are filled in with default form information, and special effects for the page, such as movie and audio clips. It also deletes any attached files that were added to the PDF document. Use this option carefully, because annotations cannot be recovered after they are removed from a PDF document.

Default Value PDFOptimizerDiscardAnnotations OFF

Acroforms

Datalogics::PDFL::PDFOptimizerDiscardAcroforms

AcroForm, or Acrobat Form, is the original PDF forms technology. You can use DiscardAcroforms to remove the AcroForm dictionary from a PDF document, and with it any form fields that appear in the document. The option also removes any AcroForm annotations that are not referenced by a page.

If the form fields were filled out and the PDF form document was saved in Adobe Acrobat or Adobe Reader, the values entered in the fields are preserved in the document even when the AcroForm fields themselves are removed. The form field values continue to appear on the page, but that content can no longer be edited.

Default Value PDFOptimizerDiscardAcroforms ON

Embedded Files

Datalogics::PDFL::PDFOptimizerDiscardFileAttachments

This option removes any files attached to the PDF document, as well as files attached to annotations within the document. If a PDF has attached files, this option can make the document significantly smaller. When a file is attached to an annotation on a page, removing it changes the appearance of that page.

Default Value PDFOptimizerDiscardFileAttachments ON

XMP Padding

Datalogics::PDFL::PDFOptimizerDiscardXMPPadding

XMP refers to the Extensible Metadata Platform, a standard created by Adobe Systems to guide the creation, processing, and exchange of metadata for a variety of digital resources. When XMP metadata is included in a PDF document, the application that creates the PDF leaves a “padding area” in the text stream, commonly several KB for each set of XMP metadata. This padding allows the metadata to be edited in place, and expanded if needed, without disturbing the document as a whole. The DiscardXMPPadding option removes this padding from the document.
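
To make the idea of padding concrete, the following Python sketch builds a standalone XMP packet with a run of whitespace before its closing `<?xpacket end="w"?>` marker and then collapses that run. It is an illustration only, not the library's implementation. The names strip_xmp_padding and build_sample_packet are hypothetical, and the sketch ignores the bookkeeping a real PDF would need, such as updating the length of the stream that holds the packet.

```python
import re

# Whitespace run immediately before the trailing <?xpacket end="w"?> (or "r") marker.
_PADDING = re.compile(rb"""\s+(<\?xpacket end=["'][wr]["']\?>)\s*$""")


def strip_xmp_padding(packet: bytes) -> bytes:
    """Collapse the padding before the end-of-packet marker to a single newline."""
    return _PADDING.sub(rb"\n\1", packet, count=1)


def build_sample_packet(padding_bytes: int) -> bytes:
    """Build a tiny XMP packet with the given amount of space padding (hypothetical helper)."""
    header = b'<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
    body = b'<x:xmpmeta xmlns:x="adobe:ns:meta/"></x:xmpmeta>\n'
    padding = b" " * padding_bytes
    trailer = b'<?xpacket end="w"?>'
    return header + body + padding + trailer


packet = build_sample_packet(padding_bytes=2048)
stripped = strip_xmp_padding(packet)
print(len(packet) - len(stripped), "bytes of padding removed")
```

With the padding gone, the metadata can no longer grow in place, which is the trade-off this option accepts in exchange for a smaller file.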

Default Value PDFOptimizerDiscardXMPPadding ON

Metadata Streams

Datalogics::PDFL::PDFOptimizerDiscardMetadata

This option removes all metadata streams within a PDF document (except for the document metadata). Applications that work with the PDF document generally insert these streams to provide additional information about specific constructs within the document.

Default Value PDFOptimizerDiscardMetadata OFF

Document Information

Datalogics::PDFL::PDFOptimizerDiscardDocumentInfo

This option removes all of the Document Information found in a PDF document.

The basic information related to the PDF document is recreated automatically when the document is written, so in effect this option removes only the additional information beyond the minimum set of values that the PDF format requires for any PDF document. The DiscardDocumentInfo option decreases the size of both the document information dictionary and the document-level metadata.

Default Value PDFOptimizerDiscardDocumentInfo OFF

Duplicate Form XObjects

Datalogics::PDFL::PDFOptimizerDiscardDuplicateForm

Some documents hold multiple copies of the same Form XObject. When this option is enabled, these duplicate forms are discarded, and only a single copy of the Form XObject remains.

A Form XObject is a PDF content stream that is a self-contained description of any sequence of graphics objects (including path objects, text objects, and sampled images). Form XObjects are defined in the Resources object in the PDF document; they can be named and can have their own Resources, such as fonts and images. Like images, Form XObjects can be reused repeatedly in the same document.

For more detail, see Section 8.10, “Form XObjects,” in the ISO 32000 Reference, page 217.
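
The idea of keeping a single copy can be sketched in a few lines of Python. This is a conceptual illustration under simplifying assumptions, not the library's algorithm: it treats each Form XObject as a bare byte string and considers two forms duplicates only when their bytes are identical, whereas real Form XObjects also carry dictionaries and resources. The function name deduplicate_streams and the sample names Fm0 through Fm2 are hypothetical.

```python
import hashlib


def deduplicate_streams(
    xobjects: dict[str, bytes],
) -> tuple[dict[str, bytes], dict[str, str]]:
    """Keep one copy of each distinct stream.

    Returns (kept, remap): `kept` maps surviving names to their bytes, and
    `remap` maps every original name to the name that now represents it.
    """
    first_seen: dict[str, str] = {}  # content digest -> first name with that content
    kept: dict[str, bytes] = {}
    remap: dict[str, str] = {}
    for name, data in xobjects.items():
        digest = hashlib.sha256(data).hexdigest()
        survivor = first_seen.setdefault(digest, name)
        if survivor == name:
            kept[name] = data  # first time this content appears
        remap[name] = survivor  # later copies point at the first one
    return kept, remap


forms = {
    "Fm0": b"q 1 0 0 1 0 0 cm /Logo Do Q",
    "Fm1": b"BT /F1 12 Tf (Draft) Tj ET",
    "Fm2": b"q 1 0 0 1 0 0 cm /Logo Do Q",  # same bytes as Fm0
}
kept, remap = deduplicate_streams(forms)
print(sorted(kept), remap["Fm2"])
```

The remap table matters as much as the discard step: every place that drew a removed copy has to be pointed at the surviving one, or the page content would change.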

Default Value PDFOptimizerDiscardDuplicateForm ON

Unused Form XObjects

Datalogics::PDFL::PDFOptimizerDiscardUnusedForms

Sometimes a PDF document may contain Form XObjects that are not displayed on any page. You can use this option to remove these XObjects.
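
One way to picture "not displayed on a page" is as a reachability question: start from the forms that pages draw directly, follow any forms those forms draw in turn, and treat whatever is never reached as unused. The Python sketch below works on a hand-built reference table rather than a real PDF, and it is an assumption about how such a check could work, not a description of the library's behavior. The name find_unused_forms and the sample data are hypothetical.

```python
def find_unused_forms(references: dict[str, set[str]], page_roots: set[str]) -> set[str]:
    """Return the forms that no page reaches, directly or through another form.

    `references` maps each form name to the forms it draws; `page_roots`
    holds the forms drawn directly by page content.
    """
    reachable: set[str] = set()
    stack = list(page_roots)
    while stack:
        name = stack.pop()
        if name in reachable:
            continue  # already visited; also guards against reference cycles
        reachable.add(name)
        stack.extend(references.get(name, ()))
    return set(references) - reachable


references = {
    "Fm0": {"Fm1"},  # drawn by a page, and itself draws Fm1
    "Fm1": set(),
    "Fm2": set(),    # only drawn by Fm3
    "Fm3": {"Fm2"},  # never drawn by any page
}
unused = find_unused_forms(references, page_roots={"Fm0"})
print(sorted(unused))
```

Note that Fm2 counts as unused even though something refers to it, because the only form that draws it is itself never shown.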

Default Value PDFOptimizerDiscardUnusedForms ON
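
For quick reference, the defaults listed in this section can be collected in one table. The Python sketch below only restates those defaults as data and shows how a set of overrides would combine with them. It does not call the Datalogics library, and the helper names effective_settings and enabled_options are hypothetical.

```python
# Default values as listed in this section (True = ON, False = OFF).
DEFAULTS: dict[str, bool] = {
    "PDFOptimizerDiscardComments": False,
    "PDFOptimizerDiscardAnnotations": False,
    "PDFOptimizerDiscardAcroforms": True,
    "PDFOptimizerDiscardFileAttachments": True,
    "PDFOptimizerDiscardXMPPadding": True,
    "PDFOptimizerDiscardMetadata": False,
    "PDFOptimizerDiscardDocumentInfo": False,
    "PDFOptimizerDiscardDuplicateForm": True,
    "PDFOptimizerDiscardUnusedForms": True,
}


def effective_settings(overrides: dict[str, bool]) -> dict[str, bool]:
    """Combine the documented defaults with caller overrides, rejecting unknown names."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


def enabled_options(settings: dict[str, bool]) -> list[str]:
    """List the options that are switched on, in alphabetical order."""
    return sorted(name for name, on in settings.items() if on)


# Example: keep form fields editable, but strip comments.
settings = effective_settings({
    "PDFOptimizerDiscardAcroforms": False,
    "PDFOptimizerDiscardComments": True,
})
for name in enabled_options(settings):
    print(name)
```

Listing the enabled options before optimizing is a cheap safeguard, since several of them (comments, annotations, file attachments) remove content that cannot be recovered afterward.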