More than 25 years after the Portable Document Format was introduced, PDF documents continue to dominate the digital marketplace. Some 2.5 trillion PDF files are produced each year. PDF documents are widely used to distribute financial statements, reports, articles, proposals, and other long documents that feature complex formatting. They appear all over the Internet and are extensively used by businesses and government agencies, frequently to replace printed documents for conveying and storing content. PDFs can be opened in a browser window, and their formatting is preserved when they are sent to a printer. PDF documents are also easy to secure and can be used as electronic forms. You can also add digital signatures to PDF files, thus turning them into legal documents.

But this very popularity creates a problem in managing PDF documents. PDF files can be generated using a dizzying array of software systems and platforms, including many that are out of date, and it is common to find PDF files that are five, ten, or even 15 years old. As a result, it is sometimes hard to tell what is stored inside any given PDF file, and how those features (attached files, graphics, form fields, signatures, and so on) might affect how you can manage or distribute that PDF document. Can you make a massive PDF document smaller without compromising its quality? Can unnecessary features be removed so that the document will print faster or open faster in a browser window? What about fixing errors within the PDF itself?

That’s what PDF CHECKER is for.  PDF CHECKER is a simple scriptable server tool for 64-bit Windows and Linux platforms that allows you to quickly scan a PDF document or set of documents to look for problems, or to simply identify features within a document that are likely to get in your way if you want to use the PDF efficiently.

Then, with this knowledge of your PDF documents, you can use another Datalogics product, PDF OPTIMIZER, to make the changes they need, such as reducing file size or removing unnecessary features, so that they will download faster or open more quickly in a browser window or on a mobile device.

You can enter a PDF CHECKER statement from a command prompt, and manually check one PDF document at a time.  Or you could add a PDF CHECKER command statement to a batch file. With a batch file, you can create a workflow that uses PDF CHECKER to check large numbers of PDF documents automatically.  In this case you might need to write some JavaScript or Python code to automatically generate a series of command line statements, each with a unique name for the input PDF document, JSON profile, output report file, and if necessary, the password to open a PDF input document.
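
As a rough illustration of the batch workflow described above, the following Python sketch builds one command line per PDF file in a folder, each with its own input file, profile, report file, and optional password. The executable name `pdfchecker` and the option names (`--input`, `--profile`, `--output`, `--password`) are placeholders assumed for this example, as are the folder names. Check the product's command line reference for the actual syntax before using it.

```python
from pathlib import Path
import shlex

# ASSUMPTION: the executable name and option names below are placeholders,
# not confirmed PDF CHECKER syntax. Replace them with the real ones.
CHECKER = "pdfchecker"
OPT_INPUT = "--input"
OPT_PROFILE = "--profile"
OPT_OUTPUT = "--output"
OPT_PASSWORD = "--password"


def build_command(pdf, profile, report, password=None):
    """Return the argument list for checking a single PDF document."""
    cmd = [CHECKER, OPT_INPUT, str(pdf), OPT_PROFILE, str(profile),
           OPT_OUTPUT, str(report)]
    if password is not None:
        # Only add the password option for documents that need one.
        cmd += [OPT_PASSWORD, password]
    return cmd


def build_batch(pdf_dir, profile, report_dir, passwords=None):
    """Build one command per *.pdf file in pdf_dir, in sorted order.

    passwords is an optional dict mapping a file name to its password.
    Each report is named after its input file, with a .txt extension.
    """
    passwords = passwords or {}
    commands = []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        report = Path(report_dir) / (pdf.stem + ".txt")
        commands.append(
            build_command(pdf, profile, report, passwords.get(pdf.name))
        )
    return commands


if __name__ == "__main__":
    # Hypothetical folder names; prints shell-quoted lines for a batch file.
    for cmd in build_batch("incoming", "everything.json", "reports"):
        print(shlex.join(cmd))
```

Building each command as a list of arguments, and quoting it only when printing, avoids problems with file names that contain spaces.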

The JSON profile is a file that defines search conditions for PDF CHECKER to use when scanning an input PDF document. JSON, or JavaScript Object Notation, is an open standard file format that relies on human-readable text, and it is often used as an alternative to XML. The custom settings that you provide to PDF CHECKER to manage the process are all defined in this profile. The name of the profile file must be included in the command line statement.

You could, for example, edit your JSON profile so that PDF CHECKER looks for unembedded fonts and reports any it finds as an error or as an “information” statement. The software then generates a report listing the results. By default the results are written to standard output, so you can either view them on the screen or use a program or script to redirect the standard output to an external file. You can also use an option in the command that directs the PDF CHECKER results to a text file that you specify.
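
If you drive the tool from a script, redirecting standard output to a file might look like the minimal Python sketch below. It works for any command line program. The function name `run_and_save` is made up for this example, and the command you pass in would be your own PDF CHECKER statement. The source does not describe what the tool's exit code means, so the sketch simply returns it without interpreting it.

```python
import subprocess


def run_and_save(cmd, report_path):
    """Run cmd and write its standard output to report_path.

    cmd is a list of arguments, for example a PDF CHECKER statement.
    Returns the process exit code without interpreting it.
    """
    with open(report_path, "w", encoding="utf-8") as report:
        # stdout=report sends everything the program prints to the file.
        result = subprocess.run(cmd, stdout=report, check=False)
    return result.returncode
```

This has the same effect as using `>` at a command prompt, but lets a script choose a different report file for each document.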

We provide a default profile file (everything.json) with the product that includes every available search option. Feel free to edit this file or use it as a model for creating your own. You can name your profile files whatever you like and store them wherever you want. But the content of a profile must be valid JSON, and we recommend that you give each profile file the “.json” file extension.

You can use the JSON validator JSONLint to check your JSON syntax.
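
If you would rather check a profile locally, a syntax check can also be sketched with Python's standard `json` module. The function names here are illustrative only. This confirms that the file is well-formed JSON, not that PDF CHECKER recognizes the settings inside it.

```python
import json


def check_profile_text(text):
    """Return (True, "") if text is valid JSON, else (False, error message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno point at where the parser gave up.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, ""


def check_profile_file(path):
    """Read a profile file as UTF-8 text and check its JSON syntax."""
    with open(path, encoding="utf-8") as handle:
        return check_profile_text(handle.read())
```

A common mistake this catches is a trailing comma after the last entry in an object or list, which JSON does not allow.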