PDF Checker

PDF technology today

More than 25 years after the Portable Document Format was introduced, PDF documents continue to dominate the digital marketplace.  Some 2.5 trillion PDF files are produced each year. PDF documents are widely used to distribute financial statements, reports, articles, proposals, and other long documents that feature complex formatting. They appear all over the Internet and are extensively used by businesses and government agencies, frequently to replace printed documents for conveying and storing content. PDFs can be opened in any browser window, and their formatting is perfectly preserved when sent to a printer. PDF documents are also easy to secure, they can be used as electronic forms, and you can add digital signatures to PDF files, thus turning them into legal documents.

But this very popularity makes PDF documents harder to manage. PDF files can be generated by a dizzying array of software systems and platforms, including many that are out of date, and it is common to find PDF files that are five, ten, or even 15 years old. As a result, it is sometimes hard to tell what is stored inside any given PDF file, and how those features (attached files, graphics, form fields, signatures, and so on) might affect how you can manage or distribute that document. Can you make a massive PDF document smaller without compromising its quality? Can unnecessary features be removed so that the document will print faster or open faster in a browser window? What about fixing errors within the PDF itself?

Using PDF Checker

That’s what PDF Checker is for.  PDF Checker is a free, simple scriptable server tool for 64-bit Windows and Linux platforms that allows you to quickly scan a PDF document or set of documents to look for problems, or to simply identify features within a document that are likely to get in your way if you want to use the PDF efficiently.

Then, with this knowledge of your PDF documents, you can use another Datalogics product, PDF Optimizer, to apply fixes and improvements to those documents so that they will download faster or open more quickly in a browser window or mobile device.

You can enter a PDF Checker statement at a command prompt and manually check one PDF document at a time. Or you can add a PDF Checker command statement to a batch file. With a batch file, you can create a workflow that uses PDF Checker to check large numbers of PDF documents automatically. In this case you might need to write some JavaScript or Python code to generate a series of command line statements, each with a unique name for the input PDF document, the JSON profile, the output report file, and, if necessary, the password needed to open the input PDF document.
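As a rough illustration of the batch approach, the Python sketch below builds one command per PDF file in a directory. The option names in OPTION_NAMES are placeholders, not confirmed PDF Checker options, and the function name build_commands and the report naming scheme are likewise hypothetical. Substitute the actual options from the PDF Checker command reference before using anything like this.

```python
from pathlib import Path

# ASSUMPTION: these option names are placeholders, not confirmed
# PDF Checker options. Replace them with the real ones.
OPTION_NAMES = {
    "input": "--input",
    "profile": "--profile",
    "output": "--output",
    "password": "--password",
}


def build_commands(pdf_dir, profile, report_dir, passwords=None,
                   executable="pdfchecker"):
    """Return one argument list per *.pdf file found in pdf_dir.

    passwords maps a PDF file name to the password needed to open it.
    """
    passwords = passwords or {}
    commands = []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        # Give every report a unique name derived from the input file.
        report = Path(report_dir) / (pdf.stem + "_report.txt")
        cmd = [
            executable,
            OPTION_NAMES["input"], str(pdf),
            OPTION_NAMES["profile"], str(profile),
            OPTION_NAMES["output"], str(report),
        ]
        # Add the password only for documents that need one.
        if pdf.name in passwords:
            cmd += [OPTION_NAMES["password"], passwords[pdf.name]]
        commands.append(cmd)
    return commands
```

Each returned list can be passed to subprocess.run, or turned into one line of a batch file with shlex.join.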

Installing PDF Optimizer with PDF Checker

PDF Checker can be used as an independent tool to assess the characteristics of PDF documents, but it was also designed to be used with PDF Optimizer to conditionally apply appropriate fixes and improvements.  For this reason, we provide a free trial version of PDF Optimizer with your PDF Checker installation.

If you install PDF Checker in Windows, you will see an installation screen called Select Additional Tasks.  Here you can check a box to Install PDF Optimizer. The box is checked by default, so PDF Optimizer will install with PDF Checker unless you decide otherwise.

If you install PDF Checker in Linux, you will be asked if you want to install a trial version of PDF Optimizer as well.  Type Y in response on the command line to install PDF Optimizer. You will be prompted to accept a license agreement, and then PDF Optimizer will be installed separately.

You will receive an activation key for your 14-day trial of PDF Optimizer in an email sent to the address that you provided. We provide further activation instructions for PDF Optimizer.

Using the JSON profile

The JSON profile is a file that defines search conditions for PDF Checker to use when scanning an input PDF document. JSON, or JavaScript Object Notation, is an open standard, human-readable text format that is often used as an alternative to XML. All of the custom settings that you provide to PDF Checker to manage the process are defined in this profile. The name of the profile file must be included in the command line statement.

You could, for example, edit your JSON profile so that PDF Checker looks for unembedded fonts and flags any it finds in a report as an error or “information” statement.  Then the software generates a report listing the results. You can display the results as standard output from your Command Line tool, and either list the results on the screen or use a program or script to redirect the standard output to an external file. You can also use an option in the command that directs the PDF Checker results to a text file that you define.
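If you drive PDF Checker from a script, redirecting standard output to a file might look like the following Python sketch. The helper name run_and_save is hypothetical, and the command list is whatever PDF Checker statement you would otherwise type at the prompt. No PDF Checker options are assumed here.

```python
import subprocess


def run_and_save(cmd, report_path):
    """Run cmd and write its standard output to report_path.

    cmd is a list such as ["pdfchecker", ...] holding the executable
    name followed by its arguments. Returns the process exit code.
    """
    with open(report_path, "w", encoding="utf-8") as report:
        completed = subprocess.run(cmd, stdout=report)
    return completed.returncode
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces.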

We provide a default profile file (everything.json) with the product that includes every available search option. Feel free to edit this file or use it as a model for creating your own. You can name your profile file or files whatever you like, and store them wherever you want to. But the content of your profile must be valid JSON content, and we recommend that when you name a profile file you include the “.json” file extension.

If you want to create your own custom profile, we recommend that you save a copy of the default JSON file, rename it, and then edit this copy. This way you will preserve the original JSON profile for later reference. Also, if you install an updated version of the software, the installation process will overwrite the original file, and any changes you made to that file will be lost. Besides saving your own copy of the profile, you might also want to create a backup of your edited JSON profile in a different directory.
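The copy-then-edit routine described above can be scripted. This is a minimal Python sketch; the function names copy_profile and backup_profile are hypothetical, and the paths you pass in depend on where the default profile was installed on your system.

```python
import shutil
from pathlib import Path


def copy_profile(default_profile, custom_profile):
    """Copy the default profile to a new name, leaving the original untouched."""
    custom_profile = Path(custom_profile)
    if custom_profile.exists():
        # Refuse to overwrite a profile that may already contain edits.
        raise FileExistsError(f"{custom_profile} already exists")
    custom_profile.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(default_profile, custom_profile)
    return custom_profile


def backup_profile(custom_profile, backup_dir):
    """Save a copy of an edited profile in a separate directory."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / Path(custom_profile).name
    shutil.copy2(custom_profile, target)
    return target
```

Keeping custom profiles outside the installation directory means a later software update cannot overwrite them.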

You can use the JSON validator JSONLint to check your JSON syntax.
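If you prefer to check a profile locally, Python's standard json module can serve the same purpose. The sketch below uses a hypothetical helper name, check_profile_syntax, and confirms only that the file is well-formed JSON. It does not check whether the settings are ones that PDF Checker recognizes.

```python
import json


def check_profile_syntax(path):
    """Return (True, None) for valid JSON, else (False, error description)."""
    try:
        with open(path, encoding="utf-8") as handle:
            json.load(handle)
    except json.JSONDecodeError as err:
        # lineno and colno point at the place where parsing failed.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, None
```

A trailing comma after the last item in an object or list is a common mistake that this check will report.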

Note that the Windows installation program for PDF Checker adds the location of the PDF Checker executable to the PATH in the Windows Environment Variables, so you can run “pdfchecker.exe” from anywhere.  For Linux, if you want to run the executable from anywhere, you need to manually add the location of the PDF Checker executable to the PATH variable.
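To confirm that the executable can be found through PATH, a script can ask Python's shutil.which before trying to run it. This is a small sketch with a hypothetical helper name, find_executable. The default name "pdfchecker" is assumed from the Windows file name mentioned above and may differ on your system.

```python
import shutil


def find_executable(name="pdfchecker"):
    """Return the full path of name if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_executable()
    if location is None:
        print("pdfchecker was not found on PATH")
    else:
        print(f"pdfchecker found at {location}")
```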