Adobe PDF Library

Redacting Text from a PDF Document


Documents that contain private, sensitive, or classified information must sometimes be edited before they are published or distributed. The editing must leave the original form of the content intact while deliberately blacking out, or redacting, selected words or passages. Use the Redactions sample program to search a PDF document for words that need to be kept hidden and obscure them.

This sample opens an input PDF, searches for specific words using the Adobe PDF Library PDFWordFinder, and then removes these words from the text. The Adobe Acrobat PDWordFinder object can identify all of the words in a PDF document and create a list or table of those words, including the pages where each word appears and the position of each word on its page. The PDFWordFinder reports the location of each word as a Quad. Each quad describes a four-point rectangle at the word's physical position in the document.
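The idea of a word list with per-word quads can be illustrated with a small, library-independent sketch. Everything in it is an assumption made for illustration: the `Quad` and `FoundWord` types, the `find_matches` helper, and the case-insensitive comparison are hypothetical and are not part of the Adobe PDF Library API. It only shows how a table of words, pages, and four-point regions might be filtered for the words to redact.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # (x, y) in page coordinates


@dataclass
class Quad:
    """Hypothetical four-point region that bounds a word on a page."""
    top_left: Point
    top_right: Point
    bottom_left: Point
    bottom_right: Point

    def bounding_box(self) -> Tuple[float, float, float, float]:
        """Return (left, bottom, right, top) enclosing all four corners."""
        corners = (self.top_left, self.top_right,
                   self.bottom_left, self.bottom_right)
        xs = [p[0] for p in corners]
        ys = [p[1] for p in corners]
        return (min(xs), min(ys), max(xs), max(ys))


@dataclass
class FoundWord:
    """Hypothetical word-list entry: the text, its page, and its quads."""
    text: str
    page: int
    quads: List[Quad]


def find_matches(words: Iterable[FoundWord],
                 targets: Iterable[str]) -> List[FoundWord]:
    """Keep only the words whose text matches a target (case-insensitive,
    which is an assumption of this sketch)."""
    wanted = {t.lower() for t in targets}
    return [w for w in words if w.text.lower() in wanted]


# Illustrative data standing in for the output of a word finder.
sample_quad = Quad((72.0, 710.0), (100.0, 710.0), (72.0, 698.0), (100.0, 698.0))
word_list = [
    FoundWord("rain", 0, [sample_quad]),
    FoundWord("sunny", 0, [sample_quad]),
    FoundWord("Cloudy", 1, [sample_quad]),
]
matches = find_matches(word_list, ["rain", "cloudy"])
```

In this sketch, `matches` holds the entries for "rain" and "Cloudy", and each entry's quads identify the regions that a redaction would need to cover.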

The sample removes the words “rain” and “cloudy” from the document. For the word “cloudy,” the sample shows how to change the display details of the redaction, such as changing the color of the redaction box from the default black to red.
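The per-word display override described above can be sketched as a simple lookup with a default. This is a minimal illustration, not the sample's actual code: `RedactionSpec`, `plan_redactions`, and the RGB color tuples are hypothetical names and representations chosen for this example, and no Adobe PDF Library calls are made.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Color = Tuple[float, float, float]  # RGB components in the range 0.0 to 1.0

BLACK: Color = (0.0, 0.0, 0.0)  # default redaction box color
RED: Color = (1.0, 0.0, 0.0)    # override used for one word below


@dataclass
class RedactionSpec:
    """Hypothetical description of one redaction to create."""
    word: str
    page: int
    fill_color: Color


def plan_redactions(found: List[Tuple[str, int]],
                    overrides: Dict[str, Color],
                    default: Color = BLACK) -> List[RedactionSpec]:
    """Build one spec per found (word, page) pair, using the override
    color for a word when one is given and the default otherwise."""
    return [
        RedactionSpec(word, page, overrides.get(word.lower(), default))
        for word, page in found
    ]


# "rain" keeps the default black box; "cloudy" is drawn in red.
plan = plan_redactions(
    [("rain", 0), ("cloudy", 0), ("rain", 1)],
    overrides={"cloudy": RED},
)
```

Keeping the color choice in a separate mapping means that further words can be given their own appearance without changing the code that builds the redactions.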

The sample defines three optional PDF documents, one input and two output. The text in the input document is redacted, and the program then saves the document. One of the output documents is saved with the redacted values applied, and the other is saved without the redacted values.