Adobe PDF Library

Exporting Images from PDF Files


View Sample Code

This sample program reads the pages of the PDF file that you provide, extracts the images it finds on each page, and saves each image to external graphics files, one each for TIF, JPG, PNG, GIF, and BMP. More specifically, the program examines the content stream for image elements and exports those image objects. If a page in a PDF file has three images, the program creates three sets of graphics files, one set per image. The sample program ignores text and parses the PDF syntax to identify only the raster images found on each page.
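As a rough illustration of saving one image in several formats, the sketch below encodes a single bitmap in the five formats named above. It uses the JDK's standard ImageIO class rather than the Adobe PDF Library API, and the class name, method names, and stand-in image are all assumptions made for this example.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.imageio.ImageIO;

// Illustration only: uses the JDK's ImageIO, not the Adobe PDF Library API.
class MultiFormatExportSketch {
    // The five formats named in the text, as ImageIO format names.
    static final String[] FORMATS = {"tif", "jpg", "png", "gif", "bmp"};

    /**
     * Encodes one image in every format and returns the encoded size in bytes,
     * or -1 where this Java runtime has no writer for the format
     * (for example, TIFF before Java 9).
     */
    static Map<String, Integer> encodeAll(BufferedImage image) {
        Map<String, Integer> sizes = new LinkedHashMap<>();
        for (String format : FORMATS) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try {
                boolean written = ImageIO.write(image, format, out);
                sizes.put(format, written ? out.size() : -1);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return sizes;
    }

    /** Hypothetical stand-in for an image extracted from a page. */
    static BufferedImage sampleImage() {
        BufferedImage image = new BufferedImage(16, 16, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                image.setRGB(x, y, 0x3366CC);
            }
        }
        return image;
    }

    public static void main(String[] args) {
        encodeAll(sampleImage()).forEach((format, size) ->
            System.out.println(format + ": "
                + (size < 0 ? "no writer available" : size + " bytes")));
    }
}
```

The sketch writes to memory instead of to files so that it leaves nothing behind; the sample program itself saves each image to a file on disk.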

The program also creates two sets of TIF files. It generates a single-image TIF file for each image found in the original PDF file, and it generates at least one multi-page TIF file that includes all of the images in the original PDF.

This line of code creates the multi-page TIF file:

ic.Save("ImageExport-page" + pgno + "out-.tif", ImageType.TIFF);

This creates a TIF file with a single image:

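if (exporttype == ImageType.TIFF)
{
  // Save a single image as an LZW-compressed TIF file.
  isp = new ImageSaveParams();
  isp.Compression = CompressionCode.LZW;
  // The receiver of Save() is missing from the original text; "img" is an assumed name for the image being saved.
  img.Save("ImageExport-out" + next + ".tif", exporttype, isp);
}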

The ImageExport program also ignores any vector images. If you include a Windows Metafile (WMF) clip art image in a document and then turn that document into a PDF, the export program will not find and convert the WMF image into a separate graphics file.

To convert an entire page of a PDF document to a bitmap, see the DocToImages sample program.


View Sample Code

This sample program searches through the PDF file that you select and identifies raster drawings, diagrams and photographs among the text.  Then, it extracts these images from the PDF file and copies them to a separate set of graphics files in the same directory. Vector images, such as clip art, will not be exported.

The program defines a default PDF input file name. You can enter your own file name in the program itself or provide a PDF file name on the command line. The program generates a series of BMP graphics files based on the images drawn from the PDF file. You should see two matching sets of files. The “Extract” files are direct copies of the original graphics images, while for the “Extracted” files the graphics resolution is increased by 500 percent. As a result, the second set of “Extracted” bitmap files is much larger than the first set of BMP files. When you open these files, you will see that the images are considerably larger as well.


View Sample Code

Run this program to read a PDF file from a stream and extract its content. The program defines a default PDF file to use for input. You can enter your own PDF file name in the program code if you like, or provide a PDF file name on the command line. The program generates two output PDF files, each with the message “This PDF was opened from a Stream.” on every page.

StreamIO is similar to the ImageFromStream sample program, in that StreamIO searches for streams within the PDF file and then extracts the content.  A stream is a string of bytes of any length, embedded in a PDF document with a dictionary that is used to interpret the values in the stream.  A stream can hold anything, but in this example a stream holds characters that describe a digital image in binary terms.

StreamIO is also distinct in that it shows how to work with streams through a pair of stream classes, FileStream and MemoryStream. These are the .NET class names used in C#; Java provides comparable stream classes under different names. A FileStream reads its data from a file on disk, such as the PDF file itself, and a MemoryStream holds stream data in system memory. Together, these two classes allow you to draw information from a PDF file and store it in memory without actually opening the PDF file in Adobe Reader or Acrobat.
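The general pattern of pulling a file into memory through streams can be sketched in Java as follows. This is a minimal sketch that uses only standard JDK classes (FileInputStream, ByteArrayOutputStream, and ByteArrayInputStream as rough counterparts of FileStream and MemoryStream). It does not call the Adobe PDF Library, and the class and method names are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustration only: standard JDK streams, not the Adobe PDF Library API.
class StreamCopySketch {
    /** Reads every byte of the file into memory and returns an in-memory stream over it. */
    static ByteArrayInputStream loadIntoMemory(Path file) {
        try (InputStream in = new FileInputStream(file.toFile());
             ByteArrayOutputStream buffer = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[8192];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new ByteArrayInputStream(buffer.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the data to a temporary file, loads it back, and reports how many bytes are in memory. */
    static int roundTripLength(byte[] data) {
        try {
            Path temp = Files.createTempFile("stream-sketch", ".bin");
            try {
                Files.write(temp, data);
                return loadIntoMemory(temp).available();
            } finally {
                Files.deleteIfExists(temp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] header = "%PDF-1.7".getBytes(StandardCharsets.US_ASCII);
        System.out.println("Bytes held in memory: " + roundTripLength(header));
    }
}
```

Once the bytes sit in an in-memory stream, any code that accepts an input stream can consume them without going back to the file on disk.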


View Sample Code

This program is only available as a Java program. The comparable program in C# is ImageFromStream.

A byte array is a contiguous section of memory that holds raw data as a series of bytes.

This sample program offers an alternate method for drawing a graphic image from a PDF; rather than exporting the embedded image itself, this program interprets a description of the image from a Byte Array and uses that description to create the image as a graphics file.
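The idea of building an image from a byte array can be illustrated with the JDK alone. The sketch below encodes a small bitmap as PNG bytes and then reconstructs an image from those bytes. It is not the sample program's code and does not use the Adobe PDF Library; the class and method names are assumptions made for this example.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Illustration only: standard JDK ImageIO, not the Adobe PDF Library API.
class ImageFromBytesSketch {
    /** Encodes the image as PNG and returns the raw bytes. */
    static byte[] encodePng(BufferedImage image) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            ImageIO.write(image, "png", out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Rebuilds an image from a byte array that holds an encoded image. */
    static BufferedImage decode(byte[] data) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(data));
            if (image == null) {
                throw new IllegalArgumentException("Bytes are not in a supported image format");
            }
            return image;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        BufferedImage original = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        byte[] bytes = encodePng(original);
        BufferedImage rebuilt = decode(bytes);
        System.out.println(bytes.length + " bytes decoded to an image of "
            + rebuilt.getWidth() + " x " + rebuilt.getHeight() + " pixels");
    }
}
```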

When you run the program it will generate a single PDF as an output file.


View Sample Code

Color management becomes important whenever you send a color drawing, photograph, or diagram from a workstation to a printer, save it from a scanner to a graphics file, or download it from a digital camera to a computer so that you can edit it or insert it in a document. You want the colors in the image to remain the same regardless of the hardware or software you are using, and you want the image to look the same on the screen as it does on paper. To reach this goal, a group of digital content vendors formed the International Color Consortium (ICC) in 1993 to create a universal color standard that would work transparently across all operating systems and software packages. The ICC standard is usually built into the software that drives the printer, scanner, camera, or computer hardware; if a device does not have an operating system (like an inexpensive scanner or printer), the color standard is defined in the software used to edit the graphics file, such as Adobe InDesign or Adobe Photoshop. This way the color standard applies regardless of the hardware involved.

The ImageEmbedICCProfile sample program demonstrates how to embed an ICC color profile in a graphics file. The program sets up how the output will be rendered and generates a TIF image file or series of TIF files as output.

The sample program defines a default PDF input file and the default name of the profile file. You can change the program to include the name of your own PDF source file, or you can provide this name on the command line.

The profile file name (profilename) refers to the name of the ICC color profile, a file that describes a color space. A hardware device or software product uses a color profile to interpret the colors in a graphic and translate them so that they can be presented accurately, down to the RGB (Red/Green/Blue) or CMYK (Cyan/Magenta/Yellow/Black) values in each individual pixel. Our sample program uses this profile to convert colors from the graphics found in the source PDF file to an export TIF image file. But the profile could also be used to convert colors from an image created by a scanner to a file that can be displayed on a monitor, or to send a JPG photo from a digital camera to a printer.
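To get a feel for what an ICC profile describes, the short Java sketch below loads the sRGB profile that ships with the JDK and reports its color space type and number of color components. It relies on the standard java.awt.color classes, not on the Adobe PDF Library, and the class and method names are assumptions made for this example.

```java
import java.awt.color.ColorSpace;
import java.awt.color.ICC_Profile;

// Illustration only: standard JDK color classes, not the Adobe PDF Library API.
class IccProfileSketch {
    /** Returns the sRGB profile built into the JDK. */
    static ICC_Profile builtInSrgb() {
        // To read a profile from disk instead, the JDK also offers
        // ICC_Profile.getInstance(String fileName), which throws IOException.
        return ICC_Profile.getInstance(ColorSpace.CS_sRGB);
    }

    /** Summarizes the color space that the profile describes. */
    static String describe(ICC_Profile profile) {
        int type = profile.getColorSpaceType();
        String name;
        if (type == ColorSpace.TYPE_RGB) {
            name = "RGB";
        } else if (type == ColorSpace.TYPE_CMYK) {
            name = "CMYK";
        } else {
            name = "other";
        }
        return name + ", " + profile.getNumComponents() + " components per pixel";
    }

    public static void main(String[] args) {
        System.out.println(describe(builtInSrgb()));
    }
}
```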

The profile file name ends with the “.icc” suffix, as in:


We assume that you have your own profile file for use with color conversions, but we also provide a set of sample ICC files with the Adobe PDF Library.  The sample program is set up to default to one of these ICC files.  When you run the program, it will use this ICC file:


This file is found in the input files directory, APDFL15.0.4/Resources/Sample_Input.

If you want to configure the program to work with a different ICC profile file, manually enter the name of the ICC profile you want in the “String profilename” setting in

The program generates a series of TIF image files as output, labeled by rendering intent as abs, per, rel, and sat:

Abs: Absolute Colorimetric. Graphic artists often use the absolute method in working with drawings and designs, where they need to select an exact color. It is also used in proofing. In this case colors are converted in absolute terms, in that a given color is always changed by selecting a defined match. This method does not use a conversion algorithm to select the closest color available, and thus is useful for matching to an exact specified color, such as IBM Blue.
Per: Perceptual. Generally used for photography, this method does not map colors one for one, but makes a sort of “best guess” to match colors. Hence it often provides the most pleasing result, but not necessarily the most accurate.
Rel: Relative Colorimetric. Generally used for photography, the relative method uses an algorithm to select the closest possible color match. The goal is to be true to the specified color. This is the default method used in most systems.
Sat: Saturation. Commonly used in charts and diagrams that have a limited palette of colors, where the colors need to be intense and the hue is not as important.
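The four labels can be modeled as a small lookup, sketched below in Java. The numeric codes are the rendering intent values defined for the ICC profile header. The enum, the method names, and the output file name pattern are hypothetical and are not taken from the sample program.

```java
import java.util.Locale;

// Illustration only: a hypothetical model of the four output labels.
class RenderingIntentSketch {
    enum Intent {
        ABSOLUTE_COLORIMETRIC("abs", 3),
        PERCEPTUAL("per", 0),
        RELATIVE_COLORIMETRIC("rel", 1),
        SATURATION("sat", 2);

        final String suffix;  // label used in the list above
        final int iccCode;    // rendering intent value in an ICC profile header

        Intent(String suffix, int iccCode) {
            this.suffix = suffix;
            this.iccCode = iccCode;
        }

        /** Looks up an intent by its three-letter label, ignoring case. */
        static Intent fromSuffix(String suffix) {
            String wanted = suffix.toLowerCase(Locale.ROOT);
            for (Intent intent : values()) {
                if (intent.suffix.equals(wanted)) {
                    return intent;
                }
            }
            throw new IllegalArgumentException("Unknown rendering intent label: " + suffix);
        }
    }

    /** Builds a hypothetical output file name; the real sample may name its files differently. */
    static String outputName(String baseName, Intent intent) {
        return baseName + "-" + intent.suffix + ".tif";
    }

    public static void main(String[] args) {
        for (Intent intent : Intent.values()) {
            System.out.println(outputName("page1", intent) + " uses " + intent
                + " (ICC code " + intent.iccCode + ")");
        }
    }
}
```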


View Sample Code

This sample shows how to rasterize a page from a PDF document and save that page as an image file. The sample demonstrates using the PageImageParams object with the GetImage and GetBitmap methods.

The program creates three images:

  1. An output image with a pixel width and resolution that you select. In the example, the default values are 400 pixels wide and 300 DPI resolution. The actual size of the image in KB or MB is determined based on these two settings.
  2. An output image half the physical size of a PDF page at a specific resolution. So if the original page is 8.5 x 11 inches, the size of a JPG or PNG output file would be 4.25 x 5.5 inches. The method used in this sample, CreatePageImageBasedOnPhysicalSize, does not provide an input for the physical size of the output image file. Rather, you can enter a scale value (defaults to 0.5) and a resolution (defaults to 96 DPI). The example shows how to calculate the width, in pixels, of the resulting graphics image file.
  3. An output image file with content drawn from an unrotated PDF page, but that contains only the top half of the original page.
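The pixel arithmetic behind the first two images can be worked through in a few lines of Java. This is a sketch under stated assumptions: the page is US Letter (612 x 792 points at 72 points per inch), and the class and method names are hypothetical. They do not reproduce the signature of CreatePageImageBasedOnPhysicalSize.

```java
// Illustration only: plain arithmetic, not the Adobe PDF Library API.
class PageImageSizeSketch {
    static final double POINTS_PER_INCH = 72.0;

    /** Pixels needed to render a page dimension (in points) at the given scale and resolution. */
    static int pixels(double pagePoints, double scale, double dpi) {
        double inches = pagePoints / POINTS_PER_INCH * scale;
        return (int) Math.round(inches * dpi);
    }

    /** Physical size, in inches, of a pixel dimension at the given resolution. */
    static double inches(int pixelCount, double dpi) {
        return pixelCount / dpi;
    }

    public static void main(String[] args) {
        // Image 2: a Letter page at scale 0.5 and 96 DPI.
        int width = pixels(612, 0.5, 96);   // 8.5 in * 0.5 * 96 = 408
        int height = pixels(792, 0.5, 96);  // 11 in * 0.5 * 96 = 528
        System.out.println("Half-size image: " + width + " x " + height + " pixels");

        // Image 1: 400 pixels wide at 300 DPI covers 400 / 300 inches of width.
        System.out.println("400 px at 300 DPI is " + inches(400, 300) + " inches wide");
    }
}
```

Scale and resolution multiply, so doubling either one doubles each pixel dimension and roughly quadruples the pixel count.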