PDF Alchemist

Getting Started

An installation of PDF Alchemist includes both a command line utility, provided as an executable file, and an API that you can use to build your own application for converting PDF documents to HTML, XML, or EPUB export files.

PDF Alchemist is provided for Windows and Linux.

The PDF Alchemist library file is used by both the command line utility and the API.

Linux

For Linux, the installation file is a compressed TAR file (.tgz).

Installation directory: PDF_Alchemist_Pro or PDF_Alchemist_Premium, under the current working directory

Library files:
libPDFAlchemist.so: A shared library containing the logic for processing PDF files
dltesseract3: The shared library for the Optical Character Recognition (OCR) tool provided with PDF Alchemist

Windows

For Windows, the installation file is a self-extracting zip file.

Installation directory: PDF Alchemist Pro or PDF Alchemist Premium

Library and executable files:
PDFAlchemist.dll: Stored in the root directory of the download files
PDFAlchemist.exe: Program executable file for the command line utility
dltesseract3.dll: The shared library for the Optical Character Recognition (OCR) tool provided with PDF Alchemist

Requirements

Linux: Verified on Ubuntu 14.04 and Red Hat Enterprise Linux 7. Other Linux distributions running Linux kernel 3.2 or higher are also supported.
Windows: Windows 7 or higher is required.
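
The Linux kernel requirement above can be checked before installing. The following Python sketch is not part of PDF Alchemist; the helper name kernel_at_least is hypothetical, and it assumes the release string starts with "major.minor", as the output of uname -r typically does.

```python
import platform
import re


def kernel_at_least(release, minimum=(3, 2)):
    """Return True if a kernel release string such as '3.13.0-24-generic'
    is at least the given (major, minor) version."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return (int(match.group(1)), int(match.group(2))) >= minimum


if __name__ == "__main__":
    # Only meaningful on Linux; other platforms report unrelated version strings.
    if platform.system() == "Linux":
        release = platform.release()
        status = "meets" if kernel_at_least(release) else "is below"
        print("kernel", release, status, "the 3.2 minimum")
```

The check compares only the major and minor numbers, which is all the stated requirement depends on.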

cmaps

The cmaps folder holds CMAP files.

CMAP stands for Character Map. A CMAP is not a font itself: some fonts in PDF files use predefined mappings between character encodings and specific, predefined character identifier sets, and each CMAP file holds one such mapping. CMAP files are used mainly for Chinese/Japanese/Korean (CJK) text. These mappings (CMaps) are a standard part of the PDF specification. They are needed to properly process PDF documents that contain CJK language characters but do not include embedded Unicode conversion settings. Because of their large size, these character maps are stored in files that are provided with Acrobat and other Adobe Systems products, and are usually referenced by name from PDF files when needed.

SDK

This folder holds the files needed to integrate the PDF Alchemist SDK into your own application. It contains PDFAlchemist.h, the header file that declares the PDF Alchemist API, and PDFAlchemist.lib (Windows) or PDFAlchemist.so (Linux), the link library for PDF Alchemist. Use the link library to link your program with PDF Alchemist if you do not load the PDF Alchemist library dynamically at run time.
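
To illustrate what loading the library dynamically at run time means, here is a minimal Python sketch using the standard ctypes module. It only opens the library file and does not call any PDF Alchemist function, because the API declared in PDFAlchemist.h is not described in this section. The helper names library_filename and load_library are hypothetical; the file names are the ones listed for each platform earlier in this guide.

```python
import ctypes
import os
import sys


def library_filename(platform_name=sys.platform):
    """Return the PDF Alchemist library file name for the given platform."""
    if platform_name.startswith("win"):
        return "PDFAlchemist.dll"
    return "libPDFAlchemist.so"


def load_library(install_dir):
    """Try to load the library from install_dir; return the handle or None."""
    path = os.path.join(install_dir, library_filename())
    try:
        return ctypes.CDLL(path)
    except OSError as err:
        # Raised when the file is missing or one of its dependencies cannot be found.
        print("could not load", path, "-", err)
        return None
```

A program that links against the SDK link library instead has the library resolved by the system loader when the program starts, so no explicit load call like this is needed.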

tessdata

This folder holds test and language files to support the Optical Character Recognition (OCR) tool.

Output

By default, PDF Alchemist writes two files and two folders. The generated HTML file is called page1.html, and it is accompanied by a cascading style sheet, stylesheet.css.

The folders named /fonts and /images contain extracted and generated fonts and images that are referenced by the output and CSS files. If you convert a PDF form file into an HTML form, PDF Alchemist will also write a style sheet file called AcroForm.css.

If your input PDF contains bookmarks, PDF Alchemist will also write a file called bookmarks.html, which holds a set of links from the bookmarks in the PDF document to the corresponding sections in the HTML output file.

When you use PDF Alchemist to generate EPUB output, all of the necessary files will be stored within the EPUB file written to the output directory.
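
For HTML output, the default layout described above can be verified after a conversion. The following Python sketch is an illustration only; the function name inspect_output is hypothetical, and it checks just the file and folder names mentioned in this section.

```python
import os

# File and folder names taken from the description above.
REQUIRED_FILES = ("page1.html", "stylesheet.css")
REQUIRED_DIRS = ("fonts", "images")
OPTIONAL_FILES = ("AcroForm.css", "bookmarks.html")


def inspect_output(output_dir):
    """Report which expected HTML output items exist in output_dir."""
    missing = [name for name in REQUIRED_FILES
               if not os.path.isfile(os.path.join(output_dir, name))]
    missing += [name + "/" for name in REQUIRED_DIRS
                if not os.path.isdir(os.path.join(output_dir, name))]
    optional = [name for name in OPTIONAL_FILES
                if os.path.isfile(os.path.join(output_dir, name))]
    return {"missing": missing, "optional_present": optional}
```

An empty "missing" list means the default files and folders are all present. AcroForm.css and bookmarks.html appear under "optional_present" only when the input PDF contained a form or bookmarks.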

Windows Specific Issues

Run the vcredist_x64.exe installer program before running PDF Alchemist. The file is found in the root directory.

Note: Make sure to install the Visual Studio 2013 C++ runtime support files. If you don’t, you may see an error message about a missing DLL file called MSVCP120.dll.

Linux Specific Issue

On Linux, if you see this error while running PDF Alchemist:

./PDFAlchemist: error while loading shared libraries: libPDFAlchemist.so: cannot open shared object file: No such file or directory

Add the directory containing libPDFAlchemist.so to your LD_LIBRARY_PATH variable.
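
When PDF Alchemist is launched from a script, the same fix can be applied to the child process only, without changing the shell environment. This is a minimal Python sketch under two assumptions: the executable is named PDFAlchemist, as in the error message above, and it sits in the same directory as libPDFAlchemist.so. The helper names are hypothetical, and the command line arguments are passed through unchanged because they are not described in this section.

```python
import os
import subprocess


def env_with_library_dir(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir prepended to
    LD_LIBRARY_PATH, keeping any existing entries."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = lib_dir + (":" + existing if existing else "")
    return env


def run_alchemist(install_dir, args):
    """Run the command line utility with install_dir on the loader search path."""
    exe = os.path.join(install_dir, "PDFAlchemist")  # assumed location
    return subprocess.run([exe] + list(args),
                          env=env_with_library_dir(install_dir))
```

Prepending the directory, rather than replacing the variable, preserves any library paths that other tools already rely on.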

Additional Linux dependencies

PDF Alchemist needs the following shared libraries at run time. These versions have been tested and are known to work with the product.

Dependency        Description
libpng15.so.15    PNG library
libz.so.1         zlib compression library
libxml2.so.2      XML library
libpthread.so.0   POSIX threads (pthread) library
libGL.so.1        OpenGL library
libGLU.so.1       OpenGL utility library
libstdc++.so.6    GNU standard C++ library
libm.so.6         C math library
libgcc_s.so.1     GNU Compiler Collection (GCC) runtime support library
libc.so.6         Standard C library
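
One way to confirm that the libraries listed above are installed is to ask the dynamic loader to open each of them. The following Python sketch is an illustration, not a tool shipped with PDF Alchemist; the function name missing_libraries is hypothetical, and the loader parameter exists only so the check can be exercised without the real libraries.

```python
import ctypes

# Shared library names from the dependency list above.
DEPENDENCIES = [
    "libpng15.so.15", "libz.so.1", "libxml2.so.2", "libpthread.so.0",
    "libGL.so.1", "libGLU.so.1", "libstdc++.so.6", "libm.so.6",
    "libgcc_s.so.1", "libc.so.6",
]


def missing_libraries(names=DEPENDENCIES, loader=ctypes.CDLL):
    """Return the names that the dynamic loader cannot open."""
    missing = []
    for name in names:
        try:
            loader(name)
        except OSError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_libraries():
        print("missing:", name)
```

Running the standard ldd command on libPDFAlchemist.so gives similar information, marking unresolved libraries as "not found".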