PDF Alchemist

Getting Started

A PDF Alchemist installation includes a command line utility, provided as an executable file, and an API that you can use to build your own application for converting PDF documents to HTML, XML, or EPUB export files.

PDF Alchemist is provided for Windows and Linux.

The PDF Alchemist library file is used by both the command line utility and the API.

Linux

For Linux, the installation file is a self-extracting .bsx script file. Copy the PDF Alchemist delivery package to your preferred location and run the script. It unpacks into a subdirectory called PDFAlchemist_Premium under the current working directory.

Installation directory: PDF_Alchemist_Premium, under the current working directory

Library file          Description
libPDFAlchemist.so    A shared library containing the logic for processing PDF files
dltesseract3          The shared library for the Optical Character Recognition (OCR) tool provided with PDF Alchemist

Windows

For Windows, the installation file is a self-extracting zip file.

Installation directory: PDF Alchemist Premium

File                Description
PDFAlchemist.dll    Stored in the root directory of the download files
PDFAlchemist.exe    Program executable file for the command line utility
dltesseract3.dll    The shared library for the Optical Character Recognition (OCR) tool provided with PDF Alchemist

Requirements

Linux: Verified on Ubuntu 14.04 and Red Hat Enterprise Linux 7. Other Linux versions using Linux kernel 3.2 or higher are also supported.

Windows: Windows 7 or higher is required.
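
The requirements above can be checked before installation with a short script. The following is a minimal Python sketch, not part of the product: the function names version_at_least and platform_supported are hypothetical, and the check assumes that Windows 7 reports itself as version 6.1, which is how that release identifies itself internally.

```python
import platform
import re


def version_at_least(release, minimum=(3, 2)):
    """Return True if a release string such as '3.13.0-24-generic'
    starts with a major.minor version that is at least `minimum`."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return False  # unrecognized format: treat as unsupported
    return (int(match.group(1)), int(match.group(2))) >= minimum


def platform_supported():
    """Rough check of the documented requirements (hypothetical helper)."""
    system = platform.system()
    if system == "Linux":
        # Documented requirement: Linux kernel 3.2 or higher.
        return version_at_least(platform.release(), minimum=(3, 2))
    if system == "Windows":
        # Assumption: Windows 7 corresponds to internal version 6.1.
        return version_at_least(platform.version(), minimum=(6, 1))
    return False  # PDF Alchemist is provided for Windows and Linux only


if __name__ == "__main__":
    print("Platform meets documented requirements:", platform_supported())
```

The script only compares version numbers. It does not confirm that a given distribution has been verified with the product.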

Software installation package

cmaps

The cmaps folder holds CMap files.

CMap is short for character map. A CMap is not itself a font: it is a mapping, used by some fonts in PDF files, between the character codes of a given encoding and a predefined set of character identifiers. The CMap files in this folder are used mainly for Chinese/Japanese/Korean (CJK) text. These predefined CMaps are a standard part of the PDF specification. They are needed to properly process PDF documents that contain CJK language characters but do not include embedded Unicode conversion settings. Because of their large size, the character maps are stored in separate files, which are provided with Acrobat and other Adobe Systems products and are usually referenced by name from PDF files when needed.

SDK

The SDK folder holds the files needed to integrate the PDF Alchemist SDK into your own application. It contains PDFAlchemist.h, the header file that declares the PDF Alchemist API, and the link library for PDF Alchemist: PDFAlchemist.lib on Windows or PDFAlchemist.so on Linux. Link your program against this library unless you load the PDF Alchemist library dynamically at run time.

Linux installations also include a header file called PDFAlchemist_Version.h, which is used for version tracking.
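
The SDK description above mentions loading the PDF Alchemist library dynamically at run time instead of linking against it. The following is a minimal Python sketch of that idea using the standard ctypes module. The helper name load_pdf_alchemist and the example path are hypothetical, and no API functions are called, because the declarations in PDFAlchemist.h are not reproduced in this section.

```python
import ctypes
import os


def load_pdf_alchemist(library_path):
    """Try to load a shared library from `library_path`.

    Returns a ctypes library handle, or None if the file does not
    exist or the dynamic loader rejects it.
    """
    if not os.path.isfile(library_path):
        return None
    try:
        return ctypes.CDLL(library_path)
    except OSError:
        # Raised when the file is not a loadable library or when one
        # of its own dependencies cannot be resolved.
        return None


if __name__ == "__main__":
    # Hypothetical location; adjust to where the package was unpacked.
    path = os.path.join("PDF_Alchemist_Premium", "libPDFAlchemist.so")
    handle = load_pdf_alchemist(path)
    print("Library loaded" if handle is not None else "Library not available")
```

A program written in C or C++ would do the equivalent with the platform's dynamic loading facilities and then look up the functions declared in PDFAlchemist.h.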

tessdata

This folder holds test and language files to support the Optical Character Recognition (OCR) tool.

Output

The default output consists of two files and two folders. The generated HTML file is called page1.html, and the product also writes a cascading style sheet, stylesheet.css.

The folders named /fonts and /images contain extracted and generated fonts and images that are referenced by the output and CSS files. If you convert a PDF form file into an HTML form, PDF Alchemist will also write a style sheet file called AcroForm.css.

If your input PDF contains bookmarks, PDF Alchemist will write a file called bookmarks.html that holds a set of links from the bookmarks in the PDF document to the corresponding sections in the HTML output file.

When you use PDF Alchemist to generate EPUB output, all of the necessary files will be stored within the EPUB file written to the output directory.
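
As an illustration of the HTML output layout described above, the following Python sketch checks an output directory for the expected files and folders. It is not part of the product: the function name summarize_html_output and the example directory name are hypothetical, and the sketch assumes the default file names listed in this section.

```python
from pathlib import Path

# Names taken from the default HTML output described in this section.
REQUIRED_FILES = ("page1.html", "stylesheet.css")
REQUIRED_DIRS = ("fonts", "images")
# Written only for PDF forms or for PDFs that contain bookmarks.
OPTIONAL_FILES = ("AcroForm.css", "bookmarks.html")


def summarize_html_output(output_dir):
    """Report which expected items are missing from `output_dir`
    and which optional files are present."""
    root = Path(output_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    optional = [name for name in OPTIONAL_FILES if (root / name).is_file()]
    return {"missing": missing, "optional_present": optional}


if __name__ == "__main__":
    # Hypothetical output directory name.
    print(summarize_html_output("output"))
```

An empty "missing" list indicates that the default items are all in place. EPUB output is not covered, since its files are stored inside the EPUB file itself.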

Additional Linux dependencies

PDF Alchemist needs the following shared libraries at run time on Linux. These versions have been tested and are known to work with the product.

Dependency         Description
libpng15.so.15     PNG library
libz.so.1          zlib compression library
libxml2.so.2       XML library
libpthread.so.0    POSIX threads library
libGL.so.1         OpenGL library
libGLU.so.1        OpenGL utility library
libstdc++.so.6     GNU standard C++ library
libm.so.6          C math library
libgcc_s.so.1      GNU Compiler Collection (GCC) runtime library
libc.so.6          Standard C library
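
One way to confirm that these libraries are available on a Linux system is to ask the dynamic loader for each of them in turn. The following Python sketch does this with the standard ctypes module. The function name find_missing and its loader parameter are hypothetical, and the sketch assumes that a library which fails to load by name would also be unavailable to PDF Alchemist.

```python
import ctypes

# Shared libraries listed in the dependency table above.
DEPENDENCIES = [
    "libpng15.so.15",
    "libz.so.1",
    "libxml2.so.2",
    "libpthread.so.0",
    "libGL.so.1",
    "libGLU.so.1",
    "libstdc++.so.6",
    "libm.so.6",
    "libgcc_s.so.1",
    "libc.so.6",
]


def find_missing(sonames, loader=ctypes.CDLL):
    """Return the names in `sonames` that `loader` fails to load.

    `loader` defaults to ctypes.CDLL, which raises OSError when the
    dynamic loader cannot find or open a library.
    """
    missing = []
    for name in sonames:
        try:
            loader(name)
        except OSError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(DEPENDENCIES)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All listed libraries were found.")
```

The check confirms only that each library can be loaded by name. It does not compare the installed versions with the tested ones beyond the version number embedded in the file name.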