3.3.8. PDF Data Extraction

The PDF Data Extraction capabilities of ChronoScan let you bulk-index PDF files without user intervention.

ChronoScan can import data from your PDF files. You create a configuration that indexes each file with the fields you want, then export the data to TXT, CSV, Excel, Word, HTML, or OLE/ODBC databases.

An example: Larry receives around a hundred invoices every week from a provider and needs to extract the Invoice Number, Purchase Order, Order Date and Total Amount from each one. Unfortunately, his provider is in Japan and cannot provide Larry with this data directly. Larry has two options: enter the data manually or extract it with ChronoScan.

Here are the steps one by one:

1. Create a new Job Configuration with the Fields needed;

2. Import the PDF Files into a Batch using this Job;

3. Configure extraction areas on the first PDF in a new Document Type;

4. Apply the Document Type to all the other documents;

5. Export the data in your preferred format using the Execute Output Window;

6. Execute steps 2, 4 and 5 every week from a batch file with ChronoBulkMode.

You can refer to the following video:

http://youtu.be/XjGafp3vf34
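
Once the data has been exported (step 5), it can be checked before it is loaded into another system. The following is a minimal Python sketch, not part of ChronoScan, that assumes the export is a CSV file with a header row and that the columns are named after the four fields in Larry's example. The column names, the check_export function and the sample rows are all hypothetical; adapt them to the fields defined in your own Job Configuration.

```python
import csv
import io
from decimal import Decimal, InvalidOperation

# Hypothetical column names: replace them with the field names
# defined in your own Job Configuration.
REQUIRED_FIELDS = ["Invoice Number", "Purchase Order", "Order Date", "Total Amount"]


def check_export(csv_text):
    """Split exported CSV text into valid rows and a list of problems.

    Assumes the first line of the export is a header row.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    valid_rows, problems = [], []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        # A field counts as missing when it is absent or blank.
        missing = [f for f in REQUIRED_FIELDS if not (row.get(f) or "").strip()]
        if missing:
            problems.append((line_no, "missing: " + ", ".join(missing)))
            continue
        try:
            # Strip thousands separators before parsing the amount.
            amount = Decimal(row["Total Amount"].replace(",", ""))
        except InvalidOperation:
            problems.append((line_no, "bad amount: " + row["Total Amount"]))
            continue
        valid_rows.append({**row, "Total Amount": amount})
    return valid_rows, problems


# Made-up sample data, for illustration only.
sample = """Invoice Number,Purchase Order,Order Date,Total Amount
INV-001,PO-77,2014-03-01,"1,250.00"
INV-002,,2014-03-02,300.00
INV-003,PO-79,2014-03-03,n/a
"""

rows, problems = check_export(sample)
for line_no, message in problems:
    print(f"line {line_no}: {message}")
```

Rows reported as problems usually point to a document whose extraction areas did not match the Document Type, so they are good candidates for manual review.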