3.3.21. How to Improve OCR Results and Reliability
We cannot guarantee that every document will return accurate, reliable results in ChronoScan; too many factors and variables are involved. Our support team will do its best to help if you need it, but if you keep the guidelines in this article in mind and apply them to your documents wherever possible, you will be able to rely on ChronoScan's results. If you can trust your documents after reading this article, you can trust ChronoScan for the job.
If we are having trouble with the OCR results, there are two areas to look at: the quality of the original document and ChronoScan's image processing capabilities.
Image Quality
Let's first look at how to make the documents that are going to be processed more reliable for OCR.
The document should be scanned and imported at a resolution between 300 and 400 DPI;
Grayscale images work better than black and white;
If you are having trouble with grayscale you can try color, although color is usually only considered for exported documents such as PDF;
The document should have sharp text in a clear font, especially the text that is going to be captured;
Avoid watermarks and background images, which only confuse the OCR engine and require a lot more processing for less-than-perfect results;
Document layout also plays a big role: if it is in your power, make sure the design of the documents is optimized for OCR.
Basically, we want the best source document possible: sharp text, no background images or watermarks, and high contrast between text and background. This makes document and data capture and processing faster and much more reliable.
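The scanning guidelines above can be turned into a quick pre-flight check before documents are imported. The following is only a minimal sketch in Python: the function names check_scan_settings and pixel_size and the mode labels "bw", "grayscale" and "color" are hypothetical and are not part of ChronoScan. It simply encodes the 300 to 400 DPI range and the grayscale preference described in this article.

```python
from typing import List, Tuple

# Resolution range recommended in this article.
MIN_DPI = 300
MAX_DPI = 400


def check_scan_settings(dpi: int, color_mode: str) -> List[str]:
    """Return a list of warnings for scan settings outside the guidelines.

    color_mode is a hypothetical label: "bw", "grayscale" or "color".
    An empty list means the settings follow the guidelines.
    """
    warnings = []
    if dpi < MIN_DPI:
        warnings.append(f"{dpi} DPI is below the recommended {MIN_DPI} DPI")
    elif dpi > MAX_DPI:
        warnings.append(f"{dpi} DPI is above the recommended {MAX_DPI} DPI")

    mode = color_mode.lower()
    if mode == "bw":
        warnings.append("black and white: grayscale usually works better")
    elif mode == "color":
        warnings.append("color: try grayscale first")
    elif mode != "grayscale":
        warnings.append(f"unknown color mode: {color_mode}")
    return warnings


def pixel_size(width_in: float, height_in: float, dpi: int) -> Tuple[int, int]:
    """Pixel dimensions of a page of the given size (in inches) at a given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


if __name__ == "__main__":
    print(check_scan_settings(300, "grayscale"))  # → []
    print(check_scan_settings(200, "bw"))
    print(pixel_size(8.5, 11, 300))  # → (2550, 3300)
```

A US Letter page scanned at 300 DPI is therefore 2550 by 3300 pixels. Higher resolutions increase the image size accordingly, and larger images generally take longer to process.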
Image Processing
ChronoScan has a good collection of image processing tools to aid the data capture process. The downside of using them is that overall document processing time increases and more processing power is required. If some image processing is known to improve the OCR results for a specific set of documents, it can be applied at processing time.
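To illustrate the trade-off between image processing and processing time, here is a minimal Python sketch of one common kind of image processing step, a linear contrast stretch on grayscale pixel values, together with a helper that measures how long the step takes. This is not ChronoScan's implementation. The names stretch_contrast and timed are hypothetical, and the image is assumed to be a plain list of rows of 0 to 255 values.

```python
import time
from typing import Callable, List, Tuple

Image = List[List[int]]  # rows of grayscale values, 0 (black) to 255 (white)


def stretch_contrast(pixels: Image) -> Image:
    """Linearly rescale pixel values so the darkest becomes 0 and the lightest 255."""
    flat = [p for row in pixels for p in row]
    if not flat or min(flat) == max(flat):
        # Empty or completely flat image: nothing to stretch.
        return [row[:] for row in pixels]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    return [[(p - lo) * 255 // span for p in row] for row in pixels]


def timed(step: Callable[[Image], Image], image: Image) -> Tuple[Image, float]:
    """Run one processing step and return its result and elapsed seconds."""
    start = time.perf_counter()
    result = step(image)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    faded = [[100, 150], [120, 200]]  # low-contrast sample
    stretched, seconds = timed(stretch_contrast, faded)
    print(stretched)  # → [[0, 127], [51, 255]]
    print(f"extra processing time: {seconds:.6f} s")
```

Timing a step like this on a representative batch, and comparing the OCR results with and without it, is one way to decide whether the extra processing is worth applying to a given set of documents.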