OCR Performance Guide


Excellent performance of the OCR component is one of the key factors for high customer satisfaction. This chapter provides information on general OCR performance factors and the possibilities to optimize them.

When measuring OCR performance, there are two major parameters to consider:


  • RECOGNITION ACCURACY
  • PROCESSING SPEED



Image Type and Image Quality


Images can come from different sources: digitally created PDFs, screenshots from computers and tablets, and image files created by scanners, fax servers, digital cameras or smartphones. Different image sources produce different image types with different levels of image quality. For example, using the wrong scanner settings can cause “noise” on the image, such as random black dots or speckles, blurred and uneven letters, or skewed lines and shifted table borders. In terms of OCR, this is a ‘low-quality image’. Processing low-quality images requires more computing power, increases the overall processing time and degrades the recognition results.


Image quality = Key factor for OCR performance


On the other hand, processing ‘high-quality images’ without distortions reduces the processing time and leads to more accurate recognition results. Therefore, it is recommended to use high-quality images for the OCR process.

Increase OCR speed and accuracy by enhancing the image quality.



How to Get High-Quality Images


TIPS FOR DOCUMENT SCANNING

  • Font Size
    • Documents printed in very small fonts should be scanned at higher resolutions. Use the following resolutions for scanning:
      • 300 dpi for typical texts (printed in font sizes of 10 pt or larger)
      • 400-600 dpi for texts in small fonts (font sizes of 9 pt or smaller)
  • Print Quality
    • Poor-quality documents, such as old newspapers or books, should be scanned in grayscale mode. This mode retains more information about the letters in the scanned text.
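
The resolution guidance above can be expressed as a small lookup. The following is a minimal Python sketch, not part of OptimiDoc Server: the function name `recommended_scan_dpi` is hypothetical, and treating font sizes between 9 pt and 10 pt as small fonts is an assumption, since the guidance does not cover that range.

```python
def recommended_scan_dpi(font_size_pt):
    """Return the (min_dpi, max_dpi) scanning range for a font size in points."""
    if font_size_pt <= 0:
        raise ValueError("font size must be positive")
    if font_size_pt >= 10:
        # Typical texts: 300 dpi is sufficient.
        return (300, 300)
    # Small fonts (assumption: anything below 10 pt): scan at 400-600 dpi.
    return (400, 600)


low, high = recommended_scan_dpi(8)
print(f"Scan at {low}-{high} dpi")
```

Returning a range rather than a single value keeps the choice within 400-600 dpi open, for example to trade file size against legibility.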


TIPS FOR TAKING PHOTOS

  • Correct Lighting
    • Make sure that lighting is evenly distributed across the page and that there are no dark areas or shadows.
    • If possible, use a tripod. Position the lens parallel to the plane of the document and point it toward the center of the text.
    • Turn off the flash to avoid glare and sharp shadows on the page.
    • If the camera has a “White Balance” option, use a white sheet of paper to set the white balance. Otherwise, select the white balance mode that best suits the current lighting conditions.
  • If There is Not Enough Light …
    • Select a wider aperture (a lower f-number)
    • Select a greater ISO value for sensitivity
    • Use manual focusing if the camera cannot lock the focus automatically


Image Quality Enhancement with OptimiDoc Server


If it is not possible to influence the image quality in advance, it is recommended to enhance it prior to the recognition step. In OptimiDoc Server, various powerful image preprocessing functions are available:


ADVANCED IMAGE PROCESSING FUNCTIONS


  • ABBYY Camera OCR technology
  • Auto-splitting of double-pages
  • Removal of stamps and written notes
  • Automated image de-skewing
  • Autodetection of page orientation and rotation
  • Image despeckling



Predefined Recognition Modes


Another way to influence the OCR performance is to use recognition modes designed for particular scenarios. OptimiDoc Server provides the following predefined recognition modes:


NORMAL

Using this mode, you will achieve the highest recognition accuracy.

This mode is highly recommended when the recognized content is going to be reused in other applications, or for tasks where high accuracy is critically important.


FAST

Using this mode increases the processing speed up to 200-250%.

This mode is recommended when processing speed is of primary importance, such as in high-volume document processing for archiving, content management and document management systems.



Document Languages


OptimiDoc Server is capable of recognizing both monolingual and multilingual (i.e. written in several languages) documents. It is very important to specify the correct recognition language, since an incorrectly specified language can significantly slow down the document processing and decrease the recognition quality. If the recognition language cannot be specified in advance, it is possible to use automatic language detection.

However, a high number of preselected recognition languages will reduce the processing speed. Therefore, it is not recommended to specify more than five recognition languages.
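
The five-language recommendation can be enforced before a job is submitted. The following is a minimal Python sketch under stated assumptions: the function name `check_language_selection` and the use of plain language names are hypothetical and do not reflect the OptimiDoc Server configuration format.

```python
MAX_RECOMMENDED_LANGUAGES = 5  # limit recommended in this guide


def check_language_selection(languages):
    """Return (unique_languages, warning); warning is None if the list is fine."""
    # Strip whitespace, drop empty entries, and remove duplicates in order.
    cleaned = (lang.strip() for lang in languages)
    unique = list(dict.fromkeys(lang for lang in cleaned if lang))
    warning = None
    if len(unique) > MAX_RECOMMENDED_LANGUAGES:
        warning = (
            f"{len(unique)} recognition languages selected; more than "
            f"{MAX_RECOMMENDED_LANGUAGES} is likely to reduce processing speed"
        )
    return unique, warning


selected, warning = check_language_selection(["English", "German", "English"])
print(selected, warning)
```

An empty result is not treated as an error here, on the assumption that it corresponds to automatic language detection.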


OptimiDoc Server OCR - Speed Testing Results


The table presents the results of internal performance testing. Please be aware that test results always depend on many factors, such as image quality and the recognition languages used.


SCENARIO                    PAGES/MINUTE

One-page documents          165
One multi-page document     10



Technical Test Information


Intel® Core™ i5-4440 (3.10 GHz, 4 physical cores), 8 GB RAM, 4 processes running simultaneously.

The performance was tested on 300 documents in English, using the ‘DocumentArchiving_Speed’ predefined profile. In the scenarios “One-page documents” and “One multi-page document”, the documents were exported to PDF.

* The text was extracted from pre-defined areas on one-page documents. No export to any file format was performed.
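
The measured throughput can serve as a rough basis for capacity planning. The following is a minimal Python sketch that assumes throughput stays constant across a batch; the function name `estimated_minutes` is hypothetical, and 165 pages/minute is the one-page figure reported above for the test system, so real values will differ with hardware, image quality and languages.

```python
def estimated_minutes(page_count, pages_per_minute):
    """Estimate processing time in minutes at a constant throughput."""
    if page_count < 0:
        raise ValueError("page count must not be negative")
    if pages_per_minute <= 0:
        raise ValueError("throughput must be positive")
    return page_count / pages_per_minute


# Example: 10,000 one-page documents at the measured 165 pages/minute.
minutes = estimated_minutes(10_000, 165)
print(f"{minutes:.1f} minutes")  # prints "60.6 minutes"
```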



System Resources


During the OCR process, a range of different algorithms is applied, depending on image quality, document languages, layout complexity and the number of pages in the document. Some of these algorithms might require more memory. To optimize the processing speed, it is recommended to set up the system with adequate memory, in accordance with the requirements outlined below.


MEMORY REQUIREMENTS


  • Processing multi-page documents
    • Minimum: 1 GB RAM
    • Recommended: 1.5 GB RAM
  • Parallel processing
    • 350 MB RAM x number of CPU cores + additional 450 MB RAM
  • Parallel processing of documents in Arabic, Chinese, Japanese or Korean
    • 850 MB RAM x number of CPU cores + additional 750 MB RAM
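
The parallel-processing formulas above can be evaluated for a given machine as follows. This is a minimal Python sketch; the function name `parallel_memory_mb` and its parameters are hypothetical, and the result covers only the OCR memory requirement stated in this guide, not the operating system or other applications.

```python
def parallel_memory_mb(cpu_cores, complex_script=False):
    """Return the memory requirement in MB for parallel OCR processing.

    Set complex_script=True for documents in Arabic, Chinese, Japanese
    or Korean, which use the higher per-core and base figures.
    """
    if cpu_cores < 1:
        raise ValueError("at least one CPU core is required")
    per_core_mb, base_mb = (850, 750) if complex_script else (350, 450)
    return per_core_mb * cpu_cores + base_mb


# Example: a 4-core machine such as the test system described earlier.
print(parallel_memory_mb(4))                       # prints 1850
print(parallel_memory_mb(4, complex_script=True))  # prints 4150
```

For a 4-core system this gives 350 x 4 + 450 = 1850 MB for most languages, and 850 x 4 + 750 = 4150 MB for Arabic, Chinese, Japanese or Korean documents.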