Optical Character Recognition (OCR)

What is optical character recognition?

Optical character recognition is the process of using technology to distinguish printed or handwritten text characters inside digital images of physical documents, like scanned paper documents. Optical character recognition (OCR) is also called text recognition. While OCR can recognize words in the image being scanned, it cannot derive the meaning of those words.

It involves examining the text of a document and translating the characters into code that can later be used for the purpose of data processing.

An optical character recognition system is made up of both hardware and software that together convert physical documents into machine-readable text. Hardware such as optical scanners or specialized circuit boards copies or reads the text, while the software handles the advanced processing.

The software used could also employ artificial intelligence to implement far more advanced methods of intelligent character recognition (ICR), such as identifying languages or styles of handwriting.

Optical character recognition is typically used for the purpose of turning hard copy legal documents or historical documents into PDFs. After that is done, users are able to edit, format, and search the document as though it was created with a word processor.

What is optical character recognition used for?

Businesses also use optical character recognition to automate data extraction: the system pulls printed or written text out of a scanned document or image file and converts it into a machine-readable form that can be edited, searched, or processed further.

It is also used for indexing print material for search engines and for converting documents into text that can be read aloud to visually impaired users.

Another important use for OCR is depositing checks electronically without any need for a bank teller. It can also be used to automatically recognize text like license numbers from the pictures of license plates captured by road traffic cameras.

Optical character recognition is also used to translate words from an image into other languages. It is also used for sorting letters for mail delivery.

Our client, Mall of the Emirates, used OCR technology in a rather interesting manner. We integrated OCR technology into their Engati chatbot, allowing them to use it to simplify their loyalty program.

When customers made purchases, they could take pictures of their bill on their mobile phones and send them to the chatbot. The chatbot would use OCR to read the contents of the bill and assign loyalty points directly in the conversation, so customers did not need to download the app or visit the website, which reduced the effort involved.

How does optical character recognition work?

Optical character recognition starts with processing the physical form of the document with a scanner. After all the pages are copied, the document is converted into a black & white or two-color version.

The scanned-in image or bitmap will then be analyzed for light and dark areas. The dark areas are classified as characters while the light areas are classified as background.
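
To make the light/dark split concrete, here is a minimal sketch in Python. It assumes a tiny made-up grayscale bitmap and a fixed cutoff of 128, both chosen only for illustration; real OCR systems often pick the threshold adaptively instead.

```python
# Hypothetical 4x5 grayscale bitmap: 0 is black ink, 255 is white paper.
GRAY = [
    [250, 30, 245, 20, 240],
    [248, 25, 35, 28, 243],
    [251, 22, 247, 31, 239],
    [249, 27, 244, 26, 242],
]


def binarize(image, threshold=128):
    """Return a grid where 1 marks a dark (ink) pixel and 0 marks background.

    The fixed threshold is an assumption made for this example.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


binary = binarize(GRAY)

# Show the result: '#' for ink, '.' for background.
for row in binary:
    print("".join("#" if p else "." for p in row))
```

Printed as text, the dark pixels in this toy bitmap form a rough letter "H", which is the kind of shape the later recognition steps work on.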

After that, the OCR system analyses and processes the dark areas further to identify characters or numbers. There are multiple techniques that can be used for this purpose, but they generally tend to involve analyzing a single character, word, or block of text at a time.
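
One simple way to isolate single characters, sketched below, is to look for columns that contain no ink and treat them as gaps. This is only an illustration under the assumption that characters on a line do not touch; the function name `split_characters` and the sample data are made up for the example.

```python
def split_characters(binary):
    """Split a binarized text line into per-character column ranges.

    A column with no ink is treated as a gap between characters.
    Returns a list of (start, end) column indices, with end exclusive.
    """
    width = len(binary[0])
    ink_per_column = [sum(row[x] for row in binary) for x in range(width)]

    spans, start = [], None
    for x, ink in enumerate(ink_per_column):
        if ink and start is None:
            start = x                  # a character begins here
        elif not ink and start is not None:
            spans.append((start, x))   # the character ended at the gap
            start = None
    if start is not None:              # character runs to the right edge
        spans.append((start, width))
    return spans


# Hypothetical binarized line holding two glyphs separated by empty columns.
LINE = [
    [1, 0, 1, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0, 1, 0],
    [1, 0, 1, 0, 0, 0, 1, 0],
]
print(split_characters(LINE))  # [(0, 3), (5, 8)]
```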

The system then identifies characters by employing either of the following algorithms:

Pattern recognition

This involves feeding the program samples of text in a range of fonts and formats, then comparing the characters in the image against those samples to identify them.
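
As a rough sketch of the idea, the Python snippet below compares a glyph against a few stored samples pixel by pixel and picks the closest one. The three 3x5 templates and the similarity measure are assumptions made for this example; a real system would store many more samples per character and font.

```python
# Hypothetical stored samples: '#' is ink, '.' is background.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def similarity(glyph, template):
    """Fraction of pixels on which the glyph and the template agree."""
    pairs = [
        (g, t)
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    ]
    return sum(g == t for g, t in pairs) / len(pairs)


def recognize(glyph):
    """Return the best-matching label and its similarity score."""
    best = max(TEMPLATES, key=lambda label: similarity(glyph, TEMPLATES[label]))
    return best, similarity(glyph, TEMPLATES[best])


# A noisy "T" with one stray ink pixel in the middle row.
noisy_t = ["###", ".#.", ".##", ".#.", ".#."]
label, score = recognize(noisy_t)
print(label, round(score, 2))
```

Because the match is scored rather than exact, the noisy glyph is still labeled "T" even though one pixel differs from the stored sample.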

Feature detection

The program uses rules that apply to the features of specific characters or numbers to identify them in the image that is being processed. These features can be the number of curves, angled lines, etc. in the character. For example, the letter ‘V’ could be stored as two diagonal lines that meet at the bottom.
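
The 'V' rule above can be sketched in a few lines of Python. The example assumes that an earlier step (not shown) has already reduced each character to straight strokes given as pairs of endpoints; the stroke format, the `classify` function, and the extra 'L' rule are all hypothetical.

```python
def orientation(stroke):
    """Classify a stroke ((x1, y1), (x2, y2)) by its direction."""
    (x1, y1), (x2, y2) = stroke
    if x1 == x2:
        return "vertical"
    if y1 == y2:
        return "horizontal"
    return "diagonal"


def lowest_point(stroke):
    """Image coordinates: y grows downward, so the lowest point has the largest y."""
    return max(stroke, key=lambda point: point[1])


def classify(strokes):
    """Apply hand-written feature rules; return '?' when no rule matches."""
    kinds = sorted(orientation(s) for s in strokes)
    if kinds == ["diagonal", "diagonal"]:
        # Rule for 'V': two diagonal lines that meet at the bottom.
        if lowest_point(strokes[0]) == lowest_point(strokes[1]):
            return "V"
    if kinds == ["horizontal", "vertical"]:
        # Rule for 'L': the vertical line ends where the horizontal line starts.
        vertical = next(s for s in strokes if orientation(s) == "vertical")
        horizontal = next(s for s in strokes if orientation(s) == "horizontal")
        if lowest_point(vertical) == min(horizontal, key=lambda p: p[0]):
            return "L"
    return "?"


v_strokes = [((0, 0), (2, 4)), ((4, 0), (2, 4))]
l_strokes = [((0, 0), (0, 4)), ((0, 4), (3, 4))]
caret_strokes = [((2, 0), (0, 4)), ((2, 0), (4, 4))]  # diagonals meeting at the top

print(classify(v_strokes), classify(l_strokes), classify(caret_strokes))
```

Unlike the sample-matching approach, nothing here depends on the font: any two diagonals that join at the bottom count as a 'V', whatever their size.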

After characters are identified, the OCR system converts them into ASCII codes which computer systems can use to handle further manipulations. Users need to correct basic errors, proofread the document, and ensure that complex layouts are handled correctly before saving the document for future use.
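
For illustration, this is what the conversion to character codes looks like in Python. The `recognized` list stands in for the output of the recognition step and is made up for the example.

```python
# Hypothetical output of the recognition step: one label per character.
recognized = ["O", "C", "R"]

# Each character maps to a numeric code that other software can store and process.
codes = [ord(ch) for ch in recognized]
print(codes)  # [79, 67, 82]

# The codes can be turned back into ordinary, searchable text.
text = bytes(codes).decode("ascii")
print(text)  # OCR
```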

What are the types of optical character recognition?

There are two types of OCR: software-based OCR (also called PC-based OCR) and machine-based OCR (or inline OCR). The core algorithms are very similar, but the two are used on different types of text and are tuned in rather different ways. Technically, both technologies are software-based; the difference boils down to when the OCR occurs.

Inline OCR is carried out at scan time and is usually performed on objects moving down an assembly line rather than on documents. It is predominantly used in mail-room processing on high-speed, high-volume scanners, or on manufacturing assembly lines.

PC-based OCR can work on the widest range of document types and has the benefit of scalability. The downside of PC-based OCR is that it's not as fast as inline OCR.

How accurate is optical character recognition?

At this point, OCR tools can exceed 99% accuracy on typewritten text. Even so, companies still rely on human intervention to detect errors, so higher levels of accuracy are still desired.
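
Accuracy figures like this are commonly computed at the character level by comparing the OCR output with a hand-checked reference. The sketch below shows one such calculation using edit distance; it is an assumed way of measuring accuracy for illustration, and the sample strings are made up.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def character_accuracy(reference, ocr_output):
    """Share of reference characters recovered, based on edit distance."""
    return 1 - edit_distance(reference, ocr_output) / len(reference)


reference = "optical character recognition"
ocr_output = "optica1 character recogniti0n"  # 'l' read as '1', 'o' read as '0'

errors = edit_distance(reference, ocr_output)
accuracy = character_accuracy(reference, ocr_output)
print(errors, f"{accuracy:.1%}")
```

Two wrong characters out of 29 already pull the score down to roughly 93%, which helps explain why error checking by people is still common even with tools that score above 99%.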

The main areas on which OCR research is focused right now are handwriting recognition and cursive text recognition.

What are the advantages of optical character recognition?

Here are the most significant advantages of optical character recognition:

OCR information can be processed very fast. Large quantities of text can be fed to the system quickly.
Paper-based information is converted into an electronic form. This makes it easier to store or send by email.
It is more affordable than paying someone to manually enter the text data in your system.
It is faster than manually entering text information.
Advanced versions of OCR can recreate tables and columns, and can even reproduce sites.
