Google Cloud Vision OCR misses single numbers and symbols

I am using the Google Cloud Vision API to detect text in receipts. In some cases not all text on the receipt is detected; mostly short numbers, symbols and words are missed.

An example of this problem can be found here: a Dutch receipt that was processed with the "Try the API" interface. As the image shows, not all text is detected.

The image follows the best-practice guidelines set out in the documentation.

Is there a way to improve the image or to configure the API so that all text and symbols are detected? Any hints or help are much appreciated.

There is 1 best solution below

This is one of the disadvantages of Google's OCR: it misses single characters and symbols quite often. You might get more single letters and symbols if you use the feature type "TEXT_DETECTION" instead of "DOCUMENT_TEXT_DETECTION". But there is no guarantee that every single letter will be detected.
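
To compare the two modes on the same receipt, you can send the image once with each feature type and diff the results. Here is a minimal sketch that only builds the JSON body for the REST method images:annotate. The helper name build_annotate_request is made up for illustration, and the sketch leaves out authentication and the HTTP call itself.

```python
import base64
import json

# Feature type names accepted in the "features" list of an images:annotate request.
TEXT_DETECTION = "TEXT_DETECTION"
DOCUMENT_TEXT_DETECTION = "DOCUMENT_TEXT_DETECTION"


def build_annotate_request(image_bytes, feature_type=TEXT_DETECTION):
    """Build the JSON body for POST https://vision.googleapis.com/v1/images:annotate.

    Hypothetical helper for illustration; not part of any Google client library.
    """
    if feature_type not in (TEXT_DETECTION, DOCUMENT_TEXT_DETECTION):
        raise ValueError(f"unsupported feature type: {feature_type}")
    return {
        "requests": [
            {
                # The image is sent inline as base64-encoded bytes.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                # Switch the type here to compare the two OCR modes.
                "features": [{"type": feature_type}],
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes; in practice, read the receipt image from disk.
    body = build_annotate_request(b"placeholder image bytes", TEXT_DETECTION)
    print(json.dumps(body, indent=2))
```

Building one body per feature type and comparing the returned text annotations should show whether TEXT_DETECTION actually picks up more of the short tokens on your receipts.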

BTW: the ABBYY Cloud OCR API is better at this, but much more expensive.