Google Vision API Text extraction data accuracy (DOCUMENT_TEXT_DETECTION)


I am using the Java SDK for the GCP Vision API for OCR (text extraction) and have moved from the TEXT_DETECTION feature to DOCUMENT_TEXT_DETECTION. The image I tested contains the name “Mohan D”.

  • TEXT_DETECTION: the text comes back correctly, but another character is not returned.
  • DOCUMENT_TEXT_DETECTION: the name comes back as “MOHAND” (the space is missing).

Can you please suggest whether I need to use a specific option to get better accuracy?


There is 1 best solution below


The models behind the Cloud Vision API are continually being improved to provide better recognition accuracy; however, they sometimes get characters wrong or fail to recognize them at all. Keep in mind that these services are retrained on a regular basis, so recognition quality should improve over time.

Based on this, I think the available workaround is to follow the recommendations for file format, image size, and supported languages, and to set the languageHints property. It is typically used when the service has difficulty detecting the language in the image, and it may help improve the accuracy of the OCR results.
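
As a rough illustration of the language-hint suggestion, here is a minimal sketch that builds the JSON body for the REST endpoint `POST https://vision.googleapis.com/v1/images:annotate` using only the Java standard library. The class `LanguageHintRequest` and the method `buildRequestJson` are hypothetical names made up for this example, and the hint "en" is an assumption about the image's language; authentication and the HTTP call itself are left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

// Hypothetical helper, not part of the Vision SDK: it only assembles the
// request body for images:annotate so the languageHints placement is visible.
class LanguageHintRequest {

    // featureType is e.g. "TEXT_DETECTION" or "DOCUMENT_TEXT_DETECTION".
    // languageHints are plain language codes such as "en"; they are inserted
    // without JSON escaping, so pass simple codes only.
    static String buildRequestJson(byte[] imageBytes, String featureType, List<String> languageHints) {
        // The REST API expects the image bytes as a base64 string in image.content.
        String content = Base64.getEncoder().encodeToString(imageBytes);

        StringBuilder hints = new StringBuilder();
        for (String hint : languageHints) {
            if (hints.length() > 0) {
                hints.append(",");
            }
            hints.append('"').append(hint).append('"');
        }

        return "{\"requests\":[{"
            + "\"image\":{\"content\":\"" + content + "\"},"
            + "\"features\":[{\"type\":\"" + featureType + "\"}],"
            + "\"imageContext\":{\"languageHints\":[" + hints + "]}"
            + "}]}";
    }

    public static void main(String[] args) {
        // Placeholder bytes; in practice read the real image file here.
        byte[] placeholder = "not a real image".getBytes(StandardCharsets.UTF_8);
        System.out.println(buildRequestJson(placeholder, "DOCUMENT_TEXT_DETECTION", Arrays.asList("en")));
    }
}
```

With the Java SDK, the equivalent is to build an `ImageContext` with `ImageContext.newBuilder().addLanguageHints("en").build()` and attach it to the `AnnotateImageRequest` through `setImageContext(...)`. A hint is not guaranteed to restore the missing space in “Mohan D”, so it is worth comparing the output with and without it on the same image.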