Tesseract is not finding text in simple handwriting test. Is there any way to fix this?

37 Views Asked by Dov At 17 August 2025 at 18:22

I am trying to put together a better solution for automated grading of paper tests. The task is to extract rectangular areas from a scanned test and run OCR on the handwritten input. While handwriting is obviously challenging, this problem is significantly simpler than general handwriting recognition:

The text orientation is known.
I can specify exactly what answers I am expecting, and/or the set of characters that are legal.
I would be willing to get a probability from the engine and, if the probability is too low, call in a human to adjudicate (preferably not).
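
As a rough sketch of the triage described in these three points, the following assumes the OCR engine returns a text string plus a confidence score on a 0 to 100 scale. The function name `triage`, the `min_conf` threshold of 80, and the sample values are hypothetical and only illustrate the idea.

```python
def triage(ocr_text, ocr_conf, legal_chars, min_conf=80.0):
    """Return (text, needs_human) for one answer field.

    ocr_conf is assumed to be a 0-100 confidence reported by the engine;
    min_conf is an arbitrary illustrative threshold, not a recommended value.
    """
    text = ocr_text.strip()
    # Any character outside the legal set is treated as a recognition failure.
    has_illegal = any(ch not in legal_chars for ch in text)
    needs_human = (not text) or has_illegal or ocr_conf < min_conf
    return text, needs_human


DIGITS = "0123456789"
print(triage("64\n", 91.0, DIGITS))  # clean, confident reading
print(triage("6A", 95.0, DIGITS))    # contains a character outside the legal set
print(triage("64", 40.0, DIGITS))    # confidence below the threshold
```

Only fields flagged with `needs_human` would go to a person; everything else could be graded automatically against the expected answers.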

Tesseract claims to work on handwriting and runs on both Linux and Windows (via MinGW), so it seemed like a good fit.

I extracted a sample of handwritten data from a form. Here is the sample:

In this case, the bounds of the rectangle have not been cropped out, but I expected that it would be able to find my 64. It failed.

When I cropped the bounding box, it worked.

While in this case, I can solve the problem, I wanted to know whether there is anything I can do to improve recognition, because the bounding box seemed innocuous, and I am worried that any trivial noise could ruin detection.
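
One way to make the input less sensitive to the box is to erase long straight lines before OCR. The sketch below is a pure-Python illustration that assumes the field has already been binarized into a grid of 0 (background) and 1 (ink). The function name `remove_box_lines` and the `line_fraction` threshold are hypothetical, and handwriting strokes that cross a box edge would lose the pixels on that edge.

```python
def remove_box_lines(binary, line_fraction=0.8):
    """Blank out rows and columns that are mostly ink (likely form-box edges).

    binary is a list of equal-length rows of 0/1 values.
    line_fraction is an illustrative threshold: a row or column counts as a
    box line when at least this fraction of its pixels is ink.
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    line_rows = {y for y in range(height)
                 if sum(binary[y]) >= line_fraction * width}
    line_cols = {x for x in range(width)
                 if sum(binary[y][x] for y in range(height)) >= line_fraction * height}
    return [[0 if (y in line_rows or x in line_cols) else binary[y][x]
             for x in range(width)]
            for y in range(height)]


# A 7x7 field: a one-pixel border (the box) and a single ink pixel in the middle.
grid = [[1 if x in (0, 6) or y in (0, 6) else 0 for x in range(7)] for y in range(7)]
grid[3][3] = 1
cleaned = remove_box_lines(grid)
```

On a real scan the box edges are rarely perfectly axis-aligned or one pixel wide, so this only shows the principle; a morphological line filter in an image library would be the more robust version of the same idea.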

Is there a better open source package I could use?
Is there a way to improve the training for my application? I think I could create a "language" for single letters and a different language for integers, then load multiple Tesseract engines, each specialized for one kind of question.
Is there a way in the internal API to give it a list of the potential strings or the character set, i.e. hinting, to improve accuracy?
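
For the hinting question, a possible starting point is Tesseract's `tessedit_char_whitelist` variable combined with a page segmentation mode for a single line (`--psm 7`) or a single character (`--psm 10`). The sketch below uses the `pytesseract` Python wrapper rather than the C++ API, assumes the legal characters contain no whitespace, and assumes a Tesseract version that honors the whitelist. The helper names `build_config` and `read_field` are hypothetical.

```python
def build_config(legal_chars, single_char=False):
    """Build a Tesseract config string restricting output to legal_chars.

    Assumes legal_chars has no whitespace, since the config is space-separated.
    """
    psm = 10 if single_char else 7  # 10 = single character, 7 = single text line
    return f"--psm {psm} -c tessedit_char_whitelist={legal_chars}"


def read_field(image, legal_chars, single_char=False):
    """Return (text, lowest word confidence) for one cropped answer field."""
    # Imported lazily so build_config can be used without Tesseract installed.
    import pytesseract
    from pytesseract import Output

    data = pytesseract.image_to_data(
        image,
        config=build_config(legal_chars, single_char),
        output_type=Output.DICT,
    )
    # Entries with confidence -1 are layout blocks, not recognized words.
    words = [(t, float(c)) for t, c in zip(data["text"], data["conf"])
             if t.strip() and float(c) >= 0]
    text = " ".join(t for t, _ in words)
    conf = min((c for _, c in words), default=0.0)
    return text, conf


print(build_config("0123456789"))
```

The confidence returned here could feed the low-probability check for human adjudication. In the C++ API the same variable can be set with `SetVariable` on the `TessBaseAPI` object before recognition.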

