Tesseract training error: Bad box coordinates in boxfile string


I have prepared the following ground truth files:

../tesstrain/data/Chechen-ground-truth
|-- 1.box
|-- 1.gt.txt
|-- 1.png
|-- 10.box
|-- 10.gt.txt
|-- 10.png
|-- 11.box
|-- 11.gt.txt
|-- 11.png
|-- 12.box
|-- 12.gt.txt
|-- 12.png

The box files use the WordStr format. For example, here is the content of 1.box:

WordStr 65 61 1556 254  0   #НЕКЪАШ А
    65 61 1556 254  0
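
For comparison, here is a minimal Python sketch of how a two-line WordStr entry is commonly laid out: a WordStr line carrying the transcription after #, followed by an end-of-line entry. The function name wordstr_box is hypothetical. Two details are assumptions, not something the error output confirms: that the second entry begins with a literal tab character, and that its box is a narrow strip just to the right of the text line instead of a copy of the text box.

```python
def wordstr_box(text: str, left: int, bottom: int, right: int, top: int,
                page: int = 0) -> str:
    """Return a two-line WordStr box entry for one text line.

    Coordinates use a bottom-left origin, as Tesseract box files do.
    """
    # First line: the box of the whole text line, transcription after '#'.
    line = f"WordStr {left} {bottom} {right} {top} {page} #{text}"
    # Second line: end-of-line marker. ASSUMPTION: it starts with a literal
    # tab character and describes a narrow box just right of the text line.
    eol = f"\t {right + 1} {bottom} {right + 5} {top} {page}"
    return line + "\n" + eol + "\n"


if __name__ == "__main__":
    # Values taken from 1.box above; repr() makes the tab visible as '\t'.
    print(repr(wordstr_box("НЕКЪАШ А", 65, 61, 1556, 254)))
```

Writing the result with an explicit UTF-8 encoding, and checking in an editor that the tab has not been converted to spaces, keeps the file byte-for-byte as generated.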

In the file 1.gt.txt I then have the corresponding text:

НЕКЪАШ А

And here is the image:

[image: 1.png]

Running the command make training MODEL_NAME=Chechen START_MODEL=rus TESSDATA=../tesseract/tessdata gives me an error:

set -x; \
tesseract "data/Chechen-ground-truth/1.png" data/Chechen-ground-truth/1 --psm 13 lstm.train
+ tesseract data/Chechen-ground-truth/1.png data/Chechen-ground-truth/1 --psm 13 lstm.train
Bad box coordinates in boxfile string!  65 61 1556 254  0
No block overlapping textline: НЕКЪАШ А
Failed to read pages from data/Chechen-ground-truth/1.png
Error during processing.
make: *** [Makefile:258: data/Chechen-ground-truth/1.lstmf] Error 1
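
Since the message points at the second line of the box file, a quick sanity check of each line can help narrow things down. The sketch below is a hypothetical helper (check_box_line is not part of Tesseract or tesstrain), and its rules are assumptions about what a well-formed WordStr file looks like, not a reproduction of Tesseract's own parser: five integer fields, a non-empty box inside the image, and a line that starts either with WordStr or with a tab. The image size used in the example call is made up; substitute the real dimensions of 1.png.

```python
import re


def check_box_line(line: str, width: int, height: int) -> list[str]:
    """Return a list of problems found in one WordStr box-file line."""
    # Drop the '#transcription' suffix that WordStr lines carry.
    body = line.split("#", 1)[0]
    # The last five integers are left, bottom, right, top, page.
    nums = re.findall(r"-?\d+", body)
    if len(nums) < 5:
        return [f"expected 5 integers, found {len(nums)}"]
    left, bottom, right, top, _page = map(int, nums[-5:])

    problems = []
    # ASSUMPTION: the end-of-line entry must begin with a literal tab.
    if not line.startswith(("WordStr ", "\t")):
        problems.append("line starts with neither 'WordStr ' nor a tab")
    if left >= right or bottom >= top:
        problems.append("empty or inverted box")
    if left < 0 or bottom < 0 or right > width or top > height:
        problems.append("box extends outside the image")
    return problems


if __name__ == "__main__":
    # Hypothetical image size of 1600x300 pixels.
    for raw in ["WordStr 65 61 1556 254  0   #НЕКЪАШ А",
                "    65 61 1556 254  0"]:
        print(repr(raw), check_box_line(raw, 1600, 300))
```

Under these assumptions the first line of 1.box passes, while a second line indented with spaces instead of a tab is flagged. Whether that is what Tesseract actually objects to would still need to be confirmed against its box-file documentation.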

I am using Tesseract v5.3.0, built from source by following the instructions at https://youtu.be/veJt3U44yqc
