I have prepared the following ground truth files:
../tesstrain/data/Chechen-ground-truth
|-- 1.box
|-- 1.gt.txt
|-- 1.png
|-- 10.box
|-- 10.gt.txt
|-- 10.png
|-- 11.box
|-- 11.gt.txt
|-- 11.png
|-- 12.box
|-- 12.gt.txt
|-- 12.png
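To rule out an incomplete sample, a check along these lines could confirm that every base name has all three files. This is only a minimal sketch: the function name missing_files and the list of required suffixes are assumptions taken from the listing above, not part of tesstrain.

```python
from pathlib import Path

# Suffixes each sample is assumed to need, based on the listing above.
REQUIRED_SUFFIXES = (".box", ".gt.txt", ".png")


def missing_files(directory):
    """Return {stem: [missing suffixes]} for samples that lack a file.

    Hypothetical helper for illustration; not part of tesstrain.
    """
    names = {p.name for p in Path(directory).iterdir() if p.is_file()}

    # Collect every base name that appears with at least one known suffix.
    stems = set()
    for name in names:
        for suffix in REQUIRED_SUFFIXES:
            if name.endswith(suffix):
                stems.add(name[: -len(suffix)])

    # Report the suffixes that are absent for each base name.
    report = {}
    for stem in sorted(stems):
        missing = [s for s in REQUIRED_SUFFIXES if stem + s not in names]
        if missing:
            report[stem] = missing
    return report


# Example call (path as used in this post):
# print(missing_files("../tesstrain/data/Chechen-ground-truth"))
```

An empty dictionary would mean that every sample has its .box, .gt.txt, and .png file.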
The box files use the WordStr format. For example, here is the content of
the file 1.box:
WordStr 65 61 1556 254 0 #НЕКЪАШ А
65 61 1556 254 0
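As a sanity check, the fields on each box line can be inspected with a small script. The sketch below assumes the usual box layout of a symbol followed by left, bottom, right, top, and page, separated by single spaces, with an optional "#text" part on WordStr lines. The function name parse_box_line is hypothetical, and the logic is not meant to mirror how Tesseract itself parses box files.

```python
def parse_box_line(line):
    """Split one box-file line into (symbol, coords, text).

    Assumed layout: <symbol> <left> <bottom> <right> <top> <page>
    WordStr lines may carry a trailing "#text" part.
    Returns None if a non-empty symbol plus five integers cannot be found.
    """
    line = line.rstrip("\r\n")
    text = None
    if line.startswith("WordStr "):
        # Separate the transcription that follows the first "#".
        line, sep, rest = line.partition("#")
        text = rest if sep else None

    # Split from the right so that a symbol such as a tab stays intact.
    parts = line.rstrip(" ").rsplit(" ", 5)
    if len(parts) != 6 or parts[0] == "":
        return None
    try:
        coords = tuple(int(p) for p in parts[1:])
    except ValueError:
        return None
    return parts[0], coords, text


# Print what each line of a box file contains, using repr() so that
# invisible characters such as tabs show up.
def dump_box_file(path):
    with open(path, encoding="utf-8") as handle:
        for number, raw in enumerate(handle, start=1):
            print(number, repr(raw), parse_box_line(raw))
```

With the two lines exactly as pasted above, the first one parses and the second one comes back as None, because only five fields are found on it.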
The file 1.gt.txt contains the corresponding text:
НЕКЪАШ А
And here is the image:
I run the command
make training MODEL_NAME=Chechen START_MODEL=rus TESSDATA=../tesseract/tessdata
and get the following error:
set -x; \
tesseract "data/Chechen-ground-truth/1.png" data/Chechen-ground-truth/1 --psm 13 lstm.train
+ tesseract data/Chechen-ground-truth/1.png data/Chechen-ground-truth/1 --psm 13 lstm.train
Bad box coordinates in boxfile string! 65 61 1556 254 0
No block overlapping textline: НЕКЪАШ А
Failed to read pages from data/Chechen-ground-truth/1.png
Error during processing.
make: *** [Makefile:258: data/Chechen-ground-truth/1.lstmf] Error 1
I am using Tesseract v5.3.0, built from source by following these instructions: https://youtu.be/veJt3U44yqc