text2image error "Font Exocet Light failed with 223518 hits = 99.94%" when building an image file with a Diablo 2 font

I am running Tesseract on Windows 11 from the command prompt.

The text file is my training data: the words that I want to turn into images. The output feeds the next step of the Tesseract process for training my font. I am passing --find_fonts, but there is only one font in the folder.

text2image --text="C:\PythonProjects\DiabloTesseractTrainFont\text.txt" --outputbase="C:\PythonProjects\DiabloTesseractTrainFont\Output\Dia.font.exp0" --fontconfig_tmpdir="C:\PythonProjects\DiabloTesseractTrainFont" --find_fonts --fonts_dir="C:\PythonProjects\DiabloTesseractTrainFont\Diablo Fonts"

The result: Total chars = 223645 Font Exocet Light failed with 223518 hits = 99.94%
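
The two numbers in this message can be read as a coverage ratio. The quick check below is only a sketch: it assumes that "hits" counts the characters the font can render out of "Total chars", which is my reading of the output and is not confirmed in the question.

```python
# Numbers copied from the text2image output above.
total_chars = 223645
hits = 223518

# Assumption: "hits" = characters the font could render.
missing = total_chars - hits
coverage = hits / total_chars

print(missing)            # 127
print(f"{coverage:.2%}")  # 99.94%
```

Under that reading, only 127 characters out of 223645 are a problem, which points at a handful of stray characters in the text file rather than at the font itself.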

Not sure why it fails. I have built something similar to this before. I have tried with a font file that I know has worked and it does the exact same thing.

Any help would be appreciated.

There is 1 best solution below

I solved it. The text file contained characters that had been altered when I read the text into Python. I believe they were originally bullet points, but I had read the file with ASCII encoding and errors set to ignore, expecting those characters to be dropped. I was wrong: each bullet point had been replaced with something that Notepad++ shows as PAD. I found them in Notepad++, highlighted one, and replaced all of them with a space. Note that the Find field in Notepad++ looked empty when I did the replace, but it still replaced every occurrence. Now it builds just fine. I was stuck for many hours, so I hope this helps someone.
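
For anyone who would rather do the cleanup in Python than in Notepad++, here is a minimal sketch. It assumes the stray characters are invisible control characters (for example U+0080, which Notepad++ appears to label PAD); that is a guess based on the description above, not something I verified against the original file. The function names and the sample string are made up for illustration.

```python
import unicodedata
from collections import Counter

# Unicode categories treated as suspect: control, format, private use, unassigned.
SUSPECT_CATEGORIES = ("Cc", "Cf", "Co", "Cn")


def find_suspect_chars(text):
    """Count suspect characters, ignoring ordinary newlines and tabs."""
    suspects = Counter()
    for ch in text:
        if ch in "\n\r\t":
            continue
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            suspects[ch] += 1
    return suspects


def replace_suspect_chars(text, replacement=" "):
    """Return text with every suspect character swapped for `replacement`."""
    suspects = find_suspect_chars(text)
    return "".join(replacement if ch in suspects else ch for ch in text)


# Hypothetical sample standing in for a line of the training text.
sample = "Sword\x80of\x80Light\n"

found = find_suspect_chars(sample)
for ch, count in found.items():
    print(f"U+{ord(ch):04X} x{count}")  # U+0080 x2

cleaned = replace_suspect_chars(sample)
print(repr(cleaned))  # 'Sword of Light\n'
```

To apply it to the real file, read text.txt with an explicit encoding (for example `open(path, encoding="utf-8")`), run `find_suspect_chars` first to see what is actually in there, and only then write the cleaned text back out.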