tesseract false confidence decision


I am using Tesseract for serial number recognition. Tesseract can recognize text at different levels (characters, words, lines, paragraphs) and can report a confidence for each of these levels. When I looked at the confidence of each character in my serial number, I noticed that Tesseract often does not return the highest-confidence character as its best choice. Has anybody else experienced this? Am I doing something wrong in the recognition?

Example of such a situation: the correct serial number is OC2VRHT5. Look at the last character: although "5" has the higher confidence, Tesseract took "S" as the best choice.

**Tesseract output:**
symbol O, conf: 88.679855   - O conf: 88.679855
                            - 0 conf: 88.592140
                            - G conf: 77.554398
                            - C conf: 76.861900
                            - U conf: 75.981255
                            - Q conf: 75.135574
---------------------------------------------
symbol C, conf: 86.341553   - C conf: 86.341553
                            - Q conf: 71.356201
---------------------------------------------
symbol Z, conf: 77.400093   - 2 conf: 88.078430
                            - Z conf: 77.400093
---------------------------------------------
symbol V, conf: 93.404572   - V conf: 93.404572
---------------------------------------------
symbol R, conf: 93.212280   - R conf: 93.212280
---------------------------------------------
symbol H, conf: 84.634628   - H conf: 84.634628
                            - N conf: 75.782585
---------------------------------------------
symbol T, conf: 92.986008   - T conf: 92.986008
---------------------------------------------
symbol S, conf: 79.127983   - 5 conf: 84.440292
                            - S conf: 79.127983
                            - B conf: 78.667168
                            - G conf: 78.661667
---------------------------------------------

My implementation:

//Initializing tesseract
tesseract::TessBaseAPI tess;
tess.Init(NULL, "eng", tesseract::OEM_TESSERACT_ONLY);
tess.SetPageSegMode(tesseract::PSM_SINGLE_BLOCK);

tess.SetImage((uchar*) cropImage.data, cropImage.cols, cropImage.rows, 1,
            cropImage.cols);
tess.SetVariable("save_blob_choices", "T");
tess.Recognize(0);

char* out = tess.GetUTF8Text();
std::cout << out << std::endl; //=> OCZVRHTS 

tesseract::ResultIterator* ri = tess.GetIterator();
tesseract::PageIteratorLevel level = tesseract::RIL_SYMBOL;

if (ri != 0) {
    do {
        const char* symbol = ri->GetUTF8Text(level);
        float conf = ri->Confidence(level);
        if (symbol != 0) {
            printf("symbol %s, conf: %f", symbol, conf);
            bool indent = false;
            tesseract::ChoiceIterator ci(*ri);
            do {
                if (indent)
                    printf("\t \t \t");
                const char* choice = ci.GetUTF8Text();
                printf("\t- %s conf: %f\n", choice, ci.Confidence());
                indent = true;
            } while (ci.Next());
        }
        printf("---------------------------------------------\n");
        delete[] symbol;
    } while ((ri->Next(level)));
}

EDIT

At first I thought that jaka-konda's answer had solved my problem, but although the results are sometimes better, Tesseract still sometimes does not pick the highest-confidence character. Further investigation with a larger dataset is needed, but it seems that Tesseract's dictionary is not completely disabled.


There are 2 best solutions below


Although you are iterating per symbol, recognition is still performed on the whole word and takes the dictionary into account. In your example, a word with digits in the middle is very unlikely, which is why the digits are replaced with the alternatives the dictionary considers more probable (letters). To solve this, I'd recommend reducing the dictionary's influence.

Try to set these variables to false:

load_system_dawg 
load_freq_dawg
load_punc_dawg
load_number_dawg
load_unambig_dawg
load_bigram_dawg
load_fixed_length_dawgs

Tesseract FAQ: How to increase the trust in/strength of the dictionary?

Code:

GenericVector<STRING> pars_vec;
pars_vec.push_back("load_system_dawg");
pars_vec.push_back("load_freq_dawg");
pars_vec.push_back("load_punc_dawg");
pars_vec.push_back("load_number_dawg");
pars_vec.push_back("load_unambig_dawg");
pars_vec.push_back("load_bigram_dawg");
pars_vec.push_back("load_fixed_length_dawgs");

// One value per parameter name above (7 in total)
GenericVector<STRING> pars_values;
pars_values.push_back("0");
pars_values.push_back("0");
pars_values.push_back("0");
pars_values.push_back("0");
pars_values.push_back("0");
pars_values.push_back("0");
pars_values.push_back("0");

tesseract::TessBaseAPI tess; // = new tesseract::TessBaseAPI();
tess.Init(NULL, "eng", tesseract::OEM_TESSERACT_ONLY, NULL, 0, &pars_vec,
            &pars_values, false);
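
The names and values passed to Init are paired by index, so both lists need the same number of entries. A minimal sketch of one way to keep them in sync is to build both from a single list of names. The `ParamLists` struct and the `DisableDawgParams` helper are hypothetical names used only for illustration. `std::vector<std::string>` is used here so the sketch compiles without Tesseract; with the API version shown above, the entries would have to be copied into `GenericVector<STRING>` before calling Init (newer Tesseract releases, as far as I know, take `std::vector<std::string>` directly).

```cpp
#include <string>
#include <vector>

// Hypothetical container for the two parallel lists that Init expects.
struct ParamLists {
    std::vector<std::string> names;
    std::vector<std::string> values;
};

// Builds both lists from one array of names, so they always have equal length.
// The parameter names are the ones listed in the answer above.
inline ParamLists DisableDawgParams(const std::string& off_value = "0") {
    static const char* const kDawgParams[] = {
        "load_system_dawg", "load_freq_dawg",    "load_punc_dawg",
        "load_number_dawg", "load_unambig_dawg", "load_bigram_dawg",
        "load_fixed_length_dawgs",
    };
    ParamLists lists;
    for (const char* name : kDawgParams) {
        lists.names.push_back(name);
        lists.values.push_back(off_value);  // same "off" value for every name
    }
    return lists;
}
```

Because every name is pushed together with its value, a missing or extra value cannot slip in when the list of names changes.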

See also: Initializing tesseract with parameters; Tesseract-OCR API


I apologize for my late response. I tried different approaches and combinations to disable Tesseract's dictionary completely. In the end I managed to disable it in two different ways:

1. Initialize with variables (based on @Jaka Konda's answer):

GenericVector<STRING> pars_vec;
    pars_vec.push_back("load_system_dawg");
    pars_vec.push_back("load_freq_dawg");
    pars_vec.push_back("load_punc_dawg");
    pars_vec.push_back("load_number_dawg");
    pars_vec.push_back("load_unambig_dawg");
    pars_vec.push_back("load_bigram_dawg");
    pars_vec.push_back("load_fixed_length_dawgs");

    GenericVector<STRING> pars_values;
    pars_values.push_back("F");
    pars_values.push_back("F");
    pars_values.push_back("F");
    pars_values.push_back("F");
    pars_values.push_back("F");
    pars_values.push_back("F");
    pars_values.push_back("F");


    tesseract::TessBaseAPI tess; // = new tesseract::TessBaseAPI();
    tess.Init(NULL, "eng", tesseract::OEM_TESSERACT_ONLY, NULL, 0, &pars_vec,
            &pars_values, false);

2. Using config file

Since I found hardly any information on how to load a Tesseract configuration file during initialization, I want to share this code.

// Name of a config file located in the tessdata configs directory.
// A writable array is used because Init takes char**, not const char**.
char config_name[] = "disableDictionary";
char* configs[] = {config_name};
tess.Init(NULL, "eng", tesseract::OEM_TESSERACT_ONLY, configs,
        1, NULL, NULL, false);

Contents of disableDictionary in /usr/share/tessdata/configs/:

load_system_dawg    F
load_freq_dawg  F
load_punc_dawg  F
load_number_dawg    F
load_unambig_dawg   F
load_bigram_dawg    F
load_fixed_length_dawgs F

A temporary workaround was to iterate over the character choices and pick the one with the highest confidence.
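
A minimal sketch of that workaround, assuming the alternatives for each symbol have already been copied out of tesseract::ChoiceIterator (via GetUTF8Text() and Confidence()) into a plain struct. `SymbolChoice`, `HighestConfidence` and `RebuildText` are hypothetical names, not part of the Tesseract API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One alternative for a symbol position (hypothetical struct, filled by the
// caller from ChoiceIterator::GetUTF8Text() and ChoiceIterator::Confidence()).
struct SymbolChoice {
    std::string text;
    float confidence;
};

// Returns the alternative with the highest confidence.
// On ties the earlier entry wins. Precondition: choices is not empty.
inline const SymbolChoice& HighestConfidence(
        const std::vector<SymbolChoice>& choices) {
    assert(!choices.empty());
    const SymbolChoice* best = &choices.front();
    for (const SymbolChoice& c : choices) {
        if (c.confidence > best->confidence) best = &c;
    }
    return *best;
}

// Rebuilds the text by taking the highest-confidence alternative per symbol.
inline std::string RebuildText(
        const std::vector<std::vector<SymbolChoice>>& symbols) {
    std::string out;
    for (const std::vector<SymbolChoice>& choices : symbols) {
        if (choices.empty()) continue;  // no alternatives at this position
        out += HighestConfidence(choices).text;
    }
    return out;
}
```

With the alternatives from the question's output, the last position (5: 84.44, S: 79.13, B: 78.67, G: 78.66) resolves to "5" and the third position (2: 88.08, Z: 77.40) resolves to "2". Note that this ignores word-level context entirely, which suits serial numbers but may hurt ordinary text.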

It was also interesting that tesseract::ChoiceIterator broke when the symbol was "" (empty). I therefore changed the if-condition from the APIExample code on the project homepage to

if (symbol != 0 && strlen(symbol) != 0){...}
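
The same guard can be pulled into a small helper so it is applied consistently wherever a ChoiceIterator is constructed. This is only a sketch; `HasSymbolText` is a hypothetical name.

```cpp
#include <cstring>

// True when the iterator returned a usable, non-empty symbol string.
// Checks for a null pointer first, then for an empty C string.
inline bool HasSymbolText(const char* symbol) {
    return symbol != nullptr && std::strlen(symbol) != 0;
}
```

In the loop from the question, the condition would then read `if (HasSymbolText(symbol)) {...}`.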