JavaScript OCR with Tesseract.js: error copying the number after a recognized string


I'm working on a project where you give the program an image and, using OCR in JavaScript, the program detects (recognizes) a string or a word, for example 'رقم العداد', and copies the number (an integer) that follows it, separated by spaces, like this:

7038842  رقم العداد

That is all it does. I'm using Tesseract.js (Tesseract.recognize) to recognize the string, but at first I got this error:

Uncaught (in promise)

After some trial and error, it turned out that Tesseract fails to recognize some Arabic letters correctly. I printed all the text detected from the image and found that the string ['نقطة الخدمة'] is recognized as ['ننطة الخدمة'] and ['رقم العداد'] as ['رم العداد']. So I used the misread spellings with the string.match method to find each word and copy the number after it. The number returned for ['رم العداد'] was correct, but for some reason the code does not copy the number written after ['ننطة الخدمة']. I tried adding spaces and tabs to the pattern, but the problem stayed the same, so I decided to ask for help. What am I missing?
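For anyone reproducing the error, here is a minimal sketch of how a misread label can lead to "Uncaught (in promise)". The OCR text below is a made-up sample (an assumption, not the real output from the form). When the pattern is not found, match() returns null, and reading [1] from null throws a TypeError; inside .then() that rejects the promise.

```javascript
// Hypothetical OCR output: the label was misread as 'رم العداد'.
const ocrText = 'رم العداد 7038842';

// Searching for the correctly spelled label finds nothing, so match is null.
const match = ocrText.match(new RegExp('رقم العداد' + '\\s+(\\w+)'));
console.log(match);

// Indexing null throws a TypeError; inside .then() this rejects the promise.
let errorName = null;
try {
  match[1];
} catch (err) {
  errorName = err.name;
}
console.log(errorName);
```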

The code:

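<script>
  Tesseract.recognize(
    'form.png',
    'ara',
    { logger: m => console.log(m) }
  ).then(({ data: { text } }) => {
    console.log(text);
    // Labels spelled the way Tesseract returns them, not as printed on the form.
    const info = ['ننطة الخدمة', 'رم العداد', 'القراءة'];
    for (let k = 0; k < info.length; k++) {
      // Use info[k] so every label is searched; the original always used info[0] ('نقطة الخدمة').
      const match = text.match(new RegExp(info[k] + '\\s+(\\w+)'));
      // match() returns null when nothing is found; indexing null would throw.
      const result = match ? match[1] : null;
      alert(result);
    }
  });
</script>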
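A possible way to make the extraction more robust, as a rough sketch only. The helper names (escapeRegExp, numberAfter) and the sample text are hypothetical and not part of Tesseract.js. It assumes the number comes after the label in the recognized text and is written in either ASCII digits or Arabic-Indic digits (U+0660 to U+0669); note that \w in a JavaScript regex does not match Arabic-Indic digits. Whether either assumption holds for the real form is not known from the question.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Return the run of digits that follows `label`, or null if it is not found.
// Accepts ASCII digits and Arabic-Indic digits (U+0660 to U+0669).
function numberAfter(text, label) {
  const pattern = new RegExp(escapeRegExp(label) + '\\s+([0-9\\u0660-\\u0669]+)');
  const m = text.match(pattern);
  return m ? m[1] : null;
}

// Hypothetical OCR output, for illustration only.
const sample = 'ننطة الخدمة 12345\nرم العداد 7038842';
console.log(numberAfter(sample, 'رم العداد'));
console.log(numberAfter(sample, 'ننطة الخدمة'));
console.log(numberAfter(sample, 'القراءة'));
```

Returning null instead of indexing the match directly means a missing label can be reported or skipped rather than rejecting the promise.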

The project image: [image not available]
