I'm currently trying to use tesseract.js in Angular to run recognition on images that have previously been modified with opencv.js.
Image manipulation via opencv.js is working really well now, but I can't figure out what's wrong with my various attempts with tesseract.js...
When I follow tutorials from the web, it works fine and I can perform OCR on the default example image, for example (only the relevant part):
const exampleImage = 'https://tesseract.projectnaptha.com/img/eng_bw.png';
const worker = Tesseract.createWorker({
  logger: m => console.log(m)
});
Tesseract.setLogging(true);
work();
async function work() {
  await worker.load();
  await worker.loadLanguage('eng');
  await worker.initialize('eng');
  let result = await worker.detect(exampleImage);
  console.log(result.data);
  await worker.terminate();
}
But when I try to do the same with an image previously processed via opencv.js, whether passed as a cv.Mat or via the resulting HTML canvas, I always get the same error:
tesseract.js error : TypeError: Cannot read property 'SetImage' of null
I also get this error : Error in pixReadMem: size < 12
I don't really understand what I'm doing wrong. I suspect the problem is in the way I pass the picture to tesseract, but none of the ways I've tried worked, so I'm here to ask for your help.
Example of code that does not work:
const worker = Tesseract.createWorker({
  logger: m => console.log(m)
});
Tesseract.setLogging(true);
work(onlyDocument);
async function work(d) {
  await worker.load();
  const ctx = document.getElementById('result').getContext('2d');
  const buffer = ctx.getImageData(0, 0, ctx.canvas.width, ctx.canvas.height).data.buffer;
  const result2 = await worker.detect(buffer);
  console.log(result2.data);
  await worker.terminate();
}
I should add that I tried every format I could think of to pass that image to tesseract.js (buffer, the canvas, array, ...).
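You need to initialize the Tesseract API (load a language, then call initialize) before performing any OCR task. Your failing code goes straight from worker.load() to worker.detect(), so the API object is still null, which is what the "Cannot read property 'SetImage' of null" error is reporting.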
Solution:
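A minimal sketch of what that could look like, reusing the same calls as the tutorial sample in the question. The helper name `detectFromCanvas` is hypothetical (it is not part of tesseract.js), and the worker is assumed to be the older-style worker from the question, with separate `load`, `loadLanguage` and `initialize` steps. Passing the canvas element itself rather than the raw `getImageData(...).data.buffer` is also an assumption on my side: tesseract.js lists canvas elements among the inputs it accepts in the browser, whereas that buffer holds unencoded RGBA pixels, which may be what the `pixReadMem` message is complaining about.

```javascript
// Sketch only: `detectFromCanvas` is a hypothetical helper, not a tesseract.js API.
// `worker` is assumed to be created as in the question:
//   const worker = Tesseract.createWorker({ logger: m => console.log(m) });
async function detectFromCanvas(worker, canvas, lang = 'eng') {
  await worker.load();              // load the tesseract core
  await worker.loadLanguage(lang);  // fetch the language data
  await worker.initialize(lang);    // initialize the API (the step missing in the failing code)
  try {
    // Assumption: the canvas element is handed over directly, not its raw pixel buffer.
    const result = await worker.detect(canvas);
    return result.data;
  } finally {
    // Release the worker even if detection throws.
    await worker.terminate();
  }
}

// Possible usage in the browser, with the canvas id taken from the question:
// detectFromCanvas(worker, document.getElementById('result')).then(console.log);
```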
After initialization, as long as the input to the API is image-like, it should work regardless of whether the image is pre-processed or unprocessed. Hope this solves your problem.
P.S.: The tutorial sample had the API initialized and hence no errors were thrown.