OpenCV matchTemplate accuracy


I am trying to get the locations of the options in some images of questions using OpenCV's matchTemplate. I tried OCR with bounding boxes, but it takes nearly 10 seconds to compute, so I decided to try matchTemplate instead. It is a lot faster, but not very accurate. Here are my images and my code:

const cv = require('@u4/opencv4nodejs');
const fs = require('fs');
const color = new cv.Vec3(255, 255, 255);

async function opencvGetPositions(imageData, path, answers) {
  const mat = cv.imdecode(imageData);

  // imdecode yields BGR, so convert with COLOR_BGR2GRAY
  let modifiedMat = mat.cvtColor(cv.COLOR_BGR2GRAY);
  modifiedMat = modifiedMat.threshold(0, 255, cv.THRESH_OTSU);
  modifiedMat = modifiedMat.bitwiseNot();

  // answers is an array of cv.Mat templates, already grayscale
  for (let i = 0; i < answers.length; i++) {
    const ww = answers[i].sizes[1];
    const hh = answers[i].sizes[0];

    // TM_SQDIFF: the best match is at the minimum of the result matrix
    const matched = modifiedMat.matchTemplate(answers[i], cv.TM_SQDIFF);
    const loc = matched.minMaxLoc().minLoc;

    const pt1 = new cv.Point(loc.x, loc.y);
    const pt2 = new cv.Point(loc.x + ww, loc.y + hh);
    modifiedMat.drawRectangle(pt1, pt2, color, 2);
  }

  cv.imwrite(path + '/output.png', modifiedMat);
}
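One way to catch bad matches is to use a normalized method (e.g. TM_SQDIFF_NORMED, whose scores fall in [0, 1]) and reject any location whose score is too high. The scan itself is just a minimum search over the result matrix; here is a plain-JavaScript illustration over a nested-array result (the `bestMatch` helper and the threshold value are made up for the example):

```javascript
// Find the best (lowest) score in a 2D result matrix and reject it
// if it exceeds a confidence threshold. This mirrors what
// minMaxLoc() reports for a TM_SQDIFF_NORMED result, where lower
// scores mean better matches.
function bestMatch(result, maxScore) {
  let best = { val: Infinity, x: -1, y: -1 };
  for (let y = 0; y < result.length; y++) {
    for (let x = 0; x < result[y].length; x++) {
      if (result[y][x] < best.val) {
        best = { val: result[y][x], x, y };
      }
    }
  }
  // null means "no confident match" instead of a wrong rectangle
  return best.val <= maxScore ? best : null;
}
```

With opencv4nodejs itself, the same information comes from `matched.minMaxLoc()`; the point is to compare `minVal` against a threshold before drawing the rectangle, rather than always trusting `minLoc`.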

I'm using Node.js and the @u4/opencv4nodejs package.

The answers array consists of these images:

[five cropped option-label images]

Applied to:

[question image with detected rectangles]

This image is mostly accurate, probably because I cropped the options from this one. I guess there are slight differences between each question's options, and maybe that causes the inaccuracy.

[result image]

But most of the images are very inaccurate, like this one.

So is there a better way to do this, or some way to make the matchTemplate function more accurate?

And here are the images without any modifications:

[two unmodified question images]

1 Answer

BleedFeed:

Fixed it by standardizing the option images to fit all the questions and by computing matches at several scales in a loop. I found a way to run matchTemplate() at different scales and keep the best result:

let found = {};

// linspace(start, end, n) is a small helper returning n evenly
// spaced scale factors between start and end (not a built-in)
for (const scale of linspace(0.7, 1.1, 30)) {
  const resizedQuestion = whiteSpacedMat.resize(0, 0, scale, scale);
  const scaleRatio = whiteSpacedMat.sizes[0] / resizedQuestion.sizes[0];

  // stop once the resized question is smaller than the template
  if (resizedQuestion.sizes[0] < grayAnswer.sizes[0] ||
      resizedQuestion.sizes[1] < grayAnswer.sizes[1]) {
    break;
  }

  const edged = resizedQuestion.canny(50, 200);
  const result = edged.matchTemplate(answerEdges, cv.TM_SQDIFF_NORMED);
  const { minVal, minLoc } = cv.minMaxLoc(result);

  // TM_SQDIFF_NORMED: lower is better, so keep the smallest minVal
  if (found.val === undefined || found.val > minVal) {
    found = { val: minVal, loc: minLoc, scale: scaleRatio };
  }
}
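The `linspace` used above isn't a JavaScript built-in; a minimal helper (assuming inclusive endpoints, like NumPy's `linspace`) could look like this:

```javascript
// Returns n evenly spaced values from start to end, inclusive.
function linspace(start, end, n) {
  if (n === 1) return [start];
  const step = (end - start) / (n - 1);
  return Array.from({ length: n }, (_, i) => start + i * step);
}
```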

I am now using Canny edge images instead of thresholded binary ones; I don't know whether that contributed to the accuracy.

After the loop it was a lot better, but there were still some issues, like confusing option B with E, and still some inaccuracy. So I figured that if I somehow standardized the space around the options, they would be easier to find. I first added 30 pixels to the left of the image, because option A) didn't have much space on its left:

// copyMakeBorder returns a new Mat, so keep the result
mat = mat.copyMakeBorder(0, 0, 30, 0, cv.BORDER_CONSTANT, new cv.Vec3(255, 255, 255));

Then I added some space between the text lines the same way, and increased the space around the template images. Now it no longer recognizes stray text as an option, because it is looking for text areas that have a lot of space around them.
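As a plain-array illustration of what that uniform margin does to each template (in the real code this would be `copyMakeBorder` on each template Mat; the `pad` helper and the margin value are made up for the example):

```javascript
// Pad a 2D array (rows of pixel values) with a uniform margin of
// `value` on all four sides -- the array analogue of copyMakeBorder
// with cv.BORDER_CONSTANT.
function pad(img, margin, value) {
  const width = img[0].length + 2 * margin;
  const blankRow = new Array(width).fill(value);
  const padded = [];
  for (let i = 0; i < margin; i++) padded.push([...blankRow]);
  for (const row of img) {
    padded.push([
      ...new Array(margin).fill(value),
      ...row,
      ...new Array(margin).fill(value),
    ]);
  }
  for (let i = 0; i < margin; i++) padded.push([...blankRow]);
  return padded;
}
```

Giving every template the same white margin means each option is matched together with the whitespace around it, which is what stops dense text regions from scoring well.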

[padded question image]

And this is what the results look like:

[two result images]