Refining a CAPTCHA that still has a little noise

I'm trying to crack a particular web CAPTCHA. I plan to do it by segmenting the characters and passing them to an ANN (for features I will mostly use the method of moments, since it seems difficult to remove the noise completely).

The CAPTCHA is very noisy, and unfortunately there is no color difference between the noise and the actual text, so separation based on color will not work. After quite some thought, I implemented a flood-fill style algorithm on the pixels of the CAPTCHA to find and remove small disconnected components, and ended up with something like this: [image: CAPTCHA after considerable noise removal]
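A minimal sketch of the kind of flood-fill clean-up described above, not the asker's actual code. It assumes the image is already binarised into a list of rows of 0/1 integers with 1 meaning a dark pixel, uses 4-connectivity, and takes a hypothetical `min_size` threshold below which a component is treated as noise:

```python
from collections import deque


def remove_small_components(img, min_size):
    """Return a copy of img with dark 4-connected components
    smaller than min_size pixels cleared to 0.

    img is a list of rows of 0/1 ints (1 = dark); this layout is an
    assumption made for the sketch.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in img]

    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill collecting one connected component.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and img[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Small components are assumed to be noise and erased.
            if len(component) < min_size:
                for cy, cx in component:
                    out[cy][cx] = 0
    return out
```

Noise that touches a character belongs to the same component as the character, which is why this step alone cannot remove it.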

Most of the noise is gone, but some of it is left around the letters themselves (since it is touching the text). I'm not an expert on image filters, and I'm finding it very difficult to find the right filter to reduce the remaining noise and enhance the characters. Any ideas on what filter(s) I could use for this purpose?

(Note: I'm not using any image manipulation tool/library for this. I'm writing raw pixel manipulation code, but I can implement most filters given their convolution kernel)

The problem is that due to this noise, it is becoming difficult to segment the characters. Clearly trying to find vertical lines with no dark pixels is not going to work, since there is noise and some of the letters are touching. Any ideas on how I could segment these efficiently?

EDIT: Original image: [image: original CAPTCHA]

There is 1 answer below

What about trying morphological operators such as closing and opening? They are very easy to implement, and they are a simple but effective tool.

After one closing with a 3x3 cross structuring element (kernel), followed by binarising the image, the noise is almost gone:

[image: CAPTCHA after one closing and binarisation]

I am sure a bit more experimentation will give great results.

Edit: to clear things up a little, a closing is a dilation followed by an erosion (the other way around for an opening). A dilation assigns every pixel in your image the maximum value of all pixels under the kernel (structuring element) centred on it; conversely, an erosion assigns every pixel the minimum value of all pixels under the kernel. With dark text on a light background, a closing therefore removes dark specks that are smaller than the kernel.
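The definitions above can be sketched in plain Python without any image library. This is only an illustration under stated assumptions: the image is a list of rows of greyscale values (0 = black, 255 = white), neighbours that fall outside the image are simply ignored, and the threshold of 128 in `binarise` is an arbitrary choice. All function names are made up for this example.

```python
# Offsets (dy, dx) of a 3x3 cross structuring element, centre included.
CROSS = ((0, 0), (-1, 0), (1, 0), (0, -1), (0, 1))


def _morph(img, pick):
    """Apply pick (max or min) over the cross neighbourhood of each pixel.

    Neighbours outside the image are skipped; this border handling is an
    assumption of the sketch.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            values = [img[y + dy][x + dx]
                      for dy, dx in CROSS
                      if 0 <= y + dy < h and 0 <= x + dx < w]
            row.append(pick(values))
        out.append(row)
    return out


def dilate(img):
    # Each pixel becomes the maximum under the kernel.
    return _morph(img, max)


def erode(img):
    # Each pixel becomes the minimum under the kernel.
    return _morph(img, min)


def closing(img):
    # Dilation followed by erosion.
    return erode(dilate(img))


def opening(img):
    # Erosion followed by dilation.
    return dilate(erode(img))


def binarise(img, threshold=128):
    # Pixels at or above the threshold become white, the rest black.
    return [[255 if v >= threshold else 0 for v in row] for row in img]
```

A call such as `binarise(closing(pixels))` corresponds to the single closing plus binarisation described above. Note that with this polarity, dark strokes only one pixel wide would be erased along with the noise, so the kernel size has to stay below the stroke width of the characters.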

Also take a look at the Wikipedia article on mathematical morphology and the external links it lists.