Removing curved line from captcha


I am trying to read a captcha generated by SimpleCaptcha:

captcha

I've managed to remove the gradient and color:

import cv2 as cv
import numpy as np
import PyDIP as dip
import pytesseract

img = cv.imread('capt.jpg')
img = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
img = cv.adaptiveThreshold(img, 255, cv.ADAPTIVE_THRESH_GAUSSIAN_C,
                           cv.THRESH_BINARY, 11, 2)

captcha bw

However, I can't remove the curved line or fill the letters.
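
For the letter-filling part, one possibility is a border flood fill: any background pixel that cannot be reached from the image border is enclosed by a stroke and can be filled. This is only a minimal sketch, not something verified on this captcha. It assumes strokes are white (255) on black (0), so the thresholded image would need to be inverted first, and that each letter outline is closed. The function name `fill_enclosed_regions` is made up for illustration.

```python
from collections import deque

import numpy as np


def fill_enclosed_regions(binary):
    """Fill background regions that are not connected to the image border.

    `binary` is a 2-D uint8 array with strokes = 255 and background = 0.
    Returns a new array; the input is left unchanged.
    """
    h, w = binary.shape
    background = binary == 0
    reachable = np.zeros((h, w), dtype=bool)
    queue = deque()

    # Seed the search with every background pixel on the image border.
    border = [(y, x) for y in range(h) for x in (0, w - 1)]
    border += [(y, x) for x in range(w) for y in (0, h - 1)]
    for y, x in border:
        if background[y, x] and not reachable[y, x]:
            reachable[y, x] = True
            queue.append((y, x))

    # 4-connected breadth-first search through the outer background.
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w
                    and background[ny, nx] and not reachable[ny, nx]):
                reachable[ny, nx] = True
                queue.append((ny, nx))

    # Background that was never reached is enclosed by strokes: fill it.
    filled = binary.copy()
    filled[background & ~reachable] = 255
    return filled
```

With the image from the threshold step this would be called as `fill_enclosed_regions(255 - img)`. Regions enclosed between the curved line and a letter would get filled too, so it probably only helps once the line is gone.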

I have tried path opening with PyDIP (code taken from elsewhere) and got this:

lines = dip.PathOpening(img, length=400, mode={'constrained'})
img = img-lines
img = 255 - img 

No lines

lines = np.array(lines)

Detected lines

And with this other method I get:

# image is the previous img

gray = cv.cvtColor(image,cv.COLOR_BGR2GRAY)
thresh = cv.threshold(gray, 0, 255, cv.THRESH_BINARY_INV + cv.THRESH_OTSU)[1]

horizontal_kernel = cv.getStructuringElement(cv.MORPH_RECT, (25,1))
detected_lines = cv.morphologyEx(thresh, cv.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv.findContours(detected_lines, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv.drawContours(image, [c], -1, (255,255,255), 2)

repair_kernel = cv.getStructuringElement(cv.MORPH_RECT, (1,6))
result = 255 - cv.morphologyEx(255 - image, cv.MORPH_CLOSE, repair_kernel, iterations=1)

# results:

No lines

# detected_lines:

Detected lines
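
A (25,1) kernel only responds to long, straight horizontal runs, so the sloped parts of a curve are not picked up. Another thing that could be tried is a run-length filter: in each column, clear any foreground run that is at most a few pixels tall. This is a rough sketch under strong assumptions: strokes are white (255) on black (0), and the curve is thinner than the letter strokes, which may not hold for the outlined letters produced by the adaptive threshold. `remove_thin_vertical_runs` and `max_thickness` are hypothetical names.

```python
import numpy as np


def remove_thin_vertical_runs(binary, max_thickness=2):
    """Clear foreground pixels that lie in short vertical runs.

    `binary` is a 2-D uint8 array with strokes = 255 and background = 0.
    A run is a maximal stretch of consecutive foreground pixels in one column.
    Runs of at most `max_thickness` pixels are set to 0 in the returned copy.
    """
    out = binary.copy()
    h, w = binary.shape
    for x in range(w):
        column = binary[:, x] > 0
        y = 0
        while y < h:
            if not column[y]:
                y += 1
                continue
            # Walk to the end of the current foreground run.
            start = y
            while y < h and column[y]:
                y += 1
            if y - start <= max_thickness:
                out[start:y, x] = 0
    return out
```

Steep parts of the curve form long vertical runs and would survive, and thin horizontal letter strokes would be removed, so the threshold would need tuning per image.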

I'm trying to read the text with:

captcha_val=pytesseract.image_to_string(img)

This is the source of the captcha. Any help?
