Reading text from a picture with a lot of color shades


Hello,

How can I extract text data from this picture? [Picture from which I want to export text data]

I don't have much experience with data post-processing. For a few days I have been trying to extract text data from the picture below with the OpenCV library in Python.

The perfect output from my Python script would be:

42 Gandalf le Gris 2,247/2,300 2,035/2,200 1,068/1,100 274,232/285,800
35 Gorbag 7/100 355/1,250 37,530/207,500

The order doesn't really matter.
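Even a rough text dump would be fine; I expect to tidy it up afterwards with a small parser along these lines (the sample line and regex below are only an illustration, not tested against real OCR output):

import re

# Illustration only: split one OCR'd row into its label and "current/max" pairs
line = "42 Gandalf le Gris 2,247/2,300 2,035/2,200 1,068/1,100 274,232/285,800"
pairs = re.findall(r'[\d,]+/[\d,]+', line)              # ['2,247/2,300', '2,035/2,200', ...]
label = re.sub(r'\s*[\d,]+/[\d,]+', '', line).strip()   # '42 Gandalf le Gris'
print(label, pairs)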

I tried several pieces of code with different parameters to get a result, but I'm not sure I'm going about it the right way (especially for the numbers).

  1. Increase the contrast of the picture:
# Imports used by all the snippets below
import cv2
import numpy as np
import pytesseract

image = cv2.imread('screenshot.png')  # placeholder path to the attached picture

# Equalize the L channel with CLAHE to boost local contrast
lab = cv2.cvtColor(image, cv2.COLOR_BGR2LAB)
l_channel, a, b = cv2.split(lab)

clahe = cv2.createCLAHE(clipLimit=10.0, tileGridSize=(6, 6))
cl = clahe.apply(l_channel)

limg = cv2.merge((cl, a, b))
image = cv2.cvtColor(limg, cv2.COLOR_LAB2BGR)
  2. Use edge detection with different values:
# Sweep the Canny thresholds and run OCR on each edge map
image_1 = image.copy()
i = 0
for a in range(1000):
    i += 3
    edges = cv2.Canny(image_1, 100 + i, 100 + i)
    data = pytesseract.image_to_string(edges, lang='eng', config='--psm 6')
  3. Beforehand, build a table of the BGR colors of all the pixels I consider useful and use OpenCV to replace them with a single white color (this takes some time to process) to make the text easier to extract (see also the inRange variation after this list):
# 'colors' holds "R,G,B" strings collected beforehand
for color in colors:
    rgb = color.split(',')
    # Match pixels with this exact BGR value and paint them white
    image[np.all(image == (int(rgb[2]), int(rgb[1]), int(rgb[0])), axis=-1)] = (255, 255, 255)
  4. Convert the image to grayscale, threshold, and invert:
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3, 3), 0)  # a (1,1) kernel is a no-op
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Morph open to remove noise, then invert back to dark text on white
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)  # iterations=0 would change nothing
invert = 255 - opening

# Perform text extraction on the original and on the cleaned-up image
# (my earlier version also ran OCR on the kernel itself, which cannot work)
text_raw = pytesseract.image_to_string(image, lang='eng', config='--psm 6')
text_clean = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
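A variation on step 3 that avoids listing every exact color: mask anything within a tolerance of the text color with cv2.inRange and paint the rest white. The sample color and tolerance below are made-up values, not measured from the screenshot:

# Keep pixels close to the (assumed) text color, blank out everything else
text_bgr = np.array([230, 230, 230])  # hypothetical near-white text color
tol = 40                              # hypothetical per-channel tolerance

lower = np.clip(text_bgr - tol, 0, 255).astype(np.uint8)
upper = np.clip(text_bgr + tol, 0, 255).astype(np.uint8)
mask = cv2.inRange(image, lower, upper)  # 255 where a pixel is "text-like"

result = np.full_like(image, 255)  # white page
result[mask > 0] = (0, 0, 0)       # black text for Tesseract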

None of these approaches (combined and tried with different parameters) produces a good result. I think the main problems are:

  • The strokes of the numbers are too thin
  • The color of the numbers is too close to the background color
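Would it help to upscale the image before thresholding and then thicken the strokes by dilating? A sketch of that idea (the scale factor and kernel size are guesses):

# Enlarge thin glyphs, binarize, then fatten the strokes slightly
big = cv2.resize(gray, None, fx=3, fy=3, interpolation=cv2.INTER_CUBIC)
thresh = cv2.threshold(big, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
thick = cv2.dilate(thresh, kernel, iterations=1)
text = pytesseract.image_to_string(255 - thick, lang='eng', config='--psm 6')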

Do you think it is possible to extract this text reliably?

1 Answer


I have read your query. I would recommend using a text detection model with text angle classification, and then extracting the text with OCR. Text detection restricts processing to the parts of the image that actually contain text, so if you apply image enhancement only to those detected regions, you may get good results.

I would also recommend PaddleOCR. I ran inference on your image using its text detection, angle classification, and text recognition models, and the results seem promising. A minimal sketch of that pipeline is below.
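This sketch follows the classic PaddleOCR Python API; argument names can differ between versions, so treat it as a starting point rather than exact code:

from paddleocr import PaddleOCR

# Detection + angle classification + recognition in one call
ocr = PaddleOCR(use_angle_cls=True, lang='en')
result = ocr.ocr('screenshot.png', cls=True)  # placeholder path to your image

for line in result[0]:  # each entry: [bounding box, (text, confidence)]
    box, (text, confidence) = line
    print(text, confidence)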

Text extraction result on your attached image: [OCR result screenshot]