Poor Result in Detecting Meter Reading Using Pytesseract

552 Views Asked by Rashida At 14 December 2022 at 01:13

I am trying to develop a meter reading detection system. This is the picture

I need to get the meter reading 27599 as the output. I used this code:

import pytesseract
import cv2

image = cv2.imread('read2.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
(H, W) = gray.shape

rectKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (20, 7))

gray = cv2.GaussianBlur(gray, (1, 3), 0)
blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, rectKernel)

res = cv2.threshold(blackhat, 0, 255, cv2.THRESH_BINARY_INV  + cv2.THRESH_OTSU)[1]

pytesseract.image_to_string(res, config='--psm 12 --oem 3 digits')

I get this output:

'.\n\n-\n\n3\n\n7\n\n7\n\n3\n\n-2105 566.261586\n\n161200\n\n310010\n\n--\n\n.-\n\n.\n\n5\n\x0c'

This is my first OCR project. Any help will be appreciated.


M.Armoun On 14 December 2022 at 12:21 BEST ANSWER

Well, the picture contains a lot of text that can be removed before we start reading the actual meter number. We can also limit the OCR to digits only, to prevent false positives (a few 7-segment digits look like alphabetical letters).

Since Tesseract does not work well enough on 7-segment digits, I will use EasyOCR.

So the procedure would be like this:

1. There are large margins around the actual counter, which can be cropped away.
2. We blur the image and run a Hough circle transform to find the circular meter.
3. We know for sure that the number is in the upper half of that circle, so we crop again based on the center and radius of the detected circle.
4. The cropped image can then be fed to EasyOCR, restricted (as mentioned above) to the English model and to digits only.
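The second crop is plain arithmetic on the circle's center and radius. As a minimal sketch (the helper name `upper_half_box` and the sample numbers are made up for illustration), the bounds of the upper half can be computed and clamped to the image like this. Note that the answer's code keeps the full width of the crop; trimming left and right as well is an extra assumption here.

```python
def upper_half_box(cx, cy, r, width, height):
    """Return (top, bottom, left, right) for the upper half of a circle,
    clamped so the box never leaves a width x height image."""
    top = max(cy - r, 0)              # top edge of the circle
    bottom = min(max(cy, 0), height)  # stop at the circle's center row
    left = max(cx - r, 0)
    right = min(cx + r, width)
    return top, bottom, left, right


# Hypothetical circle: center (240, 260), radius 200, inside a 480x550 crop.
print(upper_half_box(240, 260, 200, 480, 550))  # (60, 260, 40, 440)
```

The returned values can be used directly as slice bounds, e.g. `image[top:bottom, left:right]`. Clamping matters when the detected circle pokes out of the crop, since a negative start index would silently slice from the wrong end of the array.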

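import cv2 as cv
import numpy as np
import easyocr

# Load the photo (file name taken from the question; adjust as needed).
orig_image = cv.imread('read2.jpg')
if orig_image is None:
    raise FileNotFoundError("could not read 'read2.jpg'")

# Step 1: crop away the margins around the meter (values tuned for this photo).
cropped = orig_image[300:850, 200:680]

# Step 2: blur heavily so the Hough transform sees the dial outline, not the fine print.
gray = cv.cvtColor(cropped, cv.COLOR_BGR2GRAY)
blurred = cv.GaussianBlur(gray, (17, 17), 0)

minDist = 100
param1 = 30
param2 = 50
minRadius = 100
maxRadius = 300

circles = cv.HoughCircles(blurred, cv.HOUGH_GRADIENT, 1, minDist, param1=param1, param2=param2, minRadius=minRadius, maxRadius=maxRadius)
if circles is None:
    raise RuntimeError("no circle detected; try different Hough parameters")

# Round to signed ints: with uint16 the subtraction below could wrap around.
circles = np.around(circles[0]).astype(int)
print(f"{len(circles)} circles detected", circles[0])

# Draw every detected circle on a copy, for visual inspection only.
circle_img = cropped.copy()
for x, y, r in circles:
    cv.circle(circle_img, (int(x), int(y)), int(r), (0, 255, 0), 2)

circle = circles[0]
circle_center = (circle[0], circle[1])  # x, y
circle_radius = circle[2]

# Step 3: keep only the upper half of the circle, where the reading is.
top = max(circle_center[1] - circle_radius, 0)
color_cropped = cropped[top:circle_center[1], :]

# Step 4: OCR restricted to digits.
reader = easyocr.Reader(['en'], gpu=False)
result = reader.readtext(color_cropped, allowlist='0123456789')
if result:
    print("detected number: ", result[0][1])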

detected number: 27599
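
The code above prints the first detection, which works for this photo. On other photos the crop may still contain stray digits, so the first entry is not guaranteed to be the reading. As a rough sketch, assuming the default `readtext` output of `(bbox, text, confidence)` tuples, one could pick the longest digit string among reasonably confident detections. The helper `pick_reading`, the 0.3 threshold, and the sample data are all made up for illustration; "longest wins" is a heuristic, not something EasyOCR provides.

```python
def pick_reading(results, min_conf=0.3):
    """Return the longest all-digit text among confident detections, or None."""
    candidates = [
        (text, conf)
        for _bbox, text, conf in results
        if text.isdigit() and conf >= min_conf
    ]
    if not candidates:
        return None
    # Prefer more digits first, then higher confidence as a tie-breaker.
    best = max(candidates, key=lambda c: (len(c[0]), c[1]))
    return best[0]


# Made-up detections in the (bbox, text, confidence) shape.
sample = [
    ([[0, 0], [10, 0], [10, 5], [0, 5]], "3", 0.41),
    ([[20, 10], [120, 10], [120, 40], [20, 40]], "27599", 0.88),
    ([[5, 50], [15, 50], [15, 60], [5, 60]], "7", 0.12),
]
print(pick_reading(sample))  # 27599
```

With the answer's code this would be called as `pick_reading(result)` in place of `result[0][1]`.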
