Based on my previous question, I have an LCD screen that I need to read from. The image I use is:
The code crops predetermined regions and sends them to pytesseract to read. Most of the text on the box itself is quite readable, and so is the screen when other things are shown. The image that gets sent to pytesseract is this:
To me this image seems pretty normal and should be readable by OCR, but this is the output I get:
RREEE4
My code for this is:
import cv2
import pytesseract
import numpy as np

image = cv2.imread("test_python_screen3.png")
fname = "test_screen3.txt"
custom_config2 = r"--psm 7 --oem 3 -c tessedit_char_whitelist='GSCTEPVNR#.0123456789 '"
kernel = np.ones((4, 4), np.uint8)

def crop_image(image, y_start, y_end, x_start, x_end):
    cropped_image = image[y_start:y_end, x_start:x_end]
    return cropped_image

def grayscale(image):
    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return gray_image

def brightness_contrast(image, a, b):
    # alpha = contrast, beta = brightness
    result = cv2.convertScaleAbs(image, alpha=a, beta=b)
    return result

def thresholding(image):
    bw_image = cv2.threshold(image, 120, 255, cv2.THRESH_BINARY)[1]
    return bw_image

def rescale(image, scale_percent):
    width = int(image.shape[1] * scale_percent / 100)
    height = int(image.shape[0] * scale_percent / 100)
    dim = (width, height)
    image_rescaled = cv2.resize(image, dim, interpolation=cv2.INTER_AREA)
    return image_rescaled

out = crop_image(image, 285, 415, 410, 920)
gray = grayscale(out)
bc = brightness_contrast(gray, a=2.4, b=50)
thr = ~thresholding(bc)  # invert so the text is white: cv2.dilate expands white pixels
img_dilation = cv2.dilate(thr, kernel, iterations=1)
img_resized = ~rescale(img_dilation, 60)  # invert again: black text on white BG for Tesseract

cv2.imshow("2", img_resized)
cv2.waitKey(0)  # imshow needs a waitKey call to actually render the window
cv2.imwrite("4.jpg", img_resized)

results = pytesseract.image_to_string(img_resized, config=custom_config2)
results2 = results.encode('latin-1', 'replace').decode('latin-1')
print(results2)

with open(fname, 'a') as f:
    f.writelines("Section2 \n")
    f.writelines(results2)
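As a sanity check on the invert → dilate → invert step above, the same logic can be reproduced on a tiny array with NumPy alone. This is only a sketch: the `dilate` helper here is a hypothetical NumPy-only stand-in for `cv2.dilate`, written to confirm that dilation thickens white strokes (which is why the image is inverted first):

```python
import numpy as np

# Tiny stand-in for the thresholded crop AFTER inversion:
# white text (255) on a black background, as in `thr` above.
img = np.zeros((7, 7), dtype=np.uint8)
img[3, 2:5] = 255  # a 3-pixel horizontal stroke

def dilate(binary, k=3):
    """Minimal binary dilation with a k x k square kernel
    (NumPy-only stand-in for cv2.dilate)."""
    pad = k // 2
    padded = np.pad(binary, pad, mode="constant")
    out = np.zeros_like(binary)
    h, w = binary.shape
    for dy in range(-pad, pad + 1):
        for dx in range(-pad, pad + 1):
            out = np.maximum(out, padded[pad + dy:pad + dy + h,
                                         pad + dx:pad + dx + w])
    return out

thick = dilate(img)      # strokes grow, because dilation expands WHITE pixels
restored = 255 - thick   # invert back: black text on white, as Tesseract prefers
print((img > 0).sum(), (thick > 0).sum())
```

If the text were left black-on-white before dilating, the background would expand instead and the strokes would get thinner, which is the opposite of what the pipeline wants.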
I have tried different --psm values, with no better outcome. Changing -c tessedit_char_whitelist to just numbers gives the output: 4. This region is supposed to show other words as well, which is why the whitelist contains some other characters.
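One code-only option is to enforce the character set after OCR in Python rather than relying solely on the whitelist (note that Tesseract 4.0's LSTM engine ignored tessedit_char_whitelist; support returned in 4.1, so behavior depends on the installed version). A minimal sketch, assuming the same character set as the config above:

```python
# Same characters as the tessedit_char_whitelist in the config string.
ALLOWED = set("GSCTEPVNR#.0123456789 ")

def filter_ocr(raw: str) -> str:
    """Drop any character Tesseract emitted that is outside the expected set
    (stray symbols, form feeds, newlines)."""
    return "".join(ch for ch in raw if ch in ALLOWED)

print(filter_ocr("G#1 2.5$\x0c"))  # -> "G#1 2.5"
```

This does not improve recognition itself, but it guarantees the text written to the output file never contains unexpected characters regardless of Tesseract version.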
Since I am working on a Raspberry Pi, using EasyOCR in CPU-only mode takes way too long and is not viable.
Is there any way to fix this with just code or would I have to retrain the image recognition model?