I have this input image on which I am attempting to apply text detection and OCR. However, even after preprocessing (binary thresholding, etc.), pytesseract doesn't return any output. The purpose of text detection is to improve the OCR output; I'm not too concerned with obtaining bounding boxes.
Here is my code:
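import cv2
import pytesseract
from pytesseract import Output

image = cv2.imread('image.jpg')
grey = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
ret, thresh1 = cv2.threshold(grey, 127, 255, cv2.THRESH_BINARY)
# Invert before OCR: bitwise_not needs the image array, not the dict returned by image_to_data
inverted = cv2.bitwise_not(thresh1)
data = pytesseract.image_to_data(inverted, output_type=Output.DICT)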
Inspecting the results, I get either no output or nonsensical output. Is there any way to improve this?
Try this code: