Getting a box in my print return when using pytesseract.image_to_string


I am new to Python and built this program from pieces of what I've found, so bear with me if my code is messy. Basically, I want to read the text off a Pokemon card and add the data to a CSV file. I've seen a few examples of similar applications across the internet.

I have set the bounds of my ROI and have gotten pytesseract.image_to_string to return the name on the card. However, when I print the output, the name is on one line and then a box appears on the next line. My best guess is that it's reading the whitespace as an unknown character. Based on my research, Tesseract 4.0 doesn't have the option to use a whitelist or blacklist, which could be the problem? Again, I am a novice at best, so any help would be appreciated. I've spent two days playing with this code trying to get my outputs on one line. I am using Thonny on a Raspberry Pi 3B, in case that's relevant.

import cv2
from cv2 import dnn_superres
import numpy as np
import pytesseract
import imutils
from PIL import Image
import csv
import re

image = cv2.imread('/home/calanm92/Documents/Pictures/image1.jpg', 0)
#gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
#blur = cv2.GaussianBlur(gray, (3,3), 0)
#thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

#kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
#opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
#invert = 255 - opening

# cv2.resize takes dsize as (width, height), so this yields an image 900 wide and 700 high
width, height = 900, 700
imageResize = cv2.resize(image, (width, height), interpolation=cv2.INTER_AREA)


####### For reading name of card
cv2.rectangle(imageResize, (330,50), (480,85), (255,0,0),2)
name_roi = imageResize[51:84, 331:479]
name_out = pytesseract.image_to_string(name_roi, config='--psm 7        tessedit_char_whitelist="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz/. "')

####### For reading set number
cv2.rectangle(imageResize, (305,570), (350,590), (255,0,0),3)
set_roi = imageResize[570:590, 300:355]
set_out = pytesseract.image_to_string(set_roi, lang='eng', config='--psm 7    tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz/' '')


header = ['Name', 'Set Number', 'Price Low', 'Price High']
data = [name_out, set_out, '50', '100']

# open the file in write mode; newline='' is what the csv module docs recommend
with open('/home/calanm92/Documents/Pictures/Pokemon.csv', 'w', encoding='UTF8', newline='') as f:
    # create the csv writer
    writer = csv.writer(f)

    # write the header row and the data row to the csv file
    writer.writerow(header)
    writer.writerow(data)



print(name_out)
#print(set_out)
cv2.imshow('image', imageResize)
cv2.imshow("name_roi", name_roi)
cv2.imshow("set_roi", set_roi)
cv2.waitKey(0)
cv2.destroyAllWindows()
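
A side note on the config strings above: Tesseract variables such as tessedit_char_whitelist are normally set with a -c prefix, so without it the whitelist is probably not being applied at all. The sketch below builds a config string in that form. The helper name build_tesseract_config is hypothetical, it assumes the whitelist contains no spaces or quotes, and whether the whitelist is honoured depends on the installed Tesseract version (as far as I know, the LSTM engine ignored it in 4.0 and supports it again from 4.1).

```python
import string


def build_tesseract_config(psm=7, whitelist=None):
    """Return a config string for pytesseract (hypothetical helper)."""
    parts = [f"--psm {psm}"]
    if whitelist:
        # Variables are set with "-c name=value"; this assumes the
        # whitelist has no spaces or quotes that would need escaping.
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


# Digits, letters, and the two punctuation marks used on the card
NAME_CHARS = string.digits + string.ascii_uppercase + string.ascii_lowercase + "/."
name_config = build_tesseract_config(psm=7, whitelist=NAME_CHARS)
print(name_config)
```

The result would be passed as pytesseract.image_to_string(name_roi, config=name_config).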

My output ends up as:

Wailord [] 032/159 []

As a result, my csv file ends up as:

Name, Set Number, Price Low, Price High
"wailord FF", "032/159 FF", 50, 100
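
The "FF" in the CSV suggests the box is a form feed character (\x0c): Tesseract 4 and later emit one as a page separator at the end of the recognized text, usually right after a newline. A minimal sketch, assuming that is what is happening here; clean_ocr_text is a hypothetical helper, and the raw strings are stand-ins for what image_to_string probably returned:

```python
def clean_ocr_text(raw):
    """Remove leading/trailing whitespace, including a trailing form feed."""
    # str.strip() with no arguments removes all surrounding whitespace,
    # and Python treats both "\n" and "\x0c" (form feed) as whitespace.
    return raw.strip()


# Stand-in values, assumed to mirror the pytesseract output in the question
name_out = clean_ocr_text("Wailord\n\x0c")
set_out = clean_ocr_text("032/159\n\x0c")
print(repr(name_out), repr(set_out))
```

In the script this would wrap each call, for example name_out = clean_ocr_text(pytesseract.image_to_string(name_roi, config='--psm 7')). Another commonly suggested option is adding -c page_separator="" to the config so Tesseract does not emit the separator in the first place; that is untested on this setup.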
