I'm a Brazilian student, and I couldn't find anything about this on the Portuguese Stack Overflow. I'm new to Python and OpenCV, and I'm finding them hard to learn.
I'm trying to write an OCR program in Python that can identify multiple lines and words in video from a webcam.
I'm testing with static images first. I've already tried the code from the OpenCV tutorials, like this, but it either returns only one line or draws the lines over the words:

    # single line
    if lines is not None:
        for rho, theta in lines[0]:
            a = np.cos(theta)
            b = np.sin(theta)
            x0 = a * rho
            y0 = b * rho
            x1 = int(x0 + 800 * (-b))
            y1 = int(y0 + 800 * (a))
            x2 = int(x0 - 800 * (-b))
            y2 = int(y0 - 800 * (a))
            cv2.line(cap, (x1, y1), (x2, y2), (0, 0, 255), 2)
    cv2.imshow("windowName", cap)
    # -
    # demark text with multiple lines
    # -
    if True:  # HoughLinesP
        lines = cv.HoughLinesP(dst, 1, math.pi/180.0, 40, np.array([]), 30, 10)
        a, b, c = lines.shape
        for i in range(a):
            cv.line(cdst, (lines[i][0][0], lines[i][0][1]), (lines[i][0][2], lines[i][0][3]), (0, 0, 255), 3, cv.LINE_AA)
    else:  # HoughLines
        lines = cv.HoughLines(dst, 1, math.pi/180.0, 50, np.array([]), 0, 0)
        if lines is not None:
            a, b, c = lines.shape
            for i in range(a):
                rho = lines[i][0][0]
                theta = lines[i][0][1]
                a = math.cos(theta)
                b = math.sin(theta)
                x0, y0 = a*rho, b*rho
                pt1 = (int(x0+1000*(-b)), int(y0+1000*(a)))
                pt2 = (int(x0-1000*(-b)), int(y0-1000*(a)))
                cv.line(cdst, pt1, pt2, (0, 0, 255), 3, cv.LINE_AA)
    cv.imshow("detected lines", cdst)

The first snippet marks only one line. The second marks multiple lines, but they are drawn over the words.
![single line](https://i.stack.imgur.com/Sm5FP.png "single line") ![multiple lines](https://i.stack.imgur.com/IASaE.png "multiple lines")
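For reference, both `HoughLines` snippets above rely on the same conversion from a `(rho, theta)` pair to two drawable endpoints. The sketch below isolates that step in plain Python so it can be checked without an image. The helper name `polar_to_segment` and the `half_length` parameter are made up for illustration; they are not part of OpenCV. It may also help explain the second screenshot: the endpoints are simply pushed a fixed distance along the detected direction, so the drawn line does not stop at the edges of the text.

```python
import math


def polar_to_segment(rho, theta, half_length=1000):
    """Return two integer endpoints for a line given in polar form.

    rho is the distance from the image origin to the line and theta is the
    angle of the line's normal, as in the (rho, theta) pairs used above.
    half_length is an arbitrary drawing extent (hypothetical parameter),
    not a measured text width.
    """
    a, b = math.cos(theta), math.sin(theta)
    x0, y0 = a * rho, b * rho  # point on the line closest to the origin
    # Move half_length pixels in both directions along the line direction (-b, a).
    pt1 = (int(round(x0 - half_length * b)), int(round(y0 + half_length * a)))
    pt2 = (int(round(x0 + half_length * b)), int(round(y0 - half_length * a)))
    return pt1, pt2


# A horizontal line 50 px below the top edge (theta = 90 degrees).
print(polar_to_segment(50, math.pi / 2))  # ((-1000, 50), (1000, 50))
```

The two returned points could be passed straight to `cv2.line` in place of the inline arithmetic.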
I would like to detect multiple lines and find a way to recognize the words in each line, as in the example images below.
![multiple lines](https://i.stack.imgur.com/RVafY.png "multiple lines") ![my objective](https://i.stack.imgur.com/w0DG3.png "my objective")
Sorry for the long post, but I have no one here to help me, and I'm two steps away from giving up.
Extra info: contours code

    # Sort contours left to right by the x coordinate of their bounding box
    sorted_ctrs = sorted(ctrs, key=lambda ctr: cv2.boundingRect(ctr)[0])
    for i, ctr in enumerate(sorted_ctrs):
        x, y, w, h = cv2.boundingRect(ctr)
        roi = image[y:y + h, x:x + w]
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
        # Skip regions too small to contain readable text
        if w > 15 and h > 15:
            im = Image.fromarray(roi)
            text = pytesseract.image_to_string(im)
            print(text)
            voiceEngine.say(text)
            voiceEngine.runAndWait()
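The contours code above sorts boxes only by their x coordinate, so boxes from different text lines end up interleaved. One possible direction, shown here only as an untested-on-real-images sketch, is to group the `(x, y, w, h)` boxes into rows by vertical position first and then sort each row left to right. The function name `group_boxes_into_lines` and the `tolerance` value are assumptions made for illustration, and the approach assumes the text is roughly horizontal.

```python
def group_boxes_into_lines(boxes, tolerance=0.5):
    """Group (x, y, w, h) boxes into text lines, top to bottom.

    A box joins the current line when its vertical centre is within
    tolerance * (the taller of the two box heights) of the previous box's
    centre. tolerance is a hypothetical tuning value, not an OpenCV setting.
    """
    lines = []
    # Visit boxes from top to bottom by vertical centre.
    for box in sorted(boxes, key=lambda b: b[1] + b[3] / 2.0):
        x, y, w, h = box
        centre = y + h / 2.0
        if lines:
            _, last_y, _, last_h = lines[-1][-1]
            last_centre = last_y + last_h / 2.0
            if abs(centre - last_centre) <= tolerance * max(h, last_h):
                lines[-1].append(box)
                continue
        lines.append([box])  # start a new text line
    # Within each line, order the boxes left to right.
    return [sorted(line, key=lambda b: b[0]) for line in lines]


# Four made-up boxes forming two rows of two words each.
boxes = [(120, 12, 40, 20), (10, 10, 50, 22), (10, 60, 30, 20), (70, 58, 45, 24)]
for row in group_boxes_into_lines(boxes):
    print(row)
```

Each box in a row could then be cropped from the image and passed to `pytesseract.image_to_string`, as in the loop above, so the words come out in reading order.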