How to extract text using OpenCV and pytesseract in Python?

I am using labelImg to draw rectangles around the rows of an image, which gives me an XML file. How can I use this XML to extract the text from the table in the image? I previously tried horizontal and vertical line detection to extract the text, but the results were not good. labelImg now gives me the coordinates of the text I want to extract, but I do not know how to use them. How can I do that?

My XML file:

      <folder>Test Images</folder>
      <path>/home/sumit/Desktop/office_works/Fusion_Code/BIS_Final/Test Images/FreKa.jpg</path>
         <name>Contact Type</name>

My input images:

Input images

How can I extract the contract type from the table with the help of the XML file? Thanks.


To get xmin, you can use xpath() with '//annotation/object/bndbox/xmin' or the even shorter '//xmin'.

xpath() always returns a list (even if there is only one element, or none at all), so you need [0] to get the first element, or a for-loop to work with all elements.

Using if list_of_elements: ... you can run code only when the list has some elements.

You can also use len() to check how many elements you get.

text = '''
  <folder>Test Images</folder>
  <path>/home/sumit/Desktop/office_works/Fusion_Code/BIS_Final/Test Images/FreKa.jpg</path>
     <name>Contact Type</name>
'''

import lxml.etree

tree = lxml.etree.fromstring(text)

print('xmin:', tree.xpath("//annotation/object/bndbox/xmin")[0].text)
print('xmin:', tree.xpath("//bndbox/xmin")[0].text)
print('xmin:', tree.xpath("//object//xmin")[0].text)
print('xmin:', tree.xpath("//xmin")[0].text)

print('xmin:', tree.xpath("//xmin/text()")[0])  # with `text()` instead of `.text`

for item in tree.xpath("//xmin/text()"):
    print('xmin:', item)  # with `text()` instead of `.text`

objects = tree.xpath("//object")
print('len(objects):', len(objects))

other = tree.xpath("//bndbox/other")
if other:
    print('found', len(other), 'elements')
else:
    print('there are no "other" elements')
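
Once the coordinates can be read, the same idea extends to cropping each box and running OCR on it. The following is only a rough sketch, under several assumptions: the annotation follows the usual labelImg layout (`object/bndbox` with `xmin`, `ymin`, `xmax`, `ymax`), the sample XML and its coordinate values are invented for illustration, the helper names `read_boxes` and `ocr_boxes` are hypothetical, and the standard-library `xml.etree.ElementTree` is used in place of lxml so that the parsing part runs without extra packages. The OCR part additionally needs `opencv-python`, `pytesseract` and a Tesseract installation.

```python
import xml.etree.ElementTree as ET

# Hypothetical labelImg-style annotation; the box values are invented for illustration.
SAMPLE_XML = """
<annotation>
  <filename>FreKa.jpg</filename>
  <object>
    <name>Contact Type</name>
    <bndbox>
      <xmin>10</xmin>
      <ymin>20</ymin>
      <xmax>110</xmax>
      <ymax>60</ymax>
    </bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax), one tuple per <object>."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name", default="")
        bndbox = obj.find("bndbox")
        if bndbox is None:
            continue  # skip objects that have no bounding box
        coords = [int(float(bndbox.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((label, *coords))
    return boxes


def ocr_boxes(image_path, boxes):
    """Crop every box out of the image and OCR it; returns a list of (label, text)."""
    # Imported here so that read_boxes() works without OpenCV/Tesseract installed.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    results = []
    for label, xmin, ymin, xmax, ymax in boxes:
        crop = image[ymin:ymax, xmin:xmax]  # NumPy slicing is [rows (y), columns (x)]
        gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
        # "--psm 7" asks Tesseract to treat the crop as a single line of text
        text = pytesseract.image_to_string(gray, config="--psm 7").strip()
        results.append((label, text))
    return results


boxes = read_boxes(SAMPLE_XML)
print(boxes)
```

With a real annotation you would pass the content of the XML file saved by labelImg to `read_boxes()` and then call something like `ocr_boxes('FreKa.jpg', boxes)`. If the OCR result is poor, enlarging or thresholding the crop before `image_to_string()` may help, but that depends on the image.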