Improve OCR of image without scaling (using PIL, pixbuf)?

1.1k Views Asked by George J At 16 December 2015 at 19:54

I'm trying to run OCR on a screenshot. After the screenshot is taken (of the desktop region you clicked on), it goes into a pixbuf, whose contents are passed to pytesseract. But after going through the pixbuf the image quality is bad: it's skewed (I saved it to a directory to inspect it).

def takeScreenshot(self, x, y, width = 150, height = 30): 
    self.width=width 
    self.height=height 
    window = Gdk.get_default_root_window() 
    #x, y, width, height = window.get_geometry() 

    #print("The size of the root window is {} x {}".format(width, height)) 

    # get_from_drawable() was deprecated. See: 
    # https://developer.gnome.org/gtk3/stable/ch24s02.html#id-1.6.3.4.7 
    pixbufObj = Gdk.pixbuf_get_from_window(window, x, y, width, height) 
    height = pixbufObj.get_height() 
    width = pixbufObj.get_width() 
    image = Image.frombuffer("RGB", (width, height), 
                             pixbufObj.get_pixels(), 'raw', 'RGB', 0, 1) 
    image = image.resize((width*20,height*20), Image.ANTIALIAS) 
    #image.save("saved.png") 
    print(pytesseract.image_to_string(image)) 

    print("takenScreenshot:",x,y)

When I saved the image to a directory instead, the quality was fine and recognition was good.
I tried without Image.ANTIALIAS, and it makes no difference.

(Purpose of scaling by 20: when I tried code that recognized an image saved in a directory, recognition quality was bad without scaling.)

The bad picture

THE PROBLEM IS THAT THE IMAGE IS SKEWED.


There are 2 best solutions below

whunterknight On 16 December 2015 at 20:08

Such extreme scaling is generally bad for OCR, particularly in full color and with special processing (antialiasing).

I would:

- upscale less (or not at all), or use NEAREST
- convert to grayscale immediately after loading (to avoid the artifacts you're seeing):
```
image = image.convert('L')
```
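
As a rough sketch of the two suggestions above (not the asker's actual pipeline), the helper below converts to grayscale first and then upscales by a modest factor with nearest-neighbour resampling. The name `prepare_for_ocr` and the default factor of 2 are illustrative assumptions; the right factor depends on the font size in the screenshot.

```python
from PIL import Image


def prepare_for_ocr(image, scale=2):
    """Grayscale first, then a modest nearest-neighbour upscale.

    `prepare_for_ocr` and `scale=2` are hypothetical choices for illustration.
    """
    # Drop color before any resampling so no color fringes are introduced.
    gray = image.convert("L")
    if scale == 1:
        return gray
    new_size = (gray.width * scale, gray.height * scale)
    # NEAREST copies pixels without blending, so glyph edges stay sharp.
    return gray.resize(new_size, Image.NEAREST)


if __name__ == "__main__":
    # Stand-in for a 150x30 screenshot region.
    sample = Image.new("RGB", (150, 30), "white")
    prepared = prepare_for_ocr(sample)
    print(prepared.mode, prepared.size)
```

The result can then be passed to `pytesseract.image_to_string` in place of the 20x antialiased image.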

Shubham Vasaikar On 20 February 2017 at 06:20

I don't know if you're still looking for a solution, but I ran into the same problem of the image being skewed. This is some kind of padding issue with GdkPixbuf. Basically, the height and width of the image should always be divisible by 8. So this is what I do before taking the screenshot:

width = width + (8 - (width % 8))
height = height + (8 - (height % 8))

The screenshot should work after doing this.

You can read more about the issue here
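
An alternative sketch, in case rounding the region size is not convenient. It rests on an assumption the answer above does not spell out: that the skew comes from GdkPixbuf padding each row to its rowstride (for 150 RGB pixels that is 450 data bytes, which is not a multiple of 4), while the `Image.frombuffer` call in the question treats the rows as tightly packed. The helper `strip_row_padding` is a hypothetical name; it drops the per-row padding in plain Python so the bytes can be handed to PIL as-is.

```python
def strip_row_padding(data, width, height, rowstride, n_channels=3):
    """Return tightly packed pixel bytes from a row-padded buffer.

    Hypothetical helper: `rowstride` is the number of bytes from the start
    of one row to the start of the next, as reported by the pixbuf.
    """
    row_bytes = width * n_channels
    if rowstride < row_bytes:
        raise ValueError("rowstride is smaller than one row of pixels")
    # Take only the pixel bytes at the start of each row; skip the padding.
    return b"".join(
        data[y * rowstride : y * rowstride + row_bytes] for y in range(height)
    )


if __name__ == "__main__":
    # Simulated 3x2 RGB buffer whose 9-byte rows are padded to 12 bytes.
    row0 = bytes(range(1, 10)) + b"\x00\x00\x00"
    row1 = bytes(range(11, 20)) + b"\x00\x00\x00"
    packed = strip_row_padding(row0 + row1, width=3, height=2, rowstride=12)
    print(len(packed))

    # With a real pixbuf this would look roughly like (untested here):
    #   packed = strip_row_padding(pixbufObj.get_pixels(),
    #                              pixbufObj.get_width(),
    #                              pixbufObj.get_height(),
    #                              pixbufObj.get_rowstride(),
    #                              pixbufObj.get_n_channels())
    #   mode = "RGBA" if pixbufObj.get_has_alpha() else "RGB"
    #   image = Image.frombytes(mode, (width, height), packed)
```

If that assumption holds, passing `pixbufObj.get_rowstride()` as the stride argument of `Image.frombuffer` (in place of the `0` in the question) should have the same effect without the copy.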
