Improve OCR of image without scaling (using PIL, pixbuf)?


I'm trying to run OCR on a screenshot. After the screenshot is taken (of the region of the desktop where you clicked), it goes into a pixbuf, whose contents are passed to pytesseract. But after going through the pixbuf the image quality is bad: it is skewed (I checked by saving it to a directory and looking at it).

def takeScreenshot(self, x, y, width = 150, height = 30): 
    self.width=width 
    self.height=height 
    window = Gdk.get_default_root_window() 
    #x, y, width, height = window.get_geometry() 

    #print("The size of the root window is {} x {}".format(width, height)) 

    # get_from_drawable() was deprecated. See: 
    # https://developer.gnome.org/gtk3/stable/ch24s02.html#id-1.6.3.4.7 
    pixbufObj = Gdk.pixbuf_get_from_window(window, x, y, width, height) 
    height = pixbufObj.get_height() 
    width = pixbufObj.get_width() 
    image = Image.frombuffer("RGB", (width, height), 
                             pixbufObj.get_pixels(), 'raw', 'RGB', 0, 1) 
    image = image.resize((width*20,height*20), Image.ANTIALIAS) 
    #image.save("saved.png") 
    print(pytesseract.image_to_string(image)) 

    print("takenScreenshot:",x,y) 

When I saved the image to a directory, the quality was fine and recognition was good.
I tried without Image.ANTIALIAS; it makes no difference.

(The reason for scaling by 20: when I tried code that recognized an image saved in a directory, recognition quality was bad without scaling.)

The bad picture

The problem is that the image is skewed.

There are 2 best solutions below

4
whunterknight On

Such extreme scaling is generally bad for OCR, particularly in full color and with special processing (antialiasing).

I would:

  • upscale less (none?), or use NEAREST
  • convert to grayscale immediately after loading (to avoid the artifacts you're seeing):

    image = image.convert('L')
    
0
Shubham Vasaikar On

I don't know if you're still looking for a solution, but I ran into the same problem of the image being skewed. This is some kind of padding issue with GdkPixbuf. Basically, the height and width of the image should always be divisible by 8. So this is what I do before taking the screenshot:

# Round up to the next multiple of 8 (unchanged if already a multiple of 8)
width = width + (-width % 8)
height = height + (-height % 8)

The screenshot should work after doing this.

You can read more about the issue here
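
The padding most likely comes from the pixbuf's row stride. As far as I know, GdkPixbuf pads each row of pixels to a multiple of 4 bytes, so a 150-pixel-wide RGB row of 450 bytes is stored as 452 bytes. Reading the buffer as if the rows were tightly packed shifts every line a little further than the previous one, which shows up as skew. Rounding the width up to a multiple of 8 happens to make that padding zero. Below is a minimal pure-Python sketch of the alternative, which keeps the requested size and drops the padding instead. `padded_rowstride` and `strip_row_padding` are hypothetical helper names, and the 4-byte alignment is an assumption; in real code the stride should be read from `pixbufObj.get_rowstride()` rather than computed.

```python
def padded_rowstride(width, channels=3, align=4):
    """Bytes per row when each row is padded up to a multiple of `align`.

    The 4-byte alignment is an assumption about GdkPixbuf; prefer the
    value reported by get_rowstride() when a real pixbuf is available.
    """
    row_bytes = width * channels
    return row_bytes + (-row_bytes % align)


def strip_row_padding(pixels, width, height, rowstride, channels=3):
    """Return tightly packed pixel bytes with the per-row padding removed."""
    row_bytes = width * channels
    packed = bytearray()
    for row in range(height):
        start = row * rowstride
        # Copy only the real pixel bytes of this row; skip the padding.
        packed += pixels[start:start + row_bytes]
    return bytes(packed)


# A 150-pixel-wide RGB row needs 450 bytes but would occupy 452.
stride_150 = padded_rowstride(150)
# A 152-pixel-wide row (a multiple of 8) needs no padding at all.
stride_152 = padded_rowstride(152)
```

The packed bytes can then be passed to `Image.frombytes("RGB", (width, height), packed)`. Alternatively, Pillow's raw decoder accepts the stride directly, so passing `pixbufObj.get_rowstride()` in place of the `0` in the `Image.frombuffer(...)` call from the question should have the same effect. Both variants assume the pixbuf has no alpha channel.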