Trying Tesseract on Windows CMD

I'm having trouble using Tesseract-OCR with the pytesseract Python wrapper. I suspected that the problem might come from Tesseract itself rather than from the wrapper, so I tried Tesseract directly in CMD:

C:\Users\Thomas\Desktop>tesseract.exe 'blabla.jpg' 'out.txt'

It returned the following lines:

Tesseract Open Source OCR Engine v3.05.01 with Leptonica
Error in fopenReadStream: file not found
Error in findFileFormat: image file not found
Error during processing.

I've done the following to install Tesseract :

By the way, the problem I'm having when running my Python code:

from PIL import Image
import pytesseract
text = pytesseract.image_to_string(Image.open('blabla.jpg'))
print(text)

is:

Traceback (most recent call last):

  File "<ipython-input-1-01e77f902509>", line 1, in <module>
runfile('D:/anaconda/projects/OCR/ocr.py', wdir='D:/anaconda/projects/OCR')

  File "D:\anaconda\lib\site-packages\spyder\utils\site\sitecustomize.py", line 880, in runfile
execfile(filename, namespace)

  File "D:\anaconda\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

  File "D:/anaconda/projects/OCR/ocr.py", line 48, in <module>
text = pytesseract.image_to_string(a)

  File "D:\anaconda\lib\site-packages\pytesseract\pytesseract.py", line 122, in image_to_string
config=config)

  File "D:\anaconda\lib\site-packages\pytesseract\pytesseract.py", line 46, in run_tesseract
proc = subprocess.Popen(command, stderr=subprocess.PIPE)

  File "D:\anaconda\lib\subprocess.py", line 707, in __init__
restore_signals, start_new_session)

  File "D:\anaconda\lib\subprocess.py", line 990, in _execute_child
startupinfo)

PermissionError: [WinError 5] Access refused

Running the code as Administrator doesn't solve the problem.

Thanks a lot for your help!

There is 1 best solution below

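Firstly, to verify whether Tesseract works from the Windows command prompt, use double quotes " " instead of single quotes ' ' when the image or output file name contains spaces. CMD does not treat single quotes as quoting characters, so they are passed to Tesseract as part of the file name, which is why the file is reported as not found. If the names contain no spaces, no quotes are needed at all: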

C:\Users\Thomas\Desktop>tesseract.exe blabla.jpg out.txt
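
To see which quoting Windows expects for a name with spaces, one option is to let Python build the command line. This is only a rough sketch: the file name `my scan.jpg` is made up for illustration, and `subprocess.list2cmdline` is an internal helper of the standard library rather than documented public API, so use it as a debugging aid only.

```python
import subprocess

# Hypothetical file name containing a space, to show why quoting matters.
# 'out' is the output base name; Tesseract appends .txt to it.
args = ['tesseract.exe', 'my scan.jpg', 'out']

# list2cmdline is the helper subprocess uses on Windows to turn an
# argument list into a single command-line string. Arguments containing
# spaces are wrapped in double quotes, never single quotes.
command_line = subprocess.list2cmdline(args)
print(command_line)
```

The printed line can be pasted into CMD as is. When calling Tesseract from Python, passing the arguments as a list avoids manual quoting altogether.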

Secondly, use the full file path to specify the image file, and point pytesseract at the Tesseract executable, for example:

from PIL import Image
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:/path/to/tesseract.exe'
text = pytesseract.image_to_string(Image.open('D:/path/to/blabla.jpg'))

Note that a forward slash / is used in the file paths instead of a backslash \, because a single backslash starts an escape sequence in a Python string. Alternatively, use a double backslash \\, e.g. 'D:\\path\\to\\blabla.jpg'.
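
As a small sketch of the point about slashes, the snippet below compares the different spellings of the placeholder path from this answer and adds a hypothetical helper, `missing_files` (not part of pytesseract), for checking paths before running OCR. It assumes only the standard library.

```python
from pathlib import Path, PureWindowsPath

# Three spellings of the same Windows path (the placeholder path used
# in this answer, not a real location).
forward = 'D:/path/to/blabla.jpg'
doubled = 'D:\\path\\to\\blabla.jpg'
raw = r'D:\path\to\blabla.jpg'  # raw string: backslashes are kept literally

# PureWindowsPath applies Windows path rules on any operating system,
# so this comparison can be checked without a Windows machine.
same = (PureWindowsPath(forward) == PureWindowsPath(doubled)
        == PureWindowsPath(raw))


def missing_files(*paths):
    """Return the given paths that do not point to an existing file.

    Hypothetical helper for illustration: a directory or a mistyped
    path is reported, since neither is a regular file.
    """
    return [p for p in paths if not Path(p).is_file()]
```

Calling `missing_files` with the value assigned to `tesseract_cmd` and the image path before `image_to_string` should return an empty list. If the executable path shows up in the result, for instance because it points to the installation folder rather than to `tesseract.exe`, that may be worth fixing first, since launching a folder as a program can fail with an access-denied error on Windows.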

Hope this helps.