I am building an OCR website on PythonAnywhere. Users can upload images of text and download the result in an editable format. It works perfectly for English, but when I try to add other languages (South Indian languages) I get an error.
I put the additional traineddata in the folder "/home/wiltomalayalamocr/mysite/langfiles", which contains the file "mal.traineddata".
In my code I have:
import pytesseract
from PIL import Image
pytesseract.pytesseract.tesseract_cmd = r"/usr/bin/tesseract"
custom_oem_psm_config = '-l {} --psm {} --tessdata-dir "/home/wiltomalayalamocr/mysite/langfiles"'.format(lang, 6)
text = pytesseract.image_to_string(Image.open(filename), config=custom_oem_psm_config)
Here lang="mal", but I get this error:
pytesseract.pytesseract.TesseractError: (1, 'Tesseract Open Source OCR Engine v3.04.01 with Leptonica Error opening data file /usr/share/tesseract-ocr/tessdata/mal.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed loading language \'mal\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')
I am using the Flask framework with Python.
Can anybody help me?
After two days of searching and trying, I finally found the solution.
Setting the environment variable in a Bash console is not enough, because it has no effect on the web app. The variable has to be set when the app itself is loaded, so I did what the link below describes:
https://help.pythonanywhere.com/pages/environment-variables-for-web-apps/
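To see the difference between the console and the web app, it can help to check what the running process actually sees. The following is only a minimal sketch: the helper name `tessdata_status` is hypothetical and not part of pytesseract or PythonAnywhere, and it only looks at the environment and the file system, it does not call Tesseract.

```python
import os

def tessdata_status(lang, data_dir, environ=None):
    """Report what the current process sees for a Tesseract language file.

    `tessdata_status` is a hypothetical helper for illustration only.
    """
    # Use the real process environment unless a mapping is passed in.
    environ = os.environ if environ is None else environ
    traineddata = os.path.join(data_dir, lang + ".traineddata")
    return {
        # None here means the variable is not set for this process.
        "TESSDATA_PREFIX": environ.get("TESSDATA_PREFIX"),
        "traineddata_path": traineddata,
        "traineddata_exists": os.path.isfile(traineddata),
    }
```

Calling `tessdata_status("mal", "/home/wiltomalayalamocr/mysite/langfiles")` from inside a Flask view and again from a console should show whether `TESSDATA_PREFIX` is visible in each place.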
My project directory is /home/wiltomalayalamocr/mysite, and my .env file there contains:
export TESSDATA_PREFIX=/home/wiltomalayalamocr/mysite/langfiles
And in the WSGI configuration file I added the following line of code
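The original lines are not reproduced here. As far as I know, the linked help page does this with the python-dotenv package; the sketch below is a standard-library stand-in that shows the same idea, not the exact code from the post. The function name `load_env_file` is hypothetical, and the path `~/mysite` is assumed from the project directory mentioned above.

```python
import os

def load_env_file(path, environ=None):
    """Copy KEY=value pairs from a .env-style file into the environment.

    Hypothetical stand-in for a dotenv loader; handles only simple
    `KEY=value` and `export KEY=value` lines.
    """
    environ = os.environ if environ is None else environ
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blank lines and comments.
            if not line or line.startswith("#"):
                continue
            # Allow the shell-style "export " prefix used in the .env file.
            if line.startswith("export "):
                line = line[len("export "):].lstrip()
            key, sep, value = line.partition("=")
            if not sep:
                continue  # not a KEY=value line
            environ[key.strip()] = value.strip().strip("'\"")

# In the WSGI file this has to run before the Flask app is imported,
# so the variable is set by the time pytesseract launches Tesseract.
project_folder = os.path.expanduser("~/mysite")  # assumed project directory
env_path = os.path.join(project_folder, ".env")
if os.path.isfile(env_path):
    load_env_file(env_path)
```

After changing the WSGI file, the web app has to be reloaded from the PythonAnywhere Web tab for the new environment to take effect.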