Getting the following error when generating a language model scorer in DeepSpeech


File "generate_scorer_package", line 1 SyntaxError: Non-UTF-8 code starting with '\xea' in file generate_scorer_package on line 2, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details


Before answering this question, I am going to make some assumptions:

  • Firstly, I believe you are following the DeepSpeech Playbook and are at the step of generating a kenlm.scorer file, as documented here

  • Secondly, I am going to assume that you are using a Python editor of some description, like PyCharm.

The error SyntaxError: Non-UTF-8 code starting with '\xea' in file generate_scorer_package on line 2, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details is not related to DeepSpeech itself; it is related to the encoding of the file that Python is executing.

Python 3 assumes that a source file is encoded as UTF-8 unless the file declares otherwise; however, some editors - particularly editors configured for other locales - save files in a different encoding by default.
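To confirm that the file really contains bytes that are not valid UTF-8, a small check like the one below can help. This is only a sketch: the function name find_non_utf8 is made up for illustration, and the path you pass in is whatever file Python complained about.

```python
from pathlib import Path
from typing import Optional, Tuple


def find_non_utf8(path: str) -> Optional[Tuple[int, int]]:
    """Return (byte offset, byte value) of the first byte that breaks
    UTF-8 decoding, or None if the whole file decodes cleanly.

    Hypothetical helper for illustration; not part of DeepSpeech.
    """
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first offending byte
        return err.start, data[err.start]
    return None
```

Calling find_non_utf8("generate_scorer_package") returns None for a clean UTF-8 file, or the position and value of the first bad byte (for example 0xea) otherwise. If the file turns out to be full of non-text bytes, it may not be Python source at all, in which case an encoding declaration will not help.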

To declare the encoding explicitly, add the following comment to the top of the generate_scorer_package.py file:

# coding: utf8

NOTE: It MUST be on the first or second line of the file, otherwise Python ignores it.

Alternatively, identify where in your editor the file encoding is set, change it to UTF-8, and save the file again.
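If you would rather not hunt through editor settings, the file can also be re-saved as UTF-8 from Python. The sketch below assumes you know (or can guess) the encoding the file was originally saved in; the function name resave_as_utf8 and the source_encoding argument are hypothetical, and a wrong guess either raises UnicodeDecodeError or silently produces the wrong characters, so keep a backup first.

```python
from pathlib import Path


def resave_as_utf8(path: str, source_encoding: str) -> None:
    """Rewrite the file at `path` as UTF-8.

    `source_encoding` is an assumption supplied by the caller: the
    encoding the editor originally used when saving the file.
    """
    raw = Path(path).read_bytes()
    # Raises UnicodeDecodeError if the bytes are invalid in that encoding
    text = raw.decode(source_encoding)
    # Write bytes directly so line endings are left untouched
    Path(path).write_bytes(text.encode("utf-8"))
```

For example, resave_as_utf8("generate_scorer_package.py", "latin-1") would convert a file that was saved as Latin-1; substitute the encoding your editor actually used.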

See also these Stack Overflow questions that are similar: