I want to run a simple sentiment analysis on Hebrew text using Polyglot in Python 3.6. The problem is that Polyglot detects the text's language code as "iw" rather than "he", and is therefore unable to process it.
As suggested in "use polyglot package for Named Entity Recognition in hebrew", I've already added hint_language_code = 'he' to the Text constructor call, but this only changes the language code of the top-level Text object, not of the objects derived from it (such as its sentences or words).
For example:
Input:
import polyglot
from polyglot.text import Text, Word
article='איך ניתן לנתח טקסט בעברית? והאם ניתן לשנות את הקידוד?'
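# Without a hint, the detected language code is "iw".
txt = Text(article)
print(txt.language.code)
# With the hint, the top-level Text object reports "he"...
txt = Text(article, hint_language_code='he')
print(txt.language.code)
# ...but the second sentence, taken from that same object, still reports "iw".
sent = txt.sentences[1]
print(sent.language.code)
print(sent)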
Output:
iw
he
iw
והאם ניתן לשנות את הקידוד?
How can I permanently change the text's language_code from 'iw' to 'he'?
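For context, "iw" is the older, deprecated ISO 639 code for Hebrew, and "he" is its current replacement. What I am effectively after is that mapping being applied wherever the detected code is read. Below is a minimal sketch of that idea in plain Python. It does not touch Polyglot at all, and the names LEGACY_LANGUAGE_CODES and normalize_language_code are hypothetical helpers of my own, not part of Polyglot's API:

```python
# Hypothetical helper, NOT part of Polyglot: maps deprecated ISO 639 codes
# to their current replacements.
LEGACY_LANGUAGE_CODES = {
    "iw": "he",  # Hebrew
    "in": "id",  # Indonesian
    "ji": "yi",  # Yiddish
}


def normalize_language_code(code):
    """Return the current code for a deprecated one; pass others through unchanged."""
    return LEGACY_LANGUAGE_CODES.get(code, code)


if __name__ == "__main__":
    print(normalize_language_code("iw"))  # deprecated code is mapped
    print(normalize_language_code("he"))  # current code is left as-is
```

Something equivalent would need to apply to the sentences and words as well, not just to the top-level Text object.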