I am using the chromium-compact-language-detector to detect language but it is unable to detect Japanese in the string.
text = '''1/15 HR Div.Q&CS Dept. 全体MTG 開催
1月15日(水)、赤溜オーディトリアムにてHR Div.Q&CS Dept.の全体MTGが開催されました。 '''
cld.detect(smart_str(text), pickSummaryLanguage=True, removeWeakMatches=False)
output: ('ENGLISH', 'en', True, 11, [('ENGLISH', 'en', 100, 0.8103727714748784)])
Suggestions are appreciated.
You may need to encode that Japanese string as UTF-8 before passing it to cld.detect.
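The snippet that originally accompanied this answer is not available, so the following is only a minimal sketch of the encoding step. The helper name to_utf8_bytes is hypothetical, and the cld.detect call is left as a comment because it simply mirrors the call from the question.

```python
# -*- coding: utf-8 -*-

# The text from the question, kept as a Unicode string.
text = (
    u"1/15 HR Div.Q&CS Dept. 全体MTG 開催\n"
    u"1月15日(水)、赤溜オーディトリアムにてHR Div.Q&CS Dept.の全体MTGが開催されました。 "
)


def to_utf8_bytes(value):
    """Return value as UTF-8 encoded bytes (hypothetical helper).

    Byte strings are returned unchanged; text is encoded as UTF-8.
    """
    if isinstance(value, bytes):
        return value
    return value.encode("utf-8")


utf8_text = to_utf8_bytes(text)

# Assumed usage, mirroring the call in the question:
# cld.detect(utf8_text, pickSummaryLanguage=True, removeWeakMatches=False)
```

If the detector still reports English after this step, the encoding is probably not the cause.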
I think cld can't detect Japanese. A newer version of it is available, called cld2. Check here: https://code.google.com/p/cld2/wiki/CLD2FullVersion
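As a rough sketch of what trying cld2 from Python could look like, the example below assumes the third-party pycld2 binding is installed. That package is not mentioned in the question, and the wrapper detect_with_cld2 is a hypothetical helper that returns None when the package is missing.

```python
# -*- coding: utf-8 -*-


def detect_with_cld2(text):
    """Run CLD2 on text via pycld2, or return None if pycld2 is missing.

    Hypothetical helper; pycld2 is an assumed dependency.
    """
    try:
        import pycld2 as cld2
    except ImportError:
        return None
    # pycld2.detect returns (isReliable, textBytesFound, details), where
    # details holds (languageName, languageCode, percent, score) tuples.
    is_reliable, _bytes_found, details = cld2.detect(text)
    return is_reliable, details


result = detect_with_cld2(
    u"1月15日(水)、赤溜オーディトリアムにてHR Div.Q&CS Dept.の全体MTGが開催されました。"
)
if result is not None:
    is_reliable, details = result
    print(is_reliable, details)
```

Whether cld2 handles the mixed English and Japanese text from the question better than cld is something to check by running it.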