chromium-compact-language-detector Django

471 Views Asked by user1839132 At 07 June 2025 at 03:54

I am using the chromium-compact-language-detector (cld) to detect the language of a string, but it fails to detect Japanese in the string below.

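import cld
from django.utils.encoding import smart_str

# Triple quotes are needed because the string spans two lines.
text = '''1/15 HR Div.Q&CS Dept. 全体MTG 開催
1月15日(水)、赤溜オーディトリアムにてHR Div.Q&CS Dept.の全体MTGが開催されました。 '''

cld.detect(smart_str(text), pickSummaryLanguage=True, removeWeakMatches=False)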

output: ('ENGLISH', 'en', True, 11, [('ENGLISH', 'en', 100, 0.8103727714748784)])

Suggestions are appreciated.

There is 1 best solution below

Priyank Patel On 24 January 2014 at 08:18

You may need to encode the Japanese string as UTF-8 first. For example, try this:

import codecs
import cld
cld.detect(codecs.getencoder('UTF-8')(u'1/15 HR Div.Q&CS Dept. 全体MTG 開催1月15日(水)、赤溜オーディトリアムにてHR Div.Q&CS Dept.の全体MTGが開催されました。 ')[0])

If that does not help, I think cld may not be able to detect Japanese here. A newer version, cld2, is available; see: https://code.google.com/p/cld2/wiki/CLD2FullVersion
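
As a side note on the encoding step, here is a minimal sketch (Python 3, standard library only, so it does not import or call cld). It shows that `codecs.getencoder('UTF-8')(s)[0]` yields the same bytes as `s.encode('utf-8')`, and it roughly measures how much of the sample is Japanese script. The helpers `is_japanese_char` and `japanese_ratio` are hypothetical names made up for illustration, not part of cld or cld2; they only count Hiragana, Katakana, and CJK ideograph code points. A sample that mixes a lot of Latin text with Japanese might be one reason a detector leans toward English, though that is an assumption rather than something the thread confirms.

```python
# -*- coding: utf-8 -*-
import codecs

text = (u'1/15 HR Div.Q&CS Dept. 全体MTG 開催'
        u'1月15日(水)、赤溜オーディトリアムにてHR Div.Q&CS Dept.の全体MTGが開催されました。 ')

# Both forms produce the same UTF-8 byte string.
encoded = codecs.getencoder('UTF-8')(text)[0]
assert encoded == text.encode('utf-8')


def is_japanese_char(ch):
    """Hypothetical helper: True for Hiragana, Katakana, or CJK ideographs."""
    cp = ord(ch)
    return (0x3040 <= cp <= 0x309F      # Hiragana
            or 0x30A0 <= cp <= 0x30FF   # Katakana
            or 0x4E00 <= cp <= 0x9FFF)  # CJK Unified Ideographs (shared with Chinese)


def japanese_ratio(s):
    """Hypothetical helper: share of non-whitespace characters in Japanese script."""
    chars = [ch for ch in s if not ch.isspace()]
    if not chars:
        return 0.0
    return sum(is_japanese_char(ch) for ch in chars) / float(len(chars))


# Rough sanity check only; this is not a language detector.
print(round(japanese_ratio(text), 2))
```

If the printed ratio is low, splitting the text into sentences and running detection on each part separately may be worth trying.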
