Elasticsearch/Kuromoji: How to Use Kuromoji with Unidic


Elasticsearch 1.7

We would like to test Kuromoji with Unidic on Elasticsearch. Compiling Kuromoji gives me several jars, each built with a different dictionary.

Is there a simple way to replace the ipadic-based-kuromoji with the unidic-based-kuromoji?

Thanks.


There are 2 best solutions below

Answered by johtani

@tokosh

I don't think there is a simple way to replace it.

See https://issues.apache.org/jira/browse/LUCENE-4056: Lucene's Kuromoji still has an open issue about this.

Answered by tokosh

I ended up using the cmecab-java project as a guide to implement an Elasticsearch wrapper around unidic-kuromoji (from here). Older commits of the cmecab-java project contain Lucene plugin wrappers, which need to be adapted into an Elasticsearch plugin.
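
To give a rough idea of what such a wrapper has to do, here is a minimal, dependency-free sketch. It is not the code I actually used: `SurfaceTokenizer` is a hypothetical stand-in for the unidic-based Kuromoji tokenizer, and `OffsetToken` is a hypothetical stand-in for the term and offset information that a Lucene token stream exposes. The real classes and method names in Kuromoji, Lucene, and Elasticsearch differ, so treat this only as an outline of the adapter idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnidicWrapperSketch {

    /** Hypothetical stand-in for the unidic-based tokenizer: returns surface forms in order. */
    interface SurfaceTokenizer {
        List<String> tokenize(String text);
    }

    /** Hypothetical token carrying the term plus its character offsets in the input. */
    static final class OffsetToken {
        final String term;
        final int start; // inclusive
        final int end;   // exclusive

        OffsetToken(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /** Adapts the tokenizer output into tokens with offsets, as an analysis wrapper would. */
    static List<OffsetToken> withOffsets(SurfaceTokenizer tokenizer, String text) {
        List<OffsetToken> tokens = new ArrayList<>();
        int cursor = 0;
        for (String surface : tokenizer.tokenize(text)) {
            // Search from the end of the previous token so repeated surfaces map correctly.
            int start = text.indexOf(surface, cursor);
            if (start < 0) {
                continue; // surface not found in the input; skip it in this sketch
            }
            int end = start + surface.length();
            tokens.add(new OffsetToken(surface, start, end));
            cursor = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Fake tokenizer with a fixed segmentation, used only for illustration.
        SurfaceTokenizer fake = text -> Arrays.asList("東京", "に", "行く");
        for (OffsetToken t : withOffsets(fake, "東京に行く")) {
            System.out.println(t.term + " [" + t.start + "," + t.end + ")");
        }
    }
}
```

In a real plugin, the adapter logic would live inside a Lucene tokenizer subclass, and that class would then be registered with Elasticsearch through the plugin's analysis bindings. The cmecab-java wrappers are a useful reference for that part.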