How do I use the lucene-gosen analyzer with Lucene.Net?


Please guide me on how to use the Japanese analyzer (lucene-gosen) with Lucene.Net. Also, please suggest a good analyzer for Lucene.Net that supports Japanese.

Best answer

The lucene-gosen analyzer does not appear to have been ported to Lucene.Net. You can file a request on the Lucene.Net GitHub page, or you could help out by porting it yourself and submitting a pull request.
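If you do attempt a port, the new class would plug into Lucene.Net by subclassing Analyzer. The following is only a rough sketch of that shape, assuming the Lucene.Net 2.9/3.0 API, where Analyzer exposes an abstract TokenStream(fieldName, reader) method. The class name JapaneseAnalyzer and its Version constructor are hypothetical. The WhitespaceTokenizer is a placeholder that will not segment Japanese text correctly; a real port would replace it with a morphological tokenizer built from the lucene-gosen sources.

```csharp
using System.IO;
using Lucene.Net.Analysis;

// Hypothetical skeleton of a ported analyzer; not part of Lucene.Net.
public class JapaneseAnalyzer : Analyzer
{
    private readonly Lucene.Net.Util.Version matchVersion;

    public JapaneseAnalyzer(Lucene.Net.Util.Version matchVersion)
    {
        // Kept so a real port could switch behavior by version.
        this.matchVersion = matchVersion;
    }

    public override TokenStream TokenStream(string fieldName, TextReader reader)
    {
        // PLACEHOLDER: Japanese has no spaces between words, so a real
        // port needs a morphological tokenizer here instead.
        TokenStream stream = new WhitespaceTokenizer(reader);

        // Normalize any Latin-script tokens to lower case.
        stream = new LowerCaseFilter(stream);
        return stream;
    }
}
```

A class of this shape can be passed anywhere Lucene.Net expects an Analyzer, so indexing code only needs to change at the point where the analyzer is constructed.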

Once that analyzer exists, you can start from the basic code in the article here and just change the analyzer:

// Needed for the unqualified Analyzer and IndexWriter types (place at the top of the file)
using Lucene.Net.Analysis;
using Lucene.Net.Index;

string strIndexDir = @"D:\Index";
Lucene.Net.Store.Directory indexDir = Lucene.Net.Store.FSDirectory.Open(new System.IO.DirectoryInfo(strIndexDir));

// JapaneseAnalyzer stands in for the analyzer that has not been ported yet.
// The Version parameter is used for backward compatibility.
// Stop words can also be passed to avoid indexing certain words.
Analyzer std = new JapaneseAnalyzer(Lucene.Net.Util.Version.LUCENE_29);

// Create an index writer object ("true" creates a new index, replacing any existing one)
IndexWriter idxw = new IndexWriter(indexDir, std, true, IndexWriter.MaxFieldLength.UNLIMITED);

// Build a document with a single stored, analyzed field holding the file's text
Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();
Lucene.Net.Documents.Field fldText = new Lucene.Net.Documents.Field("text", System.IO.File.ReadAllText(@"d:\test.txt"), Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.ANALYZED, Lucene.Net.Documents.Field.TermVector.YES);
doc.Add(fldText);

// Write the document to the index
idxw.AddDocument(doc);

// Optimize and close the writer
idxw.Optimize();
idxw.Close();

// Report completion (Response is available when this runs inside an ASP.NET page)
Response.Write("Indexing Done");