Lucene library for purifying a text (plurals, verbs...)


I would like some help using Lucene in my Java app to simplify a text.

I have already done part of it myself, but I don't have a solution for verbs and plurals.

How should I proceed?


If I understand your question correctly, you want to detect nouns and verbs in a text. AFAIK Lucene on its own does not have the capability to detect this. You can instead look at the OpenNLP library, which is a

machine learning based toolkit for the processing of natural language text

So it relies on concepts like training models and then using them for prediction. It has a POSTagger API (part-of-speech tagger); you can take a look at its usage here in the docs and at some detailed examples here, here and here.
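Once a tagger has produced one tag per token, picking out the verbs and plurals is simple filtering. Below is a minimal sketch of that step, assuming the tagger emits Penn Treebank tags (NN, NNS, VBD, ...); newer models may use a different tag set, so check what your model returns. The `PosTagFilter` class and its methods are hypothetical helpers, not part of OpenNLP or Lucene, and the tags in `main` are written by hand. With OpenNLP they would typically come from `POSTaggerME.tag(tokens)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper for illustration; not part of OpenNLP or Lucene.
// Assumes Penn Treebank style tags, one tag per token.
class PosTagFilter {

    // NNS (common) and NNPS (proper) mark plural nouns.
    static boolean isPluralNoun(String tag) {
        return tag.equals("NNS") || tag.equals("NNPS");
    }

    // VB, VBD, VBG, VBN, VBP and VBZ are all verb forms.
    static boolean isVerb(String tag) {
        return tag.startsWith("VB");
    }

    // Returns the tokens whose tag (at the same index) satisfies the predicate.
    static List<String> select(String[] tokens, String[] tags, Predicate<String> wanted) {
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException("tokens and tags must have the same length");
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (wanted.test(tags[i])) {
                result.add(tokens[i]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-written tags standing in for real tagger output.
        String[] tokens = {"The", "cats", "were", "chasing", "mice"};
        String[] tags   = {"DT",  "NNS",  "VBD",  "VBG",     "NNS"};

        System.out.println(select(tokens, tags, PosTagFilter::isVerb));       // [were, chasing]
        System.out.println(select(tokens, tags, PosTagFilter::isPluralNoun)); // [cats, mice]
    }
}
```

Note that this only tells you which words are verbs or plurals. Reducing them to a base form ("were" to "be", "mice" to "mouse") is a separate step that needs a lemmatizer or stemmer.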

Another excellent Java framework is Stanford CoreNLP. You can take a look at the Stanford Log-linear Part-Of-Speech Tagger here.