I am trying to lemmatize a sentence with Stanford CoreNLP, using the following code:
import java.util.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.ling.CoreAnnotations.*;
import edu.stanford.nlp.util.CoreMap;

public class Entry
{
    public static void main(String[] args)
    {
        Properties props = new Properties();
        props.put("annotators", "tokenize, ssplit, pos, lemma");
        StanfordCoreNLP pipeline;
        pipeline = new StanfordCoreNLP(props, false);
        String text = "This is my text";
        Annotation document = pipeline.process(text);
        for(CoreMap sentence: document.get(SentencesAnnotation.class))
        {
            for(CoreLabel token: sentence.get(TokensAnnotation.class))
            {
                String word = token.get(TextAnnotation.class);
                String lemma = token.get(LemmaAnnotation.class);
                System.out.println("lemmatized version :" + lemma);
            }
        }
    }
}
The code compiles, but running it throws the following exception:
Exception in thread "main" java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:558)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:267)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:129)
at Entry.main(Entry.java:14)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:763)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:294)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:259)
at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:97)
at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:77)
at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:556)
... 9 more
Caused by: java.io.IOException: Unable to resolve "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as either class path, filename or URL
at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:446)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:758)
... 14 more
I downloaded the CoreNLP jar files and I'm using IntelliJ IDEA. I can't make sense of the error. Any ideas?
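You are most likely missing the models jar (named like stanford-corenlp-<version>-models.jar in the CoreNLP download). The last "Caused by" in your trace says the English POS tagger model could not be resolved on the classpath, and that file lives in the models jar, not in the main code jar. Unless memory is a major concern, I've found that compiling the full CoreNLP package into a single jar (models included) is the easiest way to deploy Stanford NLP.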
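If you want to confirm the diagnosis before rebuilding anything, you can check whether the model resource is visible on the classpath. The following is only a minimal sketch: the class name ModelCheck and the helper onClasspath are made up for illustration, and the resource path is copied verbatim from the IOException in the question. It uses nothing but the standard ClassLoader.getResource call, so it runs with or without CoreNLP present.

```java
import java.net.URL;

public class ModelCheck {
    // Resource path copied from the IOException message in the stack trace.
    static final String TAGGER_MODEL =
        "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger";

    // Hypothetical helper: true if the named resource can be found on the classpath.
    static boolean onClasspath(String resource) {
        URL url = ModelCheck.class.getClassLoader().getResource(resource);
        return url != null;
    }

    public static void main(String[] args) {
        if (onClasspath(TAGGER_MODEL)) {
            System.out.println("Tagger model found on the classpath.");
        } else {
            System.out.println("Tagger model NOT found; the models jar is probably missing.");
            // Print the effective classpath so a missing jar is easy to spot.
            System.out.println("Classpath: " + System.getProperty("java.class.path"));
        }
    }
}
```

Run it from the same IntelliJ module (or with the same -cp) as Entry. If it reports the model as not found, the models jar is not among that module's dependencies.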
The foolproof way is to clone the git repo and build the jar:
Then, anytime you compile and run make sure you include the resulting jar:
EDIT: It occurs to me that the easiest way is likely to build with Maven and include Stanford CoreNLP (together with its models artifact) as a dependency, but if you're looking for a lighter-weight option, the approach above works.