I am new to natural language processing. Can anyone tell me what the trained models in OpenNLP or Stanford CoreNLP are? When coding in Java with the Apache OpenNLP package, we always have to include some trained models (found here: http://opennlp.sourceforge.net/models-1.5/). What are they?
What are trained models in NLP?
Asked by Abdallah Sayed (709 views)
Ganesh Krishnan
Think of a trained model as a "wise brain with existing information".
When you start out in machine learning, the brain of your model is clean and empty. You can either download a trained model or train your own (like teaching a child).
Usually you only train your own models for edge cases; otherwise you download trained models and get straight to work on prediction.
A "model" as downloadable for OpenNLP is a set of data representing a set of probability distributions used for predicting the structure you want (e.g. part-of-speech tags) from the input you supply (in the case of OpenNLP, typically text files).
Given that natural language is context-sensitive†, this model is used in lieu of a rule-based system because it generally works better than the latter, for a number of reasons which I won't expound here for the sake of brevity. For example, as you already mentioned, the token perfect could be either a verb (VB) or an adjective (JJ), and this can only be disambiguated in context:

1. DT NN VBZ JJ
2. DT NN VBZ VB

However, according to a model which accurately represents ("correct") English§, the probability of example 1 is greater than that of example 2:

P([DT, NN, VBZ, JJ] | ["This", "answer", "is", "perfect"]) > P([DT, NN, VBZ, VB] | ["This", "answer", "is", "perfect"])

†In reality, this is quite contentious, but I stress here that I'm talking about natural language as a whole (including semantics/pragmatics/etc.) and not just about natural-language syntax, which (in the case of English, at least) is considered by some to be context-free.
‡When analyzing language in a data-driven manner, in fact any combination of POS tags is "possible", but, given a sample of "correct" contemporary English with little noise, tag assignments which native speakers would judge to be "wrong" should have an extremely low probability of occurrence.
§In practice, this means a model trained on a large, diverse corpus of (contemporary) English (or some other target domain you want to analyze) with appropriate tuning parameters (if I wanted to be even more precise, this footnote could easily be multiple paragraphs long).
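
To make the idea of "a set of data representing probability distributions" more concrete, here is a minimal sketch in plain Java. It is not how OpenNLP works internally and uses none of its API. The class name ToyTagModel, the three-sentence training corpus and the smoothing constant are all invented for illustration. The "model" is nothing more than counts collected from hand-tagged sentences, and the scores it produces are relative (not normalised probabilities), which is enough to compare two tag sequences for the same tokens.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Toy stand-in for a trained tagging model: nothing but counts learned from examples. */
class ToyTagModel {
    private static final String START = "<s>";     // marker for "start of sentence"
    private static final double SMOOTHING = 10.0;  // arbitrary constant (assumption) so unseen pairs keep a small score

    private final Map<String, Integer> transitionCounts = new HashMap<>();  // "prevTag tag" -> count
    private final Map<String, Integer> previousTagCounts = new HashMap<>(); // prevTag -> count
    private final Map<String, Integer> emissionCounts = new HashMap<>();    // "tag word" -> count
    private final Map<String, Integer> tagCounts = new HashMap<>();         // tag -> count

    /** "Training": record how often each tag follows another and how often each word carries each tag. */
    void train(String[] words, String[] tags) {
        String previous = START;
        for (int i = 0; i < words.length; i++) {
            String word = words[i].toLowerCase(Locale.ROOT);
            transitionCounts.merge(previous + " " + tags[i], 1, Integer::sum);
            previousTagCounts.merge(previous, 1, Integer::sum);
            emissionCounts.merge(tags[i] + " " + word, 1, Integer::sum);
            tagCounts.merge(tags[i], 1, Integer::sum);
            previous = tags[i];
        }
    }

    /** Relative score of a tag sequence for the given words (higher means more plausible). */
    double score(String[] words, String[] tags) {
        double score = 1.0;
        String previous = START;
        for (int i = 0; i < words.length; i++) {
            String word = words[i].toLowerCase(Locale.ROOT);
            score *= estimate(transitionCounts, previous + " " + tags[i], previousTagCounts, previous);
            score *= estimate(emissionCounts, tags[i] + " " + word, tagCounts, tags[i]);
            previous = tags[i];
        }
        return score;
    }

    /** Smoothed relative frequency: (count of pair + 1) / (count of context + SMOOTHING). */
    private static double estimate(Map<String, Integer> pairCounts, String pair,
                                   Map<String, Integer> totals, String context) {
        return (pairCounts.getOrDefault(pair, 0) + 1.0) / (totals.getOrDefault(context, 0) + SMOOTHING);
    }

    /** Builds a model from a tiny, made-up tagged corpus (a real model uses a large corpus). */
    static ToyTagModel trainedOnToyCorpus() {
        ToyTagModel model = new ToyTagModel();
        model.train(new String[] {"This", "answer", "is", "perfect"}, new String[] {"DT", "NN", "VBZ", "JJ"});
        model.train(new String[] {"The", "plan", "is", "good"}, new String[] {"DT", "NN", "VBZ", "JJ"});
        model.train(new String[] {"We", "can", "perfect", "this"}, new String[] {"PRP", "MD", "VB", "DT"});
        return model;
    }

    public static void main(String[] args) {
        ToyTagModel model = trainedOnToyCorpus();
        String[] words = {"This", "answer", "is", "perfect"};
        double adjective = model.score(words, new String[] {"DT", "NN", "VBZ", "JJ"});
        double verb = model.score(words, new String[] {"DT", "NN", "VBZ", "VB"});
        System.out.println("score with JJ: " + adjective);
        System.out.println("score with VB: " + verb);
        System.out.println(adjective > verb ? "adjective reading preferred" : "verb reading preferred");
    }
}
```

With this toy corpus the JJ reading scores higher than the VB reading, mirroring the inequality above. A downloadable model file plays the role of the counts stored in these maps: someone else has already done the training on a much larger corpus and saved the result, so you only load it and predict.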