Getting sense stems for nltk semcor corpus words

582 Views Asked by Mahesha999 At 03 September 2021 at 09:18

I was trying out the SemCor corpus in NLTK.

I found this code here:

>>> list(map(str, semcor.tagged_chunks(tag='both')[:3])) 
['(DT The)', "(Lemma('group.n.01.group') (NE (NNP Fulton County Grand Jury)))", "(Lemma('state.v.01.say') (VB said))"]

I tried the same on colab (check last cell in this notebook):

>>> list(map(str, semcor.tagged_chunks(tag='both')[:3]))
['(DT The)',
 '(group.n.01 (NE (NNP Fulton County Grand Jury)))',
 '(say.v.01 (VB said))']


The problem

Note that on the NLTK page, the output for Fulton County Grand Jury is given as Lemma('group.n.01.group'), but on Colab I get group.n.01. So I am not getting the sense / synset lemma.

In group.n.01.group:
- the first group is the "stem for sense word"
- the last group is the "stem for input"
In group.n.01:
- the (first and only) group is the "stem for input"
- no "stem for sense word" is returned
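
To make the difference between the two label formats concrete, here is a minimal sketch that splits either form into its parts. The helper name split_label and both regular expressions are assumptions made for illustration; they are not part of NLTK, and they only cover labels shaped like the ones shown above.

```python
import re

# String form of a full label, e.g. "Lemma('state.v.01.say')":
# synset name ("state.v.01") followed by the lemma of the input word ("say").
FULL_LABEL = re.compile(
    r"^Lemma\('(?P<synset>.+\.[nvars]\.\d+)\.(?P<lemma>[^']+)'\)$"
)

# Short label, e.g. "say.v.01": lemma of the input word, POS, sense number.
SHORT_LABEL = re.compile(r"^(?P<lemma>.+)\.(?P<pos>[nvars])\.(?P<num>\d+)$")


def split_label(label: str) -> dict:
    """Split a chunk label into synset and lemma parts (hypothetical helper)."""
    match = FULL_LABEL.match(label)
    if match:
        return {"synset": match.group("synset"), "lemma": match.group("lemma")}
    match = SHORT_LABEL.match(label)
    if match:
        # The short form carries no synset name, only the input word's lemma.
        return {"synset": None, "lemma": match.group("lemma")}
    raise ValueError(f"unrecognized label: {label!r}")
```

For the second chunk pair above, split_label("Lemma('state.v.01.say')") gives the synset state.v.01 and the lemma say, while split_label("say.v.01") gives only the lemma say. That is exactly the piece of information that goes missing in the short format.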

The weird thing is that it gave me the correct output yesterday. This notebook shows it, since it has the same two lines executed both today and yesterday. Yesterday (2/9/2021) I was getting tags in the format group.n.01.group, but today I am getting tags in the format group.n.01 (notice the red and blue comments):

What am I missing here?


There are 1 best solutions below

Mahesha999 On 05 September 2021 at 08:57 BEST ANSWER

I knew that SemCor uses WordNet senses to tag a subset of the Brown corpus. What I was not aware of is that the semcor API works both with and without WordNet downloaded beforehand, and that it returns tags in a different format in each case. I honestly feel the semcor API documentation should at least mention this.

So, without WordNet downloaded beforehand, it does not return sense stems (the labels look like group.n.01).

With WordNet downloaded beforehand, it does return sense stems (the labels look like Lemma('group.n.01.group')).
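
As a rough sketch of the fix, the snippet below downloads both corpora before reading SemCor and then checks which label format came back. The function names first_chunk_strings and has_lemma_labels are made up for this example. Only nltk.download and semcor.tagged_chunks(tag='both') are taken from NLTK itself, and the check assumes the chunk strings look like the ones in the question.

```python
def first_chunk_strings(n=3):
    """Fetch the corpora, then return the first n SemCor chunks as strings."""
    import nltk  # imported lazily so the helpers below load without NLTK

    nltk.download("semcor")
    nltk.download("wordnet")  # the step that was missing in the question
    from nltk.corpus import semcor

    return [str(chunk) for chunk in semcor.tagged_chunks(tag="both")[:n]]


def has_lemma_labels(chunk_strings):
    """Return True if any chunk string carries a full Lemma(...) label."""
    return any("Lemma(" in s for s in chunk_strings)
```

Calling has_lemma_labels(first_chunk_strings()) in a fresh session should then report the full Lemma(...) format. If it still reports the short format, the WordNet data is most likely not where NLTK looks for it.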
