Why isn't WSD matching WordNet?


I'm getting to grips with WSD and WordNet, and I'm trying to work out why they give different results. My understanding is that, in the code below, disambiguate picks the most likely Synset for each word in the sentence:

from pywsd import disambiguate
from nltk.corpus import wordnet as wn

mysent = 'I went to have a drink in a bar'

# disambiguate returns one (token, Synset or None) pair per token
wsd = disambiguate(mysent)
for pair in wsd:
    print(pair)

This gives me the output below:

('I', None)
('went', Synset('travel.v.01'))
('to', None)
('have', None)
('a', None)
('drink', Synset('swallow.n.02'))
('in', None)
('a', None)
('bar', Synset('barroom.n.01'))

I find it odd that the word 'I' comes back as None, given that looking the word up in WordNet returns four possible synsets. Surely 'I' should correspond to at least one of them?

wn.synsets('I')

Out:
[Synset('iodine.n.01'), Synset('one.n.01'), Synset('i.n.03'), Synset('one.s.01')]

Best answer

In your sentence, 'I' is a pronoun, and WordNet has no entries for pronouns. The WordNet FAQ states:

Q: Why is WordNet missing: of, an, the, and, about, above, because, etc.

A: WordNet only contains "open-class words": nouns, verbs, adjectives, and adverbs. Thus, excluded words include determiners, prepositions, pronouns, conjunctions, and particles.
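
To see why none of the four synsets from the question fits, it may help to look at the part-of-speech code embedded in each synset name. The snippet below is only a rough sketch: the names are copied from the question's output, and `pos_of` and `POS_LABELS` are hypothetical helpers written for illustration, not part of NLTK or pywsd.

```python
# Synset names copied from the question's wn.synsets('I') output.
# A WordNet synset name has the form lemma.pos.sense_number.
synset_names = ["iodine.n.01", "one.n.01", "i.n.03", "one.s.01"]

# WordNet's part-of-speech codes; note that there is no code for pronouns.
POS_LABELS = {
    "n": "noun",
    "v": "verb",
    "a": "adjective",
    "s": "adjective satellite",
    "r": "adverb",
}


def pos_of(name):
    """Extract the POS code from a synset name such as 'iodine.n.01'."""
    lemma, pos, sense_number = name.rsplit(".", 2)
    return pos


# Pair every synset name with a readable part-of-speech label.
parts_of_speech = [(name, POS_LABELS[pos_of(name)]) for name in synset_names]
for pair in parts_of_speech:
    print(pair)
```

All four entries turn out to be nouns or an adjective satellite, so none of them is the pronoun 'I' used in the sentence. If NLTK and its WordNet data are installed, calling `pos()` and `definition()` on each Synset should give the same information directly. The None values for other tokens in the question's output, such as 'have', may come from the WSD tool's own filtering rather than from WordNet.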