Abbreviation Reference for NLTK Parts of Speech

3.1k Views Asked by TheGrimmScientist At 16 August 2025 at 20:20

I'm using nltk to find the parts of speech for each word in a sentence. It returns abbreviations that I both can't fully intuit and can't find good documentation for.

Running:

import nltk
sample = "There is no spoon."
tokenized_words = nltk.word_tokenize(sample)
tagged_words = nltk.pos_tag(tokenized_words)
print(tagged_words)

Returns:

[('There', 'EX'), ('is', 'VBZ'), ('no', 'DT'), ('spoon', 'NN'), ('.', '.')]

In the above example, I'm looking for what DT, EX, and the rest mean.

The best I have so far is searching Natural Language Processing with Python for mentions of the abbreviations in question, but there has to be something better. I also found a few literature-based resources, but I can't tell which tagset nltk is using.

Mehdi On 17 June 2015 at 02:46 BEST ANSWER

The link that you have already mentioned has two different tagsets.

For tagset documentation, see nltk.help.upenn_tagset() and nltk.help.brown_tagset().

In this particular example, these tags are from the Penn Treebank tagset.

You can also look up individual tags with:

nltk.help.upenn_tagset('DT')
nltk.help.upenn_tagset('EX')
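
If you want the meanings next to the tagger output rather than printed one tag at a time, a small lookup table is one option. The sketch below is only an illustration: PENN_GLOSSES and describe are hypothetical names, not part of nltk, and the glosses are hand-copied short descriptions for just the five Penn Treebank tags in the question's output. The tagged list is pasted in from the question so the snippet runs without nltk installed.

```python
# Hand-written glosses for the Penn Treebank tags seen in the question.
# Assumption: this dict is illustrative only; it is not shipped by nltk
# and covers just these five tags.
PENN_GLOSSES = {
    "EX": "existential there",
    "VBZ": "verb, present tense, 3rd person singular",
    "DT": "determiner",
    "NN": "noun, common, singular or mass",
    ".": "sentence terminator",
}


def describe(tagged_words, glosses=PENN_GLOSSES):
    """Return (word, tag, gloss) triples; tags missing from the table get 'unknown tag'."""
    return [(word, tag, glosses.get(tag, "unknown tag")) for word, tag in tagged_words]


# Output of nltk.pos_tag() copied from the question.
tagged_words = [('There', 'EX'), ('is', 'VBZ'), ('no', 'DT'), ('spoon', 'NN'), ('.', '.')]

for word, tag, gloss in describe(tagged_words):
    print(f"{word:<6} {tag:<4} {gloss}")
```

For the full tagset, nltk.help.upenn_tagset() remains the authoritative source; a table like this is only handy when you care about a handful of tags.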
