Loading the treebank corpus with the Brown tagset


I have the WSJ treebank corpus from NLTK. I want to load it with the tagset of the Brown corpus. Is this possible?

import nltk
wsj = nltk.corpus.treebank.tagged_sents(tagset='universal')  # universal tags
wsj2 = nltk.corpus.treebank.tagged_sents()  # treebank-specific tags

There is 1 best solution below


According to the discussion in this thread, it is not possible.

So far, NLTK only supports mapping specific tagsets to the universal tagset; it has no built-in mapping from the treebank tags to the Brown tags. One of the solutions suggested in the discussion may help:

This is apparently not supported in NLTK yet, but see Dan Zeman's Interset tool or my script at https://gist.github.com/nschneid/6476715
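
If neither tool fits, one fallback is to remap the tags yourself after loading. The following is only a minimal sketch of that idea, not something NLTK provides: the `PTB_TO_BROWN` table and the `remap_tagged_sents` helper are hypothetical names, and the table is deliberately tiny and incomplete. Many treebank tags have no one-to-one Brown counterpart, so a real mapping would need linguistic decisions that are not covered here.

```python
# Illustrative, incomplete Penn Treebank -> Brown tag table.
# This is an assumption for demonstration only; it is not part of NLTK.
PTB_TO_BROWN = {
    "NNP": "NP",    # proper noun, singular
    "NNPS": "NPS",  # proper noun, plural
    "NN": "NN",     # common noun, singular
    "NNS": "NNS",   # common noun, plural
    "JJ": "JJ",     # adjective
    "RB": "RB",     # adverb
    "CD": "CD",     # cardinal number
}


def remap_tagged_sents(tagged_sents, mapping, unknown="UNK"):
    """Return the sentences with every tag looked up in `mapping`.

    Tags that are missing from `mapping` are replaced by `unknown`,
    so gaps in the table stay visible instead of passing through silently.
    """
    return [
        [(word, mapping.get(tag, unknown)) for word, tag in sent]
        for sent in tagged_sents
    ]


# A hand-written sample in the same (word, tag) shape that tagged_sents() yields.
sample = [[("Pierre", "NNP"), ("Vinken", "NNP"), ("will", "MD"), ("join", "VB")]]
remapped = remap_tagged_sents(sample, PTB_TO_BROWN)
```

With NLTK and the treebank data installed, the same helper could be applied to `nltk.corpus.treebank.tagged_sents()` in place of `sample`. Marking unmapped tags as `UNK` is a design choice that makes it easy to count how much of the corpus the table still fails to cover.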