How to count a noun's hyponyms that have no hyponyms themselves with NLTK and WordNet?


I am trying to count all hyponyms of a noun that have no hyponyms themselves (that is, those that are terminal in the noun hierarchy below that noun). For example, for ‘entity’ (the noun highest in the hierarchy), the result should be the count of all nouns that have no hyponyms (all nouns that are terminal in the hierarchy). For a noun that is itself terminal, the count has to be 1. I have a list of nouns, and the output has to give this count for each noun in the list.

After a lot of searching here and much trial and error, this is the code I came up with (only the relevant part):

import nltk
from nltk.corpus import wordnet as wn

def get_hyponyms(synset): #function source:https://stackoverflow.com/questions/15330725/how-to-get-all-the-hyponyms-of-a-word-synset-in-python-nltk-and-wordnet?rq=1
    hyponyms = set()
    for hyponym in synset.hyponyms():
        hyponyms |= set(get_hyponyms(hyponym))
    return hyponyms | set(synset.hyponyms())

with open("list-nouns.txt", "r") as wordList1:  # the "U" mode flag was removed in Python 3.11; plain "r" already uses universal newlines
    myList1 = [line.rstrip('\n') for line in wordList1]
    for word1 in myList1:
        list1 = wn.synsets(word1, pos='n')
        countTerminalWord1 = 0  #counter for synsets without hyponyms
        countHyponymsWord1 = 0  #counter for synsets with hyponyms
        for syn_set1 in list1:
            syn_set11a = get_hyponyms(syn_set1)
            n = len(get_hyponyms(syn_set1))  #number of hyponyms
            if n > 0:
                countHyponymsWord1 += n
            else:
                countTerminalWord1 += 1
            for syn_set11 in syn_set11a:
                syn_set111a = get_hyponyms(syn_set11)
                n = len(get_hyponyms(syn_set11))
                if n > 0:
                    countHyponymsWord1 += n
                else: 
                    countTerminalWord1 += 1
                #...further iterates in the same way for the following levels
        print (countHyponymsWord1)
        print (countTerminalWord1)

(The code also tries to count all nouns that do have hyponyms, but this is not essential.)

The main problem is that I cannot repeat this code for the whole depth of the noun hierarchy, which is 19 levels. It soon fails with ‘SystemError: too many statically nested blocks’.

Any help or advice on how to solve this would be greatly appreciated.
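One possible direction, offered as a rough sketch rather than a confirmed solution: instead of writing one nested loop per level, the hierarchy can be walked with an explicit stack, so the code depth stays constant no matter how deep the hierarchy is. The helper names `terminal_hyponyms` and `count_terminal_hyponyms` are hypothetical and not part of NLTK. The sketch assumes that only `synset.hyponyms()` matters (instance hyponyms are ignored, as in the code above) and that a word's count is the number of distinct terminal synsets reachable from any of its noun synsets. A set is used because a synset can have more than one hypernym and would otherwise be counted several times.

```python
def terminal_hyponyms(synset):
    """Return the set of synsets at or below `synset` that have no hyponyms.

    Works on any object exposing a hyponyms() method that returns a list,
    such as an NLTK WordNet Synset. A synset without hyponyms yields a set
    containing only itself, so its count is 1.
    """
    leaves = set()    # terminal synsets found so far
    seen = set()      # synsets already visited (the hierarchy is not a strict tree)
    stack = [synset]  # explicit stack instead of nested loops
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        children = current.hyponyms()
        if children:
            stack.extend(children)
        else:
            leaves.add(current)
    return leaves


def count_terminal_hyponyms(word):
    """Count distinct terminal synsets below all noun synsets of `word`.

    Hypothetical helper: merging the senses of a word into one set is an
    assumption about the desired output, not something the question states.
    """
    # Imported here so terminal_hyponyms() can be used without NLTK installed.
    # The WordNet corpus must be available, e.g. via nltk.download("wordnet").
    from nltk.corpus import wordnet as wn

    leaves = set()
    for synset in wn.synsets(word, pos=wn.NOUN):
        leaves |= terminal_hyponyms(synset)
    return len(leaves)
```

With the word list already read into `myList1`, the loop body would reduce to something like `print(word1, count_terminal_hyponyms(word1))`. If the counts should be reported per sense instead of per word, `len(terminal_hyponyms(syn_set1))` can be printed for each synset separately.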
