How to get word hierarchy (e.g., hypernyms, hyponyms) using wordnet in R

I want to use the wordnet package in R to get word hierarchies, for example that "animal" is a hypernym of "cat" and "apple" is a hyponym of "fruit". The only example I can find in the R wordnet help file is the one below, which finds antonyms:

install.packages("wordnet", dependencies=TRUE)
library(wordnet)
filter <- getTermFilter("ExactMatchFilter", "cold", TRUE)
terms <- getIndexTerms("ADJECTIVE", 5, filter)
synsets <- getSynsets(terms[[1]])
related <- getRelatedSynsets(synsets[[1]],"!")
sapply(related, getWord)

How can I use the R wordnet package to find hypernyms and hyponyms of a word?

There is 1 best solution below

You can replace "!" (the pointer symbol for antonyms) in

related <- getRelatedSynsets(synsets[[1]],"!")

with other symbols depending on what you need.

See this link: http://wordnet.princeton.edu/man/wnsearch.3WN.html#sect4

Hypernyms are "@" and hyponyms are "~".
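
To make the symbol swap concrete, here is a minimal sketch. It assumes library(wordnet) is attached and a WordNet dictionary is installed. The names pointer_symbols and related_words are hypothetical and not part of the package, and like the question's code the helper only looks at the first synset of the word, which may not be the sense you want.

```r
# Pointer symbols taken from the WordNet documentation linked above.
pointer_symbols <- c(antonym = "!", hypernym = "@", hyponym = "~")

# Hypothetical wrapper (not part of the wordnet package): words of the
# synsets related to the first sense of `word` by the given relation.
# Assumes library(wordnet) is attached and `word` is in the dictionary.
related_words <- function(word, pos, relation) {
  filter  <- getTermFilter("ExactMatchFilter", word, TRUE)
  terms   <- getIndexTerms(pos, 1, filter)
  synsets <- getSynsets(terms[[1]])
  related <- getRelatedSynsets(synsets[[1]], pointer_symbols[[relation]])
  lapply(related, getWord)
}

# Example call (output depends on the installed WordNet version):
# related_words("fruit", "NOUN", "hyponym")
```

Returning a list with lapply rather than sapply keeps the result shape the same whether each synset has one word or several.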

Extension to original question:

I just started using WordNet and I am looking for something similar. For 'apple' I would like a hypernym tree giving me

  • 'fruit'
    • 'food'
      • 'solid matter'
        • 'physical entity'
          • etc...

as can be seen by clicking "inherited hypernym" on WordNet online: http://wordnetweb.princeton.edu/perl/webwn

However, the following commands

filter <- getTermFilter(type="ExactMatchFilter", word="apple", ignoreCase=TRUE)
terms <- getIndexTerms("NOUN", 15, filter)
synsets <- getSynsets(terms[[1]])
related <- getRelatedSynsets(synsets[[1]], "@")
sapply(related, getWord)

will only give me

[[1]]
[1] "edible fruit"

[[2]]
[1] "pome"        "false fruit"

so it returns only the immediate hypernyms and not the further levels of the tree.

The key to climbing the hypernym tree is to call getRelatedSynsets() repeatedly, feeding each result back in.

Continuing with the above example, take the first hypernym synset of apple and request its hypernyms:

related_2 <- getRelatedSynsets(related[[1]], "@")

And collecting the corresponding words:

sapply(related_2, getWord)

will yield:

[[1]]
[1] "produce"         "green goods"     "green groceries" "garden truck"   

[[2]]
[1] "fruit"

And going one step further:

related_3 <- getRelatedSynsets(related_2[[1]], "@")

sapply(related_3, getWord)

will result in:

[,1]        
[1,] "food"      
[2,] "solid food"
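
The manual steps above can be wrapped in a loop. The following is a minimal sketch rather than anything provided by the wordnet package: climb_hypernyms is a hypothetical helper name, max_depth is an assumed safety limit, and the function follows only the first hypernym synset at each level, as the steps above do. It also assumes that getRelatedSynsets() returns an empty list once the top of the tree is reached. The parents and labels arguments default to the package functions (so library(wordnet) must be attached) but can be swapped out.

```r
# Hypothetical helper (not part of the wordnet package): walk up the
# hypernym chain from one synset, following the first hypernym each time.
climb_hypernyms <- function(synset,
                            parents = function(s) getRelatedSynsets(s, "@"),
                            labels = function(s) getWord(s),
                            max_depth = 20) {
  chain <- list()
  current <- synset
  for (depth in seq_len(max_depth)) {
    up <- parents(current)
    if (length(up) == 0) break   # assumed: empty result means top of tree
    current <- up[[1]]           # assumption: follow only the first branch
    chain[[length(chain) + 1]] <- labels(current)
  }
  chain
}

# Example call, with synsets obtained as in the code above:
# climb_hypernyms(synsets[[1]])
```

Because a synset can have several hypernyms (apple has both "edible fruit" and "pome"), following only the first one gives a single path through the tree; collecting every path would need a recursive version that branches on each element of the result.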