Calculate conceptual and relational similarity of two words in Java


I am implementing a readability formula in Java based on this paper.

I reached the point where I have to compute the conceptual and the relational similarity of two or more words.

They say:

We use Latent Semantic Analysis (LSA) tools to compute word similarity. LSA can derive semantic information, including similarity, from a word-document co-occurrence matrix. Word/term co-occurrences are counted in a moving window of a fixed size that scans the entire corpus. The co-occurrence models using window sizes of ±1 and ±4 considered as relational similarity and conceptual semantic models, respectively.
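For reference, here is a minimal sketch of how I understand the windowed counting described in the quote. It is not the paper's implementation: the class name `WindowCooccurrence`, the methods `count` and `cosine`, and the toy sentence are all my own assumptions, and it skips the dimensionality reduction (SVD) that a real LSA pipeline would apply to the matrix before comparing words.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not taken from the paper or from any library.
public class WindowCooccurrence {

    // For every token, count how often each other token appears within
    // `window` positions to its left or right (window = 1 or 4 in the quote).
    public static Map<String, Map<String, Integer>> count(String[] tokens, int window) {
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (int i = 0; i < tokens.length; i++) {
            Map<String, Integer> row = counts.computeIfAbsent(tokens[i], k -> new HashMap<>());
            int from = Math.max(0, i - window);
            int to = Math.min(tokens.length - 1, i + window);
            for (int j = from; j <= to; j++) {
                if (j != i) {
                    row.merge(tokens[j], 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Cosine similarity between two sparse co-occurrence rows.
    // Returns 0.0 if either row is missing or empty.
    public static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        if (a == null || b == null) {
            return 0.0;
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy corpus, made up for illustration only.
        String[] tokens = "the cat sat on the mat the dog sat on the rug".split(" ");
        Map<String, Map<String, Integer>> relational = count(tokens, 1); // window of ±1
        Map<String, Map<String, Integer>> conceptual = count(tokens, 4); // window of ±4
        System.out.println(cosine(relational.get("cat"), relational.get("dog")));
        System.out.println(cosine(conceptual.get("cat"), conceptual.get("dog")));
    }
}
```

With a real corpus the rows would be much larger, and I assume the paper's numbers come from the reduced LSA space rather than from raw counts like these, so the values from this sketch are not expected to match theirs.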

I tried to see some implementations of LSA, like this one, but couldn't find a straightforward way to get what I want.

As I understand it, I need a matrix built from the words, so I tried to use the WS4J library to compute one from two arrays of Strings.

WS4J also has a method calcRelatednessOfWords(), but the results it returns don't match the ones shown in the paper.

Is there any library that offers what I want? Or can anyone point me in the right direction?
