I'm working on a project where one of the requirements is to calculate the similarity between words. I'm using the WuP (Wu-Palmer) measure, which is supposed to return values in the range [0,1]. The problem is that the jar file seems to have bugs: it doesn't return values in this range. The web page demo works correctly and returns the maximum value 1 for identical words, but the jar file doesn't return the same. The results for run( "java","java" );
are:
run:
edu.cmu.lti.ws4j.impl.HirstStOnge 1.7976931348623157E308
edu.cmu.lti.ws4j.impl.LeacockChodorow 1.7976931348623157E308
edu.cmu.lti.ws4j.impl.Lesk 1.7976931348623157E308
edu.cmu.lti.ws4j.impl.WuPalmer 1.7976931348623157E308
edu.cmu.lti.ws4j.impl.Resnik 1.7976931348623157E308
edu.cmu.lti.ws4j.impl.JiangConrath 1.7976931348623157E308
edu.cmu.lti.ws4j.impl.Lin 1.7976931348623157E308
edu.cmu.lti.ws4j.impl.Path 1.7976931348623157E308
Done in 8 msec.
BUILD SUCCESSFUL (total time: 0 seconds)
The problem is not limited to identical words; even for different words the jar gives a WuP value outside the range.
The web page demo:
wup( avocado#n#1 , fruit#n#1 ) = 0.9091
jcn( avocado#n#1 , fruit#n#1 ) = 0.5974
lch( avocado#n#1 , fruit#n#1 ) = 2.5903
lin( avocado#n#1 , fruit#n#1 ) = 0.8982
res( avocado#n#1 , fruit#n#1 ) = 7.3837
path( avocado#n#1 , fruit#n#1 ) = 0.3333
lesk( avocado#n#1 , fruit#n#1 ) = 203
hso( avocado#n#1 , fruit#n#1 ) = 6
Jar file values:
run:
edu.cmu.lti.ws4j.impl.HirstStOnge 6.0
edu.cmu.lti.ws4j.impl.LeacockChodorow 2.5902671654458267
edu.cmu.lti.ws4j.impl.Lesk 6.0
edu.cmu.lti.ws4j.impl.WuPalmer 1.0526315789473684
edu.cmu.lti.ws4j.impl.Resnik 7.383733213970693
edu.cmu.lti.ws4j.impl.JiangConrath 0.5973799749775183
edu.cmu.lti.ws4j.impl.Lin 0.8981855517382724
edu.cmu.lti.ws4j.impl.Path 0.3333333333333333
Done in 1673 msec.
BUILD SUCCESSFUL (total time: 1 second)
Could someone help me fix this problem?
1.7976931348623157E308 is the maximum double value (Double.MAX_VALUE in Java). It is used as if it represented infinity, because the two words are identical.
Try "hi" and "hello" instead, and it returns 1.0.
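If that sentinel gets in the way, one option is to map it back to the top of the measure's range yourself. Here is a minimal sketch in plain Java. ScoreNormalizer and normalizeWup are hypothetical names, not part of WS4J, and mapping the sentinel to 1.0 assumes you want identical words to score the WuP maximum, as the web demo does. It only handles the Double.MAX_VALUE case and does not explain values such as 1.0526 for different words.

```java
public class ScoreNormalizer {

    // Hypothetical helper (not a WS4J API): treats Double.MAX_VALUE as the
    // "identical words" sentinel and maps it to 1.0, the top of the WuP range.
    // Any other score is returned unchanged.
    static double normalizeWup(double raw) {
        if (raw == Double.MAX_VALUE) {
            return 1.0;
        }
        return raw;
    }

    public static void main(String[] args) {
        // The value seen in the jar output for run("java", "java")
        System.out.println(Double.MAX_VALUE);               // prints 1.7976931348623157E308
        System.out.println(normalizeWup(Double.MAX_VALUE)); // prints 1.0
        System.out.println(normalizeWup(0.9091));           // prints 0.9091
    }
}
```

You would pass whatever score the jar returns through normalizeWup before using it. The other measures have different ranges, so the value to substitute for the sentinel would need to be chosen per measure.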