Dependency of WS4J on some configuration files and WordNet (200Mb)

I am using the WS4J API to calculate the semantic similarity between words:

ILexicalDatabase db = new NictWordNet();
RelatednessCalculator lin = new Lin(db);
RelatednessCalculator wup = new WuPalmer(db);

String w1 = "science";
String w2 = "university";
System.out.println(lin.calcRelatednessOfWords(w1, w2));
System.out.println(wup.calcRelatednessOfWords(w1, w2));

The problem is that this API depends on the following configuration files which must be placed into the project's directory (I use /resources for this purpose):

jawjaw.conf
similarity.conf
wordnet folder

Moreover, it's a pity that this library is not available in the Maven repository.

Is there any way to avoid putting the above-mentioned files into my local project's folder? These files occupy over 100Mb.

I also checked the DISCO library, but it doesn't seem to be as powerful as WS4J.

There is 1 best solution below

Apparently, to do so you have to modify WS4J.

For instance, the similarity.conf file is loaded by the class WS4JConfiguration through an InputStream:

final public class WS4JConfiguration {

    private final static String CONF = "/similarity.conf";

     ...

    private WS4JConfiguration(){
        InputStream stream = null;
        try {
            stream = WS4JConfiguration.class.getResourceAsStream( CONF );

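So WS4JConfiguration looks up /similarity.conf on the classpath, using the class loader that loaded WS4JConfiguration itself (normally the same one that loaded your application). The leading slash makes the name absolute, so the file should be found at the root of any JAR or directory on the classpath, not only in your project's folder.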
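
To check where such a lookup actually finds (or does not find) a file, here is a minimal sketch of the same getResourceAsStream call outside WS4J. The class ClasspathResourceDemo and the helper isOnClasspath are hypothetical names made up for this illustration; only Class.getResourceAsStream is the real JDK call, and "/similarity.conf" is the name taken from the snippet above.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class, not part of WS4J.
public class ClasspathResourceDemo {

    // Returns true if the named resource can be opened from the classpath.
    // A leading '/' makes the name absolute (resolved from the classpath root);
    // without it, the name is resolved relative to this class's package.
    static boolean isOnClasspath(String name) {
        try (InputStream in = ClasspathResourceDemo.class.getResourceAsStream(name)) {
            // getResourceAsStream returns null when no such resource is found.
            return in != null;
        } catch (IOException e) {
            // Only closing the stream can throw here; treat that as "not readable".
            return false;
        }
    }

    public static void main(String[] args) {
        // The result depends on whether a similarity.conf sits at the root of
        // some JAR or directory on your classpath.
        System.out.println("/similarity.conf found: " + isOnClasspath("/similarity.conf"));
        // A resource that ships with the JDK, for comparison.
        System.out.println("/java/lang/Object.class found: "
                + isOnClasspath("/java/lang/Object.class"));
    }
}
```

If the first line prints true when the config file is packaged only inside a JAR, the same lookup in WS4JConfiguration should succeed too. Whether the other files (the second config file and the wordnet folder) are loaded the same way is not shown in the snippet above, so that part remains an assumption to verify in the WS4J source.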