I am using PyLucene to build a search system, and I am using TREC data to test it. I have successfully written the indexer and searcher code. Now I want to use TREC topics to evaluate my system. To do this there is a class named TrecTopicsReader which reads the queries from a TREC-formatted topics file. But the readQueries(BufferedReader reader) method of that class needs a BufferedReader for the topics file passed to it.
How can I do this in PyLucene? BufferedReader is not available in the PyLucene JCC build.
After waiting for someone to answer, I also asked this question on the PyLucene developer mailing list.
Andi Vajda replied there. I am answering this question on Andi's behalf.
Quoting Andi:
More information:
In the PyLucene Makefile you will find this line:
GENERATE=$(JCC) $(foreach jar,$(JARS),--jar $(jar)) \
Within that rule there should be a line like --package java.io. Add the class you want (BufferedReader) to that line so that JCC makes it available to the Python code. Then compile and install PyLucene again. (You can find the info about compilation and installation in PyLucene's documentation, or you can also use this.)
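The edit to the --package java.io entry can be sketched as a small script. This is only an illustration: the function name add_java_io_classes and the sample Makefile excerpt are hypothetical, and the real GENERATE rule is longer and differs between PyLucene versions, so editing the Makefile by hand works just as well.

```python
import re


def add_java_io_classes(makefile_text, class_names):
    """Insert class names right after the first '--package java.io' token.

    Hypothetical helper for illustration; it does not touch any file.
    """
    addition = " ".join(class_names)
    # The lookahead keeps us from matching a longer package such as java.io.foo.
    pattern = re.compile(r"(--package\s+java\.io)(?![\w.])")
    patched, count = pattern.subn(
        lambda m: m.group(1) + " " + addition, makefile_text, count=1
    )
    if count == 0:
        raise ValueError("no '--package java.io' entry found")
    return patched


# Assumed excerpt of a GENERATE rule, not copied from a real Makefile.
sample = (
    "GENERATE=$(JCC) $(foreach jar,$(JARS),--jar $(jar)) \\\n"
    "           --package java.io java.io.StringReader \\\n"
)
patched_sample = add_java_io_classes(sample, ["java.io.BufferedReader"])
print(patched_sample)
```

The same call accepts several class names at once, so any further java.io class you need can go into the list.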
Also, to make a BufferedReader object from a file you will need FileReader, so add that as well.
Just for completeness: after adding these classes my GENERATE will look like:
Doing this doesn't suffice. You also have to compile the Lucene benchmark library, which is not included in the installed libraries by default, because TrecTopicsReader is part of the benchmark API.
To compile and install benchmark: you have to modify the build.xml inside the main lucene folder (where the benchmark folder is present), and then you have to include the resulting jar in the main Makefile so that it is installed into the Python libs as an egg.
build.xml: You have to make three modifications. For simplicity, follow jar-test-framework: wherever it appears, create a similar pattern for jar-benchmark. The three changes you have to make are:
1) Replace
<target name="package" depends="jar-core, jar-test-framework, build-modules, init-dist, documentation"/>
with
<target name="package" depends="jar-core, jar-test-framework, jar-benchmark, build-modules, init-dist, documentation"/>
2) For the rule
replace it with
3) Add the following target/rule after the target named jar-test-framework:
Makefile: Here also you have to make three modifications. For simplicity, follow HIGHLIGHTER_JAR and add similar rules for BENCHMARK_JAR. The three changes you have to make are:
1) Find JARS+=$(HIGHLIGHTER_JAR) and add JARS+=$(BENCHMARK_JAR) after it in the same manner.
2) Find HIGHLIGHTER_JAR=$(LUCENE)/build/highlighter/lucene-highlighter-$(LUCENE_VER).jar and add BENCHMARK_JAR=$(LUCENE)/build/benchmark/lucene-benchmark-$(LUCENE_VER).jar after this line in the same manner.
3) Find the rule $(ANALYZERS_JAR): and add another rule for $(BENCHMARK_JAR): after it.
For completeness, here are my final Makefile and build.xml files.
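Once PyLucene has been rebuilt with both changes, reading the topics from Python might look roughly like the sketch below. It assumes a PyLucene 4.x-style build where Java classes are imported by their full package names, and that the benchmark jar was wrapped as described; the function name read_trec_topics and the file name in the usage comment are made up for illustration.

```python
def read_trec_topics(topics_path):
    """Return (query id, title) pairs from a TREC topics file.

    Sketch only: assumes BufferedReader, FileReader and the benchmark
    classes were added to the JCC build as described above.
    """
    # Imported inside the function so this module loads without PyLucene.
    import lucene
    from java.io import BufferedReader, FileReader
    from org.apache.lucene.benchmark.quality.trec import TrecTopicsReader

    # Start the JVM once per process; getVMEnv() returns None before that.
    if lucene.getVMEnv() is None:
        lucene.initVM()

    reader = BufferedReader(FileReader(topics_path))
    try:
        # readQueries returns an array of QualityQuery objects.
        quality_queries = TrecTopicsReader().readQueries(reader)
    finally:
        reader.close()

    return [(qq.getQueryID(), qq.getValue("title")) for qq in quality_queries]


# Usage (hypothetical file name):
#   for query_id, title in read_trec_topics("topics.301-350.txt"):
#       print(query_id, title)
```

If the imports fail with an ImportError, the most likely cause is that one of the classes is still missing from the GENERATE rule or that the benchmark jar was not added to JARS.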