When I search for the phrase "ph1 ph2", the search returns texts that contain "ph1" or "ph2".
String line = "ph1 ph2";
QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, field, analyzer);
Query query = parser.parse(line);
Does anybody know how to search by:
1) an exact phrase ("ph1 ph2"). Example: This is sentence ph1 ph2.
2) a phrase with a maximum distance ("ph1 ph2 ~3"). Example: This ph1 is sentence ph2.
P.S. I used the standard Lucene indexer to index my files. If this example is not clear, see http://www.lucenetutorial.com/lucene-query-syntax.html
Here's the full code:
String index = "C:/programs/lucenedemo/index";
String field = "contents";
IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index)));
IndexSearcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
//QueryParser parser = new QueryParser(Version.LUCENE_40, field, analyzer);
String line = "ph1 ph2";
QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, field, analyzer);
Query query = parser.parse(line);
//doPagingSearch(searcher, query, hitsPerPage, raw, queries == null && queryString == null);
//doPagingSearch
TopDocs results = searcher.search(query, 300000);
ScoreDoc[] hits = results.scoreDocs;
System.out.println(results.totalHits);
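// Print at most the first 10 hits; the bound guards against fewer than 10 results.
for (int i = 0; i < Math.min(10, hits.length); i++) {
    Document doc = searcher.doc(hits[i].doc);
    String path = doc.get("path");
    if (path != null) System.out.println((i + 1) + ". " + path);
}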
//end of doPagingSearch
reader.close();
I'm not clear on exactly what you are looking for, but I believe it's one of:
"field:\"" + line + "\""
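: Simple phrase query. Finds the two terms adjacent and in order.
"field:\"" + line + "\"~3"
: Phrase query with slop. Allows up to three positions' worth of separation between the two terms. Slop is an edit distance, so with a slop of 2 or more the terms can also match in reversed order.
"field:(" + line + ")"
: Not a phrase query at all. Simple search for the two terms. Any order or distance is acceptable.
Here "field" stands for the name of the indexed field ("contents" in your code). Since your QueryParser is already constructed with that field as its default, the prefix can be omitted.
You can see further options on query parser syntax in Lucene's query syntax documentation.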
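To make the three variants concrete, here is a minimal sketch that only builds the query strings, using plain Java and no Lucene calls. The class and method names (PhraseQueryStrings, phrase, phraseWithSlop, anyTerms) are made up for illustration and are not part of Lucene. The sketch assumes the terms contain no quote characters or other query-syntax characters that would need escaping, and it leaves out the field prefix on the assumption that the parser's default field is used.

```java
// Hypothetical helper, not a Lucene class: it only assembles query strings.
class PhraseQueryStrings {

    // Exact phrase: wrap the terms in double quotes.
    static String phrase(String terms) {
        return "\"" + terms + "\"";
    }

    // Phrase with slop: ~N goes directly after the closing quote, not inside it.
    static String phraseWithSlop(String terms, int slop) {
        return phrase(terms) + "~" + slop;
    }

    // Not a phrase: a plain group of terms, any order or distance.
    static String anyTerms(String terms) {
        return "(" + terms + ")";
    }

    public static void main(String[] args) {
        String line = "ph1 ph2";
        System.out.println(phrase(line));            // prints "ph1 ph2"
        System.out.println(phraseWithSlop(line, 3)); // prints "ph1 ph2"~3
        System.out.println(anyTerms(line));          // prints (ph1 ph2)
    }
}
```

Any of these strings can then be handed to parser.parse(...) in place of the raw line in the question's code. Note that the slop suffix sits outside the quotes: "ph1 ph2 ~3" with the ~3 inside the quotes is not the same query.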