How to use the term position parameter in Xapian query constructors


Xapian docs talk about a query constructor that takes a term position parameter, to be used in phrase searches:

Quote:

This constructor actually takes a couple of extra parameters, which may be used to specify positional and frequency information for terms in the query:

Xapian::Query(const string & tname_,
         Xapian::termcount wqf_ = 1,
         Xapian::termpos term_pos_ = 0)

The term_pos represents the position of the term in the query. Again, this isn't useful for a single term query by itself, but is used for phrase searching, passage retrieval, and other operations which require knowledge of the order of terms in the query (such as returning the set of matching terms in a given document in the same order as they occur in the query). If such operations are not required, the default value of 0 may be used.

And in the reference, we have:

Xapian::Query::Query(const std::string & tname_,
                     Xapian::termcount wqf_ = 1,
                     Xapian::termpos pos_ = 0)

A query consisting of a single term.

And:

typedef unsigned termpos

A term position within a document or query.

So, say I want to build a query for the phrase "foo bar baz": how do I go about it? Does term_pos_ provide relative position values, i.e. does it define the order of the terms within the document?
(I'm using the Python bindings API here, as I'm more familiar with it.)

 q = xapian.Query(xapian.Query.OP_AND, [xapian.Query("foo", wqf, 1), xapian.Query("bar", wqf, 2), xapian.Query("baz", wqf, 3)])

And just for the sake of testing, suppose we did:

 q = xapian.Query(xapian.Query.OP_AND, [xapian.Query("foo", wqf, 3), xapian.Query("bar", wqf, 4), xapian.Query("baz", wqf, 5)])

Would this give the same results as the previous example?

And suppose we have:

 q = xapian.Query(xapian.Query.OP_AND, [xapian.Query("foo", wqf, 2), xapian.Query("bar", wqf, 4), xapian.Query("baz", wqf, 5)])

Would this now match documents where "foo" and "bar" are separated by one term, with "bar" immediately followed by "baz"?

Is that how it works, or does this parameter refer to the absolute positions of the indexed terms?

Edit:

And how is OP_PHRASE related to this? I have found some online samples that use OP_PHRASE like this:

q = xapian.Query(xapian.Query.OP_PHRASE, term_list)

This makes obvious sense, but then what is the role of the term_pos_ constructor parameter in phrase searches? Is it a more surgical way of doing things?

There is 1 answer below

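#include <list>
#include <xapian.h>
// querylist is assumed to be a container of Xapian::Query declared elsewhere.
Xapian::termpos pos = 1;  // position of each term within the query, from 1
std::list<Xapian::Query> subs;
subs.push_back(Xapian::Query("foo", 1, pos++));
subs.push_back(Xapian::Query("bar", 1, pos++));
subs.push_back(Xapian::Query("baz", 1, pos++));
// Combine the positioned terms into a single phrase query.
querylist.push_back(Xapian::Query(Xapian::Query::OP_PHRASE, subs.begin(), subs.end()));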
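
To make the idea of matching a phrase against stored term positions more concrete, here is a toy sketch in plain Python. It does not use Xapian and is not how Xapian implements OP_PHRASE; the index layout and the helper names build_positions and matches_phrase are made up for illustration. It only shows why a phrase match of this kind depends on the relative order of the terms in a document rather than on any particular absolute position.

```python
from collections import defaultdict


def build_positions(text):
    """Map each term to the list of 1-based positions where it occurs."""
    positions = defaultdict(list)
    for pos, term in enumerate(text.lower().split(), start=1):
        positions[term].append(pos)
    return positions


def matches_phrase(positions, terms):
    """Return True if the terms occur consecutively, in order, somewhere."""
    if not terms:
        return False
    # Try every occurrence of the first term as the start of the phrase.
    for start in positions.get(terms[0], []):
        # Each later term must sit exactly `offset` positions after the start.
        if all(start + offset in positions.get(term, [])
               for offset, term in enumerate(terms)):
            return True
    return False


# Hypothetical documents: the phrase starts at position 2 in doc_a,
# while in doc_b an extra term separates "foo" from "bar".
doc_a = build_positions("one foo bar baz two")
doc_b = build_positions("foo one bar baz")

print(matches_phrase(doc_a, ["foo", "bar", "baz"]))  # True
print(matches_phrase(doc_b, ["foo", "bar", "baz"]))  # False
```

In this toy model the phrase is found in doc_a even though it does not start at position 1, because only the offsets between the terms are compared.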