How to use the MarkLogic thesaurus API on phrases?


In MarkLogic, we can expand a search to include terms from a thesaurus in addition to the terms entered in the search:

xquery version "1.0-ml";
import module namespace thsr="http://marklogic.com/xdmp/thesaurus" at "/MarkLogic/thesaurus.xqy";

cts:search(
doc("/Docs/hamlet.xml")//LINE,
thsr:expand(
    cts:word-query("weary"), 
    thsr:lookup("/myThsrDocs/thesaurus.xml", "weary"),
    (), 
    (), 
    () )
)

The question is how to support the cases below:

  • Apple AND Orange
  • Apple NOT Orange
  • Apple - Orange
  • Apple + Orange
  • form: 10-K
  • co: Apple
  • Apple Orange form:[10-K]
  • “Apple and Orange”
  • “Apple” Orange
2 Answers

Answer 1

Use search:parse to parse the query string, which yields the query as cts:query XML. Then use a recursive typeswitch function to walk that XML, and apply thesaurus expansion to the cts:word-query elements it contains.
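The approach above could look roughly like the sketch below. It is only an illustration, not a tested implementation: the function name local:expand is made up for this example, the thesaurus URI and the searched document are taken from the question, and it assumes that search:parse is asked for the "schema-element(cts:query)" output format so that the result is XML that can be walked.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";
import module namespace thsr = "http://marklogic.com/xdmp/thesaurus"
  at "/MarkLogic/thesaurus.xqy";

(: Thesaurus URI taken from the question. :)
declare variable $THSR-URI as xs:string := "/myThsrDocs/thesaurus.xml";

(: Hypothetical helper: recursively copies the cts:query XML and replaces
   each cts:word-query with its thesaurus-expanded form. :)
declare function local:expand($node as node()) as node()*
{
  typeswitch ($node)
  case element(cts:word-query) return
    (: Look up every term (or quoted phrase) carried by this word query. :)
    let $entries :=
      for $text in $node/cts:text
      return thsr:lookup($THSR-URI, fn:string($text))
    let $expanded := thsr:expand(cts:query($node), $entries, (), (), ())
    (: Placing a cts:query inside an element serializes it back to XML. :)
    return <wrapper>{ $expanded }</wrapper>/*
  case element() return
    (: Any other query element (and-query, not-query, ...) is copied,
       and its children are processed recursively. :)
    element { fn:node-name($node) } {
      $node/@*,
      for $child in $node/node() return local:expand($child)
    }
  default return $node
};

let $parsed := search:parse("Apple AND Orange", (), "schema-element(cts:query)")
return
  cts:search(
    doc("/Docs/hamlet.xml")//LINE,
    cts:query(local:expand($parsed))
  )
```

In this sketch only cts:word-query elements are expanded. Queries produced by constraints such as form: or co: (for example element word or value queries) pass through unchanged, so they would need their own case if they should be expanded too.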

Answer 2

I don't think thesaurus expansion is intended for these cases.

Instead, consider using the Search API and expanding the grammar to include variants on the boolean operators:

http://docs.marklogic.com/guide/search-dev/search-api#id_44520

To map form: and co: to the same index, again, consider using the Search API and defining multiple constraints for the same index:

http://docs.marklogic.com/guide/search-dev/search-api#id_95820
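As a rough sketch of the constraint idea, the options below define form and co as two word constraints that both point at the same element. The element name filing-field is purely hypothetical (the question does not say how the documents are structured), and a word constraint is used only as an assumption; a value or range constraint would follow the same pattern.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Two constraint names backed by the same (hypothetical) element. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <constraint name="form">
      <word>
        <element ns="" name="filing-field"/>
      </word>
    </constraint>
    <constraint name="co">
      <word>
        <element ns="" name="filing-field"/>
      </word>
    </constraint>
  </options>
(: The constraint value is quoted so the hyphen in 10-K is not
   mistaken for the grammar's negation prefix. :)
return search:search('co:Apple form:"10-K"', $options)
```

If form and co actually live in different elements, each constraint would simply name its own element instead.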