Unexpected behaviour of "unfiltered" search in MarkLogic


An unfiltered search is returning wrong results.

Please find the sample XML and a description of the problem below.

Sample:

<root>
    <id1>11</id1>
    <elem1>ee1</elem1>
    <ele2>ee2</ele2>
    <entry>
        <volume>10</volume>
        <issue>10</issue>
        <elemEntry>eleme</elemEntry>
    </entry>
    <entry>
        <volume>20</volume>
        <issue>20</issue>
        <elemEntry>eleme</elemEntry>
    </entry>
    <entry>
        <volume>20</volume>
        <issue>10</issue>
        <elemEntry>eleme</elemEntry>
    </entry>
    <entry>
        <volume>10</volume>
        <issue>20</issue>
        <elemEntry>eleme</elemEntry>
    </entry>
</root>

I need to get the <entry> nodes in which a given combination of <volume> and <issue> values occurs together under the same <entry> node (for example, volume 10 and issue 10, or volume 10 and issue 20).

In the example above, I need the entire <entry> node that has <volume> 10 and <issue> 10.

The query should not return the other <entry> nodes, because they do not have the required combination of volume 10 and issue 10.

Below is the cts:search I am running.

cts:search(
    doc("/sample.xml")//entry,
    cts:and-query((
        cts:element-value-query(xs:QName("volume"), "10", ("case-insensitive","unstemmed")),
        cts:element-value-query(xs:QName("issue"), "10", ("case-insensitive","unstemmed"))
    )),
    "unfiltered"
)

Assume the sample XML is stored in the database at the URI /sample.xml.

The query above also returns the other <entry> nodes.

If I run a "filtered" search instead, the same query returns the correct results.

Please tell me why this is happening and what the solution would be.

If there is a better way to get the <entry> nodes that have a given combination of volume and issue, please let me know.


There is 1 best solution below


You should also consider changing your data model. MarkLogic works best when 1 document = 1 row (here, one <entry> per document). Queries are more efficient and you can use smaller indexes if you follow that pattern. The indexes are all oriented around facts-in-document, so an unfiltered search only knows that volume 10 and issue 10 both occur somewhere in the document, not that they occur in the same <entry>. To get subdocument restrictions like this you need to use positions, which can get expensive, or filtered searches, which are even more expensive.
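
As a rough illustration of the "1 document = 1 row" suggestion, the sketch below splits /sample.xml into one document per <entry> and then repeats the question's query against those documents. The /entries/ URI scheme is hypothetical, and whether the unfiltered search is fully accurate still depends on the database's index settings, so treat this as a starting point rather than a definitive fix.

```xquery
xquery version "1.0-ml";

(: Statement 1: write each <entry> of /sample.xml as its own document.
   The "/entries/<id1>-<n>.xml" URI scheme is a hypothetical choice. :)
let $root := fn:doc("/sample.xml")/root
for $entry at $i in $root/entry
return
  xdmp:document-insert(
    fn:concat("/entries/", $root/id1, "-", $i, ".xml"),
    $entry
  )
;

xquery version "1.0-ml";

(: Statement 2: runs as a separate transaction, so the documents
   inserted above are visible. Each document now holds exactly one
   <entry>, so a document-level index match is also an entry-level
   match for this query. :)
cts:search(
  /entry,
  cts:and-query((
    cts:element-value-query(xs:QName("volume"), "10",
      ("case-insensitive", "unstemmed")),
    cts:element-value-query(xs:QName("issue"), "10",
      ("case-insensitive", "unstemmed"))
  )),
  "unfiltered"
)
```

If restructuring the data is not an option, the "filtered" search from the question remains the simpler route. For a single known document, a plain XPath such as doc("/sample.xml")//entry[volume = "10"][issue = "10"] also selects only the matching <entry> nodes.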