XPath contains()/translate() combination not working properly


While searching Stack Overflow I found an XPath technique that allows case-insensitive search. I recently made some changes to my schema, and when I returned to my search, this approach found nothing. Here is my schema:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="system">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="pData"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="pData">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="pNum"/>
        <xs:element ref="sData"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="pNum" type="xs:integer"/>
  <xs:element name="sData">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="sNum"/>
        <xs:element maxOccurs="unbounded" ref="hData"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="sNum" type="xs:NMTOKEN"/>
  <xs:element name="hData">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="hTitle"/>
        <xs:element ref="bData"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="hTitle" type="xs:string"/>
  <xs:element name="bData">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="sitData"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="sitData" >
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element ref="sitTitle"/>
        <xs:element minOccurs="0" ref="sitInfo"/>
        <xs:choice>
          <xs:element ref="bothColumn"/>
          <xs:sequence>
            <xs:element ref="leftColumn"/>
            <xs:element ref="rightColumn"/>
          </xs:sequence>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="sitTitle" type="xs:string"/>
  <xs:element name="sitInfo" type="xs:string"/>
  <xs:element name="bothColumn">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="bothTitle"/>
        <xs:element ref="bothInfo"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="bothTitle" type="xs:string"/>
  <xs:element name="bothInfo" type="xs:string"/>
  <xs:element name="leftColumn">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="leftTitle"/>
        <xs:element ref="leftInfo"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="leftTitle" type="xs:string"/>
  <xs:element name="leftInfo" type="xs:string"/>
  <xs:element name="rightColumn">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="rightTitle"/>
        <xs:element ref="rightInfo"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="rightTitle" type="xs:string"/>
  <xs:element name="rightInfo" type="xs:string"/>
</xs:schema>

So my original search would be:

return $doc/system/pData/sData/hData/bData/sitData[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'),$searchTerm)]

My problem occurs when I search for a term such as "System": nothing comes back even though I know matching data exists, but if I search for "system", every casing of the word comes back. I couldn't find anyone else with this issue. The search is still case-insensitive as long as the term is all lowercase, but I'm perplexed and want to understand what is going on with my XPath search now. I'm using MarkLogic for these XPath calls. Here is a sample XML document that would fit this schema:

<system>
    <pData>
        <pNumber>908957303</pNumber>
        <sData>
            <sNumber>12345</sNumber>
            <hData>
                <hTitle>What to expect</hTitle>
                <bData>
                    <sitData>
                        <sitTitle>A whole lot of fun</sitTitle>
                        <sitInfo> defined fun</sitInfo>
                        <leftColumn>
                            <leftTitle>to the left</leftTitle>
                            <leftInfo> all your clothes </leftInfo>
                        </leftColumn>
                        <rightColumn>
                            <rightTitle>to the right</rightTitle>
                            <rightInfo> right hand turns </rightInfo>
                        </rightColumn>
                    </sitData>
                    <sitData>
                        <sitTitle>we out here</sitTitle>
                        <sitInfo> doing this is painful </sitInfo>
                        <bothColumn>
                            <bothTitle>2001 was a good year</bothTitle>
                            <bothInfo>but it did have some downfalls</bothInfo>
                        </bothColumn>
                    </sitData>
                </bData>
            </hData>
            <hData>
                <hTitle>What to expect</hTitle>
                <bData>
                    <sitData>
                        <sitTitle>A whole lot of fun</sitTitle>
                        <sitInfo> defined fun</sitInfo>
                        <leftColumn>
                            <leftTitle>to the left</leftTitle>
                            <leftInfo> all your clothes </leftInfo>
                        </leftColumn>
                        <rightColumn>
                            <rightTitle>to the right</rightTitle>
                            <rightInfo> right hand turns </rightInfo>
                        </rightColumn>
                    </sitData>
                    <sitData>
                        <sitTitle>we out here</sitTitle>
                        <sitInfo> doing this is painful </sitInfo>
                        <bothColumn>
                            <bothTitle>2001 was a good year</bothTitle>
                            <bothInfo>but it did have some downfalls</bothInfo>
                        </bothColumn>
                    </sitData>
                </bData>
            </hData>
        </sData>
    </pData>
</system>

Best answer

Since you tagged the question MarkLogic, you can use its built-in search functions, which are designed for things like this:

let $doc := ...
let $q := cts:word-query($searchTerm, "case-insensitive")
return $doc//sitData[cts:contains(., $q)]

This assumes you want the match to be on word boundaries. If you really want "foo" to match "food" then you can use wildcards.
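
As for why the original expression stopped matching "System": going by the XPath in the question, translate() lowercases only the node text, while $searchTerm is compared exactly as typed. contains() is case-sensitive under the default collation, so a term with a capital letter can never be found in the all-lowercase string. A minimal sketch of two possible fixes follows. The inline document and the literal search values are hypothetical stand-ins for your real $doc and $searchTerm, and the wildcard variant assumes the "wildcarded" option behaves the same way in your MarkLogic version and configuration.

```xquery
xquery version "1.0-ml";

(: Hypothetical stand-ins for the question's $doc and $searchTerm. :)
let $doc := document {
  <system><pData><sData><hData><bData>
    <sitData><sitTitle>System food</sitTitle></sitData>
  </bData></hData></sData></pData></system>
}
let $searchTerm := "System"

(: Variant 1: keep translate(), but lowercase the search term as well,
   so both sides of contains() are compared in the same case. :)
let $byTranslate :=
  $doc//sitData[contains(
    translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'),
    lower-case($searchTerm))]

(: Variant 2: a case-insensitive, wildcarded word query,
   intended to let "foo*" match words such as "food". :)
let $q := cts:word-query("foo*", ("case-insensitive", "wildcarded"))
let $byWildcard := $doc//sitData[cts:contains(., $q)]

return ($byTranslate, $byWildcard)
```

Variant 1 keeps substring semantics, so "sys" would still match "system". Variant 2 matches on word boundaries unless the pattern itself contains a wildcard.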