How to apply a regex to the text of an element whose type is a mixed complexType

I have a generated TEI XSD that I need to modify. It declares a "w" element, and I need to constrain that element's text content with a regex; let's say the text has to match [0-9].

Here is the element declaration from my XSD:

  <xs:element name="w">
    <xs:annotation>
      <xs:documentation>(word) represents a grammatical (not necessarily orthographic) word. [17.1. Linguistic Segment Categories 17.4.2. Lightweight Linguistic Annotation]</xs:documentation>
    </xs:annotation>
    <xs:complexType mixed="true">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="tei:w"/>
        <xs:element ref="tei:pc"/>
      </xs:choice>
      <xs:attributeGroup ref="tei:att.global.attributes"/>
      <xs:attributeGroup ref="tei:att.segLike.attributes"/>
      <xs:attributeGroup ref="tei:att.typed.attributes"/>
      <xs:attributeGroup ref="tei:att.linguistic.attributes"/>
      <xs:attributeGroup ref="tei:att.notated.attributes"/>
    </xs:complexType>
  </xs:element>

In the example below, the first element should be valid and the second should not.

<w lemma="ttt" type="PRP">5</w>
<w lemma = "pied" type="NOM">pieds</w>

Things I have tried that didn't work:

<xs:assert test="matches($value,'[0-9]')"/>
<xs:assert test="matches(w/text(),'[0-9]')"/>
<xs:assert test="matches($w,'[0-9]')"/>

Thanks for helping.

Best answer

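Add an xs:assert to the complex type, e.g. <xs:assert test="matches(., '^[0-9]$')"/>. Note that xs:assert requires an XSD 1.1 processor; it is not part of XSD 1.0.

<xs:complexType mixed="true">
  <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element ref="tei:w"/>
    <xs:element ref="tei:pc"/>
  </xs:choice>
  <xs:attributeGroup ref="tei:att.global.attributes"/>
  <xs:attributeGroup ref="tei:att.segLike.attributes"/>
  <xs:attributeGroup ref="tei:att.typed.attributes"/>
  <xs:attributeGroup ref="tei:att.linguistic.attributes"/>
  <xs:attributeGroup ref="tei:att.notated.attributes"/>
  <xs:assert test="matches(., '^[0-9]$')"/>
</xs:complexType>

This should suffice to check that the element contains only a single digit. In the test, "." is the w element being validated, so its string value is what gets matched, and the ^ and $ anchors are needed because matches() otherwise succeeds on any substring.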
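
To try the assertion in isolation, a stripped-down schema can help. The following is only a sketch: it drops the tei: namespace and replaces the TEI attribute groups with two plain attributes (lemma and type, taken from the sample instance), and the simple pc declaration is a placeholder, not the real TEI one. The vc:minVersion attribute is an optional hint that the schema expects XSD 1.1.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: a simplified stand-in for the generated TEI schema. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning"
           vc:minVersion="1.1">

  <xs:element name="w">
    <xs:complexType mixed="true">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="w"/>
        <xs:element ref="pc"/>
      </xs:choice>
      <!-- Placeholders for the TEI attribute groups. -->
      <xs:attribute name="lemma" type="xs:string"/>
      <xs:attribute name="type" type="xs:string"/>
      <!-- "." is the w element itself; its string value must be exactly one digit. -->
      <xs:assert test="matches(., '^[0-9]$')"/>
    </xs:complexType>
  </xs:element>

  <!-- Placeholder declaration so that ref="pc" resolves. -->
  <xs:element name="pc" type="xs:string"/>
</xs:schema>
```

Validated with an XSD 1.1 processor (for example Saxon-EE or the XSD 1.1 build of Xerces-J), <w lemma="ttt" type="PRP">5</w> should pass and <w lemma="pied" type="NOM">pieds</w> should be rejected by the assertion. An XSD 1.0 processor will typically refuse the schema because it does not know xs:assert.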
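
As for the attempts in the question: w/text() is evaluated with the w element itself as the context node, so it looks for a child w rather than the element's own text; $w is not a declared variable; and, as far as I understand the XSD 1.1 rules, $value only carries a typed value for simple types and simple content, so it is empty for a mixed complex type. The unanchored '[0-9]' would also accept any text that merely contains a digit. If exactly one digit is too strict, the test can be adapted. The variants below are untested sketches against the TEI schema; pick one, not all three.

```xml
<!-- Alternatives to place inside the w complexType; use only one. -->

<!-- One or more digits instead of exactly one. -->
<xs:assert test="matches(., '^[0-9]+$')"/>

<!-- Same, but ignoring leading and trailing whitespace (e.g. pretty-printed input). -->
<xs:assert test="matches(normalize-space(.), '^[0-9]+$')"/>

<!-- Test only the element's own text nodes, not text inside nested w or pc children. -->
<xs:assert test="matches(string-join(text(), ''), '^[0-9]+$')"/>
```

Keep in mind that w can contain nested w elements, and each nested w is checked against the same assertion, so the first two variants also see the text of the children as part of the parent's string value.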