Extract enumerations with documentation from XSD file in Python

I'm trying to write a function that returns the description of a value defined in an XSD file with a structure like this:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" attributeFormDefault="unqualified" elementFormDefault="qualified">
    <xs:element name="File">
        <xs:annotation>
            <xs:documentation>List of centers</xs:documentation>
        </xs:annotation>
        <xs:complexType>
            <xs:sequence>
                <xs:element maxOccurs="unbounded" name="Register">
                    <xs:annotation>
                        <xs:documentation>Center list registers</xs:documentation>
                    </xs:annotation>
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element name="CenterType">
                                <xs:annotation>
                                    <xs:documentation>Type of center  </xs:documentation>
                                </xs:annotation>
                                <xs:simpleType>
                                    <xs:restriction base="xs:int">
                                        <xs:totalDigits value="1"/>
                                        <xs:enumeration value="1">
                                            <xs:annotation>
                                                <xs:documentation>Own center</xs:documentation>
                                            </xs:annotation>
                                        </xs:enumeration>
                                        <xs:enumeration value="2">
                                            <xs:annotation>
                                                <xs:documentation>External center</xs:documentation>
                                            </xs:annotation>
                                        </xs:enumeration>
                                        <xs:enumeration value="3">
                                            <xs:annotation>
                                                <xs:documentation>Associated center</xs:documentation>
                                            </xs:annotation>
                                        </xs:enumeration>
                                        <xs:enumeration value="4">
                                            <xs:annotation>
                                                <xs:documentation>Other</xs:documentation>
                                            </xs:annotation>
                                        </xs:enumeration>
                                    </xs:restriction>
                                </xs:simpleType>
                            </xs:element>
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

For example, if I call

get_value("CenterType", "1")

my function should return "Own center".

I'm using Python 3.8 with the xmlschema package.

I wrote this snippet, which prints the tag of every element:

import xmlschema

xsd_xml = xmlschema.XMLSchema(xsd_file)
fichero = xsd_xml.elements["File"][0]

for elem in fichero:
    print(elem.tag)

But I need to access the enumeration and documentation fields. How can I extract this data?

There are 2 best solutions below

BEST ANSWER

In the end, I solved my problem by parsing the XSD as plain XML with lxml and searching it with the XML Schema namespace:

from lxml import etree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

def get_value(self, field: str, code: str, file: str) -> str:
    # Parse the XSD as ordinary XML and locate the element declaration by name.
    xsd_xml = ET.parse(file)
    element = xsd_xml.find(f".//{XS}element[@name='{field}']")
    if element is None:
        return "N/A"

    # Find the enumeration facet with the requested value inside that element.
    enumeration = element.find(f".//{XS}enumeration[@value='{code}']")
    if enumeration is None:
        return "N/A"

    # Return the documentation text, or "N/A" if the enumeration has none.
    documentation = enumeration.find(f".//{XS}documentation")
    if documentation is None or documentation.text is None:
        return "N/A"
    return documentation.text
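
If lxml is not available, the same lookup can be sketched with the standard library's xml.etree.ElementTree. This is only an illustration under a few assumptions: the helper name enumeration_docs and the trimmed-down inline schema are made up for the example and are not part of the answer above. It collects every enumeration value of a named element into a dict, so repeated lookups do not need to search the tree again.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Trimmed-down copy of the schema from the question (illustrative only).
SAMPLE_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="CenterType">
    <xs:simpleType>
      <xs:restriction base="xs:int">
        <xs:enumeration value="1">
          <xs:annotation><xs:documentation>Own center</xs:documentation></xs:annotation>
        </xs:enumeration>
        <xs:enumeration value="2">
          <xs:annotation><xs:documentation>External center</xs:documentation></xs:annotation>
        </xs:enumeration>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
"""


def enumeration_docs(root, field):
    """Map each enumeration value of element `field` to its documentation text."""
    # Hypothetical helper: find the element declaration by its name attribute.
    element = root.find(f".//{XS}element[@name='{field}']")
    if element is None:
        return {}
    docs = {}
    # Walk every xs:enumeration nested under that element.
    for enum in element.iter(f"{XS}enumeration"):
        doc = enum.find(f".//{XS}documentation")
        text = doc.text if doc is not None and doc.text else ""
        docs[enum.get("value")] = text.strip()
    return docs


root = ET.fromstring(SAMPLE_XSD)
center_types = enumeration_docs(root, "CenterType")
print(center_types.get("1", "N/A"))
print(center_types.get("9", "N/A"))
```

With the sample schema, the first lookup finds the documentation for value "1" and the second falls back to "N/A" because no enumeration with value "9" exists. For a file on disk, ET.parse(path).getroot() could be passed in place of root.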

Another option is the untangle library, which exposes the parsed XSD as nested attributes:

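import untangle

xsd_file = "C:\\code\\python\\xsd\\test.xsd"
obj = untangle.parse(xsd_file)

# Walk down to the list of xs:enumeration nodes (untangle turns "xs:" into "xs_").
res = obj.xs_schema.xs_element.xs_complexType.xs_sequence.xs_element.xs_complexType.xs_sequence.xs_element.xs_simpleType.xs_restriction.xs_enumeration

# Print each enumeration value with its documentation text.
for enum in res:
    print(enum["value"], enum.xs_annotation.xs_documentation.cdata)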