One can use lxml to validate XML files against a given XSD schema.
Is there a way to apply this validation in a less strict sense, ignoring all elements which contain special expressions?
Consider the following example: Say, I have an xml_file:
<foo>99</foo>
<foo>{{var1}}</foo>
<foo>{{var2}}</foo>
<foo>999</foo>
Now I run a program on this file, which replaces the {{...}} expressions and produces a new file, xml_file_new:
<foo>99</foo>
<foo>23</foo>
<foo>42</foo>
<foo>999</foo>
So far, I can use lxml to validate the new XML file as follows:
from lxml import etree
xml_root = etree.parse(xml_file_new)
xsd_root = etree.parse(xsd_file)
schema = etree.XMLSchema(xsd_root)
schema.validate(xml_root)
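For reference, here is a self-contained variant of the snippet above that also prints why a document was rejected. The inline schema and the `<root>` wrapper element are assumptions made for illustration (the real xsd_file is not shown in the question); `error_log` is the log that lxml validators fill during validation.

```python
from lxml import etree

# Hypothetical inline schema standing in for xsd_file: <foo> must hold integers.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="foo" type="xs:integer" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = etree.XMLSchema(etree.fromstring(XSD))

# Assumed <root> wrapper, so that the example document is well-formed.
doc = etree.fromstring(b"<root><foo>99</foo><foo>{{var1}}</foo></root>")

# validate() only returns a bool; the details end up in schema.error_log.
is_valid = schema.validate(doc)
errors = [(entry.line, entry.message) for entry in schema.error_log]

print(is_valid)
for line, message in errors:
    print(line, message)
```

With a strict integer schema like this one, the unreplaced {{var1}} placeholder is what makes validation fail.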
The relevant point in my example is that the schema restricts the <foo> contents to integers.
It would not be possible to apply the schema to the old xml_file in advance. However, as my program does some other expensive tasks, I would like to do exactly that while ignoring all lines containing any {{...}} expressions:
<foo>99</foo> <!-- should be checked -->
<foo>{{var1}}</foo> <!-- should be ignored -->
<foo>{{var2}}</foo> <!-- should be ignored -->
<foo>999</foo> <!-- should be checked -->
EDIT: Possible solution approach: One idea would be to define two schemas
- a strict second schema for the new file, allowing only integers
- a relaxed schema for the old file, allowing both integers and arbitrary strings with {{..}} expressions
However, to avoid the redundant task of keeping two schemas synchronized, one would need a way to generate the relaxed schema from the strict one automatically. This sounds quite promising, as both schemas have the same structure and differ only in the restrictions on certain element contents. Is there a simple concept offered by XSD that allows one to just "inherit" from one schema and then "attach" additional relaxations to individual elements?
To answer the edited question: it is possible to compose schemas with the xs:include (and xs:import) mechanism. This way, you can declare common parts in a common schema for reuse, and use dedicated schemas for specialized type definitions:
- The common schema describes the structure. Note that it uses FooType, but does not define it.
- The relaxed schema validates the file before the replacement. It includes the components from the common schema and defines a relaxed FooType.
- The strict schema validates the file after the replacement. It also includes the common schema, but defines the strict version of FooType.
For completeness' sake, there are also alternative ways to do this, for example with xs:redefine (XSD 1.0) or xs:override (XSD 1.1). But these have more complex semantics and personally, I try to avoid them.