I'm fairly new to Python and coding in general, so sorry if this is a very simple question. I'm working with the Python package xmlschema to validate some very large XML files. When I use the following code to get the error messages, I only get the paths for the errors. This is okay when there are only 5-6 different "knude" elements, but I have files with 200+ "knude" elements, which makes this information not very useful. I would therefore like to get the line number so I can go to the XML file and correct it.
Code:
import xmlschema

def get_validation_errors(xml_file, xsd_file):
    schema = xmlschema.XMLSchema(xsd_file)
    validation_error_iterator = schema.iter_errors(xml_file)
    errors = []
    for idx, validation_error in enumerate(validation_error_iterator, start=1):
        # Keep the full error text and print a one-line summary per error
        errors.append(str(validation_error))
        print(f'[{idx}] path: {validation_error.path} | reason: {validation_error.reason} | message: {validation_error.message}')
    return errors
Results:
[1] path: /KnudeGroup/Knude[5]/StatusKode | reason: value must be one of [1, 2, 3, 4, 8] | message: failed validating 0 with XsdEnumerationFacets([1, 2, 3, 4, 8])
I have already tried reading the documentation and searched Google and Stack Overflow for an answer, but could not find one.
Load the XML instance document with lxml; that way you have a sourceline property on each validation error (https://github.com/sissaschool/xmlschema/blob/v2.2.3/xmlschema/validators/exceptions.py#L90). Printing that property alongside the path then outputs a line number, e.g.
sourceline: 2; path: /root/item[1] | reason: invalid value 'a' for xs:decimal | message: failed validating 'a' with XsdAtomicBuiltin(name='xs:decimal')