I wish to validate an XML file (which has no namespace, nor schema declaration in the XML file) against an XSD file.
The XML file looks like:
<?xml version="1.0" encoding="UTF-8"?>
<report> ... </report>
And the XSD file is:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="report">
...
</xs:element>
</xs:schema>
The code is similar to the following:
SchemaFactory xsdFac = SchemaFactory.newInstance(
        XMLConstants.W3C_XML_SCHEMA_NS_URI);
xsdFac.setErrorHandler( ... );
Schema xsd = xsdFac.newSchema(xsdUrl); // xsdUrl works
Validator xsdValidator = xsd.newValidator();
xsdValidator.validate(new DOMSource(doc)); // doc is correct
I have spent three hours on this. It works locally on my PC and it used to work on the server (I think), but now it does not work on the server. I tried to identify differences between my PC and the server, but they both use the same JARs, and so on.
Anyway, I have identified the following difference. I was not passing a class name to the SchemaFactory.newInstance method (almost certainly a mistake). When I printed out the class name of xsdFac, I saw that it differed between my PC and the server. I don't think I want to get into why that was the case (I have no idea); I think it's better to find one that works and specify it explicitly.
- On my PC (works) it was com.sun.org.apache.xerces.internal.jaxp.validation.XMLSchemaFactory. If I explicitly specify that on the newInstance call, then it works on the PC and on the server.
- On the server (didn't work) it was org.apache.xerces.jaxp.validation.XMLSchemaFactory. If I explicitly set that on the newInstance call, then I get the same error on both the PC and the server.
Notice the com.sun at the start of the one that works.
So at least I have a solution, that's good. But I think I shouldn't be using com.sun classes explicitly in my code?
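For illustration, naming the factory class explicitly can be sketched with the three-argument SchemaFactory.newInstance overload (available since Java 6). The two class names are the ones quoted above; the class ExplicitSchemaFactory and the helper tryFactory are hypothetical names for this sketch, and passing null as the class loader (which lets JAXP use the current thread's context class loader) is an assumption about what is wanted here.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

public class ExplicitSchemaFactory {
    // Class names as quoted in the question; whether each can be loaded
    // depends on the JDK and on the JARs visible at run time.
    static final String JDK_INTERNAL =
            "com.sun.org.apache.xerces.internal.jaxp.validation.XMLSchemaFactory";
    static final String XERCES_JAR =
            "org.apache.xerces.jaxp.validation.XMLSchemaFactory";

    /** Returns the named factory, or null if it cannot be loaded or instantiated here. */
    static SchemaFactory tryFactory(String className) {
        try {
            // A null class loader means the context class loader is used.
            return SchemaFactory.newInstance(
                    XMLConstants.W3C_XML_SCHEMA_NS_URI, className, null);
        } catch (IllegalArgumentException e) {
            // Thrown when the class cannot be loaded or instantiated,
            // or does not support the requested schema language.
            return null;
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] { JDK_INTERNAL, XERCES_JAR }) {
            SchemaFactory factory = tryFactory(name);
            System.out.println(name + " -> "
                    + (factory == null ? "not available" : "available"));
        }
    }
}
```

Running this on both machines would at least show which of the two implementations each environment can actually instantiate.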
Other information:
- The error I get is:
org.xml.sax.SAXParseException: cvc-elt.1: Cannot find the declaration of element 'report'
- This is the same error I get when, on the working version, I change the element name to something else, something that doesn't exist in the schema. So I think it just means, "I understand your schema, and I understand the XML, and you've used an element that isn't declared anywhere in the schema."
- xercesImpl-2.9.1.jar is in my project (both on the server and on the PC).
- This JAR contains an XMLSchemaFactory.class under the package specified in the non-working version (which leads me to believe it's there, findable, and should work; after all, I don't get any exceptions about classes not being found).
- The doc object I am parsing is, in both cases, an org.apache.xerces.dom.DeferredDocumentImpl.
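The reading of the error given in the list above (an element not declared in the schema) can be checked in isolation with a small sketch. Everything here is a stand-in: the inline schema, the test documents, and the names UndeclaredElementDemo and validate are hypothetical, and the document is parsed namespace-aware on the assumption that this is the usual setting for a DOM that will be schema-validated.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class UndeclaredElementDemo {
    // Stand-in schema: one no-namespace root element named "report".
    static final String XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name='report' type='xs:string'/>"
          + "</xs:schema>";

    /** Returns null if the XML validates, otherwise the validator's error message. */
    static String validate(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // assumption: namespace-aware DOM
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(
                    new StreamSource(new StringReader(XSD)));
            try {
                // With no error handler set, validation errors are thrown.
                schema.newValidator().validate(new DOMSource(doc));
                return null;
            } catch (SAXException e) {
                return e.getMessage();
            }
        } catch (Exception e) {
            throw new IllegalStateException("could not set up validation", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("declared root:   " + validate("<report>ok</report>"));
        System.out.println("undeclared root: " + validate("<other/>"));
    }
}
```

If the first call reports no error and the second reports a cvc-elt.1 message, the schema and parser agree with that interpretation; the open question is then why the declared element is not being matched on the server.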
So, my question is: I would like to use the explicit Xerces implementation of the schema factory (I think), as I am including the JAR, I have a Document object from Xerces, and basically I am using the Xerces validator anyway (in the working case of com.sun).
Has anyone had a similar experience?
I would hazard a guess that you don't, in fact, have Xerces on the classpath in the failing case.
Setting -Djaxp.debug will tell you for sure which implementation is active when you call the JAXP APIs.
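As a complement to the debug flag, the active implementation and where it was loaded from can also be printed from code. This is only a sketch: WhichSchemaFactory and describe are hypothetical names, and giving the property a value (for example running with -Djaxp.debug=1) is an assumption about how the flag is switched on.

```java
import java.security.CodeSource;
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

public class WhichSchemaFactory {
    /** Describes an object's class, the location it was loaded from, and its class loader. */
    static String describe(Object o) {
        Class<?> c = o.getClass();
        CodeSource src = c.getProtectionDomain().getCodeSource();
        // A missing code source usually indicates a class shipped with the JDK.
        String location = (src == null || src.getLocation() == null)
                ? "(no code source, probably the JDK itself)"
                : src.getLocation().toString();
        return c.getName() + " loaded from " + location
                + " by " + c.getClassLoader();
    }

    public static void main(String[] args) {
        // Same lookup as in the question: no explicit class name.
        SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        System.out.println(describe(factory));
    }
}
```

Printing the same description for the doc object would show whether the DOM and the schema factory come from the same JAR and class loader.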
If this is so, and it isn't just the result of a mistake in your classpath, you can indeed specifically 'new' the Xerces classes instead of calling the JAXP generic factory methods. I've done it.
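A minimal sketch of bypassing the JAXP lookup follows. With Xerces on the compile classpath this would simply be new org.apache.xerces.jaxp.validation.XMLSchemaFactory(); reflection is used here only so the sketch compiles without the JAR. The names DirectXercesFactory and create are hypothetical, and falling back to the generic lookup when Xerces is not visible is an assumption, not something the approach requires.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

public class DirectXercesFactory {
    static final String XERCES_FACTORY =
            "org.apache.xerces.jaxp.validation.XMLSchemaFactory";

    /**
     * Instantiates the Xerces factory directly if this class's loader can
     * see it; otherwise falls back to the generic JAXP lookup.
     */
    static SchemaFactory create() {
        try {
            Class<?> c = Class.forName(XERCES_FACTORY, true,
                    DirectXercesFactory.class.getClassLoader());
            // Equivalent to calling the public no-argument constructor with 'new'.
            return (SchemaFactory) c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | LinkageError | ClassCastException e) {
            // Xerces is missing or was loaded incompatibly: use the JAXP default.
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        }
    }

    public static void main(String[] args) {
        System.out.println("Using " + create().getClass().getName());
    }
}
```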
Edit: comments indicate that this is a webapp. The JAXP classes are one of those things that are painful to put in a webapp, given the class loading rules. You should at least try putting Xerces in the 'system' classpath of Tomcat and see what happens.