DTD parsing with axiom


I'm trying to use axiom 1.2.22 with woodstox 6.2.6 to parse an XML document with a doctype. (I'm using OpenJDK 11, but that shouldn't make any difference.) I'm getting the same error that was mentioned in "How to ignore DTD parsing in Apache's AXIOM":

Cannot create OMDocType because the XMLStreamReader doesn't support the DTDReader extension

According to https://issues.apache.org/jira/browse/AXIOM-475 that was supposed to be fixed with axiom 1.2.16, but it seems the bug is back again.

Example snippet:

    InputStream is = Test.class.getResourceAsStream("xml-with-dtd.xml");
    OMXMLParserWrapper builder = OMXMLBuilderFactory.createStAXOMBuilder(XMLInputFactory.newFactory().createXMLStreamReader(is));
    OMElement result = builder.getDocumentElement();

Am I using incompatible versions? I also tried woodstox 5.0.0, which throws the same error, and I verified that XMLInputFactory.newFactory() actually returns the woodstox XMLInputFactory. These are the Maven dependencies that I use (I've omitted some exclusions related to logging and duplicated classes):

  <dependency>
    <groupId>com.fasterxml.woodstox</groupId>
    <artifactId>woodstox-core</artifactId>
    <version>6.2.6</version>
  </dependency>
  <dependency>
    <groupId>org.codehaus.woodstox</groupId>
    <artifactId>stax2-api</artifactId>
    <version>4.2.1</version>
  </dependency>
  <dependency>
    <groupId>org.apache.ws.commons.axiom</groupId>
    <artifactId>axiom-impl</artifactId>
    <version>1.2.22</version>
  </dependency>
  <dependency>
    <groupId>org.apache.ws.commons.axiom</groupId>
    <artifactId>axiom-api</artifactId>
    <version>1.2.22</version>
  </dependency>
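
For reference, one way to check which StAX implementation XMLInputFactory.newFactory() resolves to is to print the class name of the returned factory. This is only a minimal sketch using the standard javax.xml.stream API. The class name FactoryCheck and the helper factoryClassName are made up for illustration.

```java
import javax.xml.stream.XMLInputFactory;

public class FactoryCheck {
    /** Returns the class name of the factory that newFactory() resolves to. */
    static String factoryClassName() {
        return XMLInputFactory.newFactory().getClass().getName();
    }

    public static void main(String[] args) {
        // With woodstox-core on the classpath this should name the woodstox
        // factory (com.ctc.wstx.stax.WstxInputFactory, as seen in the stack
        // trace further down); without it, the JDK's built-in factory is named.
        System.out.println(factoryClassName());
    }
}
```

If the printed name does not start with com.ctc.wstx, woodstox is not the implementation being picked up.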

Update: It looks a lot like the axiom code tries to determine which DTDReader class to use from a configuration property. Unfortunately, setting the property DTDReader.PROPERTY on the XMLInputFactory to any value results in the following stack trace:

Exception in thread "main" java.lang.IllegalArgumentException: Unrecognized property 'org.apache.axiom.ext.stax.DTDReader'
    at com.ctc.wstx.api.CommonConfig.reportUnknownProperty(CommonConfig.java:167)
    at com.ctc.wstx.api.CommonConfig.setProperty(CommonConfig.java:158)
    at com.ctc.wstx.api.ReaderConfig.setProperty(ReaderConfig.java:35)
    at com.ctc.wstx.stax.WstxInputFactory.setProperty(WstxInputFactory.java:400)

Accepted answer by hwbllmnn:

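I'm not sure why it didn't work when I tried it with woodstox 5, but this little patch against axiom 1.2.22 solves the problem, at least for woodstox 6.2.6. It makes StAXDialectDetector handle woodstox major version 6 the same way as version 5, instead of falling through to the default case, which returns null: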

Index: axiom-api/src/main/java/org/apache/axiom/util/stax/dialect/StAXDialectDetector.java
===================================================================
--- axiom-api/src/main/java/org/apache/axiom/util/stax/dialect/StAXDialectDetector.java (revision 1891409)
+++ axiom-api/src/main/java/org/apache/axiom/util/stax/dialect/StAXDialectDetector.java (working copy)
@@ -274,6 +274,7 @@
                     return new Woodstox4Dialect(version.getComponent(1) == 0 && version.getComponent(2) < 11
                             || version.getComponent(1) == 1 && version.getComponent(2) < 3);
                 case 5:
+                case 6:
                     return new Woodstox4Dialect(false);
                 default:
                     return null;
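
To illustrate the effect of that one added line, here is a simplified, self-contained model of the switch. It is not axiom's actual code: the class DialectSwitchSketch, the methods unpatched and patched, and the use of a plain string in place of a dialect object are all assumptions made for this sketch, and only the branches visible in the diff are modelled.

```java
public class DialectSwitchSketch {
    // Models the switch before the patch: major version 6 has no case,
    // so it ends up in the default branch and no dialect is returned.
    static String unpatched(int majorVersion) {
        switch (majorVersion) {
            case 5:
                return "Woodstox4Dialect";
            default:
                return null;
        }
    }

    // Models the switch after the patch: case 6 falls through to the
    // same return statement as case 5.
    static String patched(int majorVersion) {
        switch (majorVersion) {
            case 5:
            case 6:
                return "Woodstox4Dialect";
            default:
                return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("unpatched, 6.x: " + unpatched(6));
        System.out.println("patched,   6.x: " + patched(6));
    }
}
```

In this model, only the patched variant returns a dialect for a 6.x version, which matches the observation that the patch makes woodstox 6.2.6 work.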

Update:

Version 1.3.0 of axiom also fixes the problem.