getting stackoverflowerror while converting org.w3c.dom.Document to org.dom4j.Document

1.1k Views Asked by At

I am getting stackoverflowerror while conveting org.w3c.dom.Document to org.dom4j.Document

Code :

public static org.dom4j.Document getDom4jDocument(Document w3cDocument)
    {
        //System.out.println("XMLUtility : Inside getDom4jDocument()");
        org.dom4j.Document dom4jDocument  = null;
        DOMReader xmlReader  = null;
        try{
            //System.out.println("Before conversion of w3cdoc to dom4jdoc");
            xmlReader = new DOMReader();            
            dom4jDocument = xmlReader.read(w3cDocument);
            //System.out.println("Conversion complete");
        }catch(Exception e){
            System.out.println("General Exception :- "+e.getMessage());
        }
        //System.out.println("XMLUtility : getDom4jDocument() Finished");
        return dom4jDocument;   
    } 

log :

java.lang.StackOverflowError
    at java.lang.String.indexOf(String.java:1564)
    at java.lang.String.indexOf(String.java:1546)
    at org.dom4j.tree.NamespaceStack.getQName(NamespaceStack.java:158)
    at org.dom4j.io.DOMReader.readElement(DOMReader.java:184)
    at org.dom4j.io.DOMReader.readTree(DOMReader.java:93)
    at org.dom4j.io.DOMReader.readElement(DOMReader.java:226)
    at org.dom4j.io.DOMReader.readTree(DOMReader.java:93)
    at org.dom4j.io.DOMReader.readElement(DOMReader.java:226)

Actually i want to convert XML to string by using org.dom4j.Document's asXML method. Is this conversion possible without converting org.w3c.dom.Document to org.dom4j.Document ? How ?

2

There are 2 best solutions below

0
On BEST ANSWER

when handling a heavy file, you shouldn't use a DOM reader, but a SAX one. I assume your goal is to output your document to a string.

Here you can find some differences between SAX and DOM (source) :

SAX

  • Parses node by node
  • Doesn’t store the XML in memory
  • We cant insert or delete a node
  • SAX is an event based parser
  • SAX is a Simple API for XML
  • doesn’t preserve comments
  • SAX generally runs a little faster than DOM

DOM

  • Stores the entire XML document into memory before processing
  • Occupies more memory
  • We can insert or delete nodes
  • Traverse in any direction.
  • DOM is a tree model parser
  • Document Object Model (DOM) API
  • Preserves comments
  • SAX generally runs a little faster than DOM

You don't need to produce a model which will need a lot of memory space. You only need to crawl through nodes to output them one by one.

Here, you will find some code to start with ; then you should implement a tree traversal algorithm.

Regards

0
On

Take a look at java.lang.StackOverflowError in dom parser. Apparently trying to load a huge XML file into a String can result in a StackoverflowException. I think it's because the parser uses regex's to find the start and end of tags, which involves recursive calls for long Strings as described in java.lang.StackOverflowError while using a RegEx to Parse big strings.

You can try and split up the XML document and parse the sections separately and see if that helps.