Getting XML version node using VTD-XML in Java


I'm parsing an XML document with the VTD-XML library and need to get the version from the document's XML declaration.

My document looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<rootNode>
     <items>
        <item>
            <name>XXX</name>
            <lastName>YYY</lastName>
            <number>1234</number>
        </item>
        <item>
            <name>AAA</name>
            <lastName>BBB</lastName>
            <number>5678</number>
        </item>
        <item>
            <name>CCC</name>
            <lastName>DDD</lastName>
            <number>9012</number>
        </item>
     </items>
</rootNode>

I need to get this line.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>

How can I do it?


There are 2 best solutions below


If you just need the token "1.0", find the index of the TOKEN_DEC_ATTR_NAME token whose text is "version"; the value is the next token (index + 1). Just iterate through the token indices, check each token's type, and test the token's text with the built-in matching methods of the VTDNav class.


I don't know how to do it in VTD-XML, but the following explains how to do it with DOM. (For other readers: if DOM is not an option, please ignore this answer.)

Note that the version in the XML declaration refers to the version of the XML standard the document follows. To avoid confusion: it is not a revision number for the content of the document.

That being said, you can use the Document#getXmlVersion method. Similarly, there are getXmlEncoding() and getXmlStandalone().

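import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = factory.newDocumentBuilder();
Document document = documentBuilder.parse(new File("myFile.xml"));

// Values taken from the XML declaration
String version = document.getXmlVersion();
String encoding = document.getXmlEncoding();
boolean standalone = document.getXmlStandalone();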

You could print them like this:

System.out.println("<?xml version=\"" + version + "\" encoding=\"" + encoding + "\" standalone=\"" + (standalone? "yes" : "no") + "\" ?>");
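
To try the DOM approach end to end without a file on disk, here is a minimal, self-contained sketch that parses a document from a string and rebuilds the declaration. The class name XmlDeclarationDemo and the helper rebuildDeclaration are hypothetical names chosen for illustration, and the encoding is only emitted when the parser reports one.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlDeclarationDemo {

    // Hypothetical helper: parses the XML text and rebuilds its declaration
    // from the values the DOM Document exposes.
    static String rebuildDeclaration(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            StringBuilder sb = new StringBuilder("<?xml version=\"")
                    .append(document.getXmlVersion()).append('"');
            // getXmlEncoding() returns null when no encoding was declared
            if (document.getXmlEncoding() != null) {
                sb.append(" encoding=\"").append(document.getXmlEncoding()).append('"');
            }
            sb.append(" standalone=\"")
              .append(document.getXmlStandalone() ? "yes" : "no")
              .append("\"?>");
            return sb.toString();
        } catch (Exception e) {
            // Wrap checked parser/IO exceptions to keep the sketch short
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>"
                + "<rootNode><items/></rootNode>";
        System.out.println(rebuildDeclaration(xml));
    }
}
```

Note that this reconstructs the declaration from parsed values; it does not return the original bytes, so whitespace and quoting style inside the declaration are not preserved.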

EDIT:

In answer to the question: "How to detect if version was specified or not".

The version is stored in a non-public field of the document implementation class, and internally it is null when the version is not specified. In the org.apache.xerces.dom implementation, the getter provides the default value:

public String getXmlVersion() {
    return (version == null)?"1.0":version;
}

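Unfortunately there is no getter that returns the nullable form of the version field. But you could access it using reflection (this depends on implementation internals, so it may break with other parsers or newer Java versions):

import java.lang.reflect.Field;

// The field may be declared in a superclass of the runtime document class, so walk up.
Field versionField = null;
for (Class<?> c = document.getClass(); c != null && versionField == null; c = c.getSuperclass()) {
    try { versionField = c.getDeclaredField("version"); } catch (NoSuchFieldException e) { /* try superclass */ }
}
// versionField stays null if the implementation has no such field
versionField.setAccessible(true); // may be refused for JDK-internal classes on recent Java versions
String rawVersion = (String) versionField.get(document);
if (rawVersion == null) {
    System.out.println("version was not specified");
}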
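
If reflection is not acceptable, one possible alternative is the StAX API from the JDK instead of DOM (this is a different technique from the one above, and not VTD-XML). The Javadoc of XMLStreamReader#getVersion() documents that it returns null if no version was declared. A minimal sketch, where the class name DeclaredVersionDemo and the helper declaredVersion are hypothetical:

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class DeclaredVersionDemo {

    // Hypothetical helper: returns the version written in the XML declaration.
    // Per the StAX Javadoc, getVersion() yields null if none was declared.
    static String declaredVersion(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                // A freshly created reader is positioned at START_DOCUMENT
                return reader.getVersion();
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
    }

    public static void main(String[] args) {
        String withDeclaration = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><rootNode/>";
        String withoutDeclaration = "<rootNode/>";

        System.out.println("declared: " + declaredVersion(withDeclaration));
        String missing = declaredVersion(withoutDeclaration);
        System.out.println(missing == null ? "version was not specified" : "declared: " + missing);
    }
}
```

The exact behavior for a missing declaration depends on the StAX implementation on the classpath, so it is worth checking against the parser you actually use.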