Is there a way to remove namespaces from XML tags using C++/Boost or some other library?


Is there a way in C++, using TinyXML and TinyXPath, so that a string containing:

<ns:abcd>
  <ns:defg>
    <ns:hijk>
    </ns:hijk>
  </ns:defg>
</ns:abcd>

transforms to

<abcd>
  <defg>
    <hijk>
    </hijk>
  </defg>
</abcd>

EDIT:

I was using Tinyxml and Tinyxpath.

My workflow was :

a) Create a dom-tree using TinyXML

b) Pass dom-tree to Tinyxpath for xpath evaluations

To add namespace removal, I used the following function:

void  RemoveAllNamespaces(TiXmlNode* node)
{
    TiXmlElement* element = node->ToElement();
    if(!element){
        return; 
    }
    std::string elementName = element->Value(); 
    std::string::size_type idx = elementName.rfind(':');
    if(idx != std::string::npos)
    { 
        element->SetValue(elementName.substr( idx + 1).c_str());
    }
    TiXmlNode* child = element->IterateChildren(NULL);
    while(child)
    {
        RemoveAllNamespaces(child);
        child = element->IterateChildren(child);
    }
}

So the workflow changed to:

a) Create a dom-tree using TinyXML

b) Remove namespace from the domtree using RemoveAllNamespaces(domtree.Root() )

c) Pass modified-dom-tree to Tinyxpath for xpath evaluations

There are 2 answers below.

Best answer:

Ok, in response to the edited question, a few notes:

  • that doesn't really handle namespaces (consider default namespaces declared as xmlns="http://blabla.com/uri"), but that's actually a TinyXml limitation (eek):

    Further, TinyXML has no facility for handling XML namespaces. Qualified element or attribute names retain their prefixes, as TinyXML makes no effort to match the prefixes with namespaces.

  • it doesn't treat attributes (which can also be qualified)

Here's what I'd do as a quick & dirty fix (assumes TIXML_USE_STL, which you were presumably already using):

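#include <cassert>
#include <string>
#include "tinyxml.h" // build with TIXML_USE_STL defined

// Strip everything up to and including the last ':'. Names without a prefix
// are returned unchanged, because npos + 1 wraps around to 0.
static inline std::string RemoveNs(std::string const& xmlName)
{
    return xmlName.substr(xmlName.find_last_of(":") + 1);
}

void RemoveAllNamespaces(TiXmlNode* node)
{
    assert(node);

    if (auto element = node->ToElement()) {
        element->SetValue(RemoveNs(element->Value()));

        // Attributes can be qualified too. Note that this also renames
        // xmlns:prefix declarations to plain "prefix" attributes.
        for (auto attr = element->FirstAttribute(); attr; attr = attr->Next())
            attr->SetName(RemoveNs(attr->Name()));

        // Recurse into the children; non-element nodes fail ToElement() and are skipped.
        for (auto child = element->IterateChildren(nullptr); child; child = element->IterateChildren(child))
            RemoveAllNamespaces(child);
    }
}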
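
It may not be obvious why RemoveNs needs no special case for names without a prefix. The sketch below is a stand-alone copy of the same expression that can be tried without TinyXML; the name StripPrefix is made up for this illustration and is not part of any library.

```cpp
#include <string>

// Hypothetical stand-alone copy of the prefix-stripping expression above.
// find_last_of returns std::string::npos when the name contains no ':'.
// npos is the largest size_type value, so npos + 1 wraps around to 0 and
// substr(0) hands back the whole name unchanged.
inline std::string StripPrefix(const std::string& qualifiedName)
{
    return qualifiedName.substr(qualifiedName.find_last_of(':') + 1);
}
```

So StripPrefix("ns:abcd") gives "abcd", and StripPrefix("abcd") gives "abcd" as well. A side effect worth knowing: a declaration name such as "xmlns:ns" is reduced to "ns", because the helper cannot tell declarations from ordinary qualified names.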

On my MSVC test it prints

<?xml version="1.0" standalone="no"?>
<!-- Our: to do list data -->
<ToDo a="http://example.org/uri1">
  <!-- Do I need: a secure PDA? -->
  <Item priority="1" distance="close">Go to the<bold>Toy store!</bold></Item>
  <Item priority="2" distance="none">Do bills</Item>
  <Item priority="2" distance="far &amp; back">Look for Evil Dinosaurs!</Item>
</ToDo>
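
The a="http://example.org/uri1" attribute in that output looks like what is left of an xmlns:a declaration after its prefix was stripped. If that is unwanted, one option is to drop namespace declarations instead of renaming them. Below is a minimal sketch of that filtering on a plain list of (name, value) pairs, so it runs without TinyXML; the Attribute alias and all three function names are assumptions made up for this example.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an element's attribute list: (name, value) pairs.
using Attribute = std::pair<std::string, std::string>;

// True for a default declaration ("xmlns") or a prefixed one ("xmlns:foo").
inline bool IsNamespaceDeclaration(const std::string& name)
{
    return name == "xmlns" || name.compare(0, 6, "xmlns:") == 0;
}

// Everything after the last ':'; unprefixed names come back unchanged.
inline std::string LocalName(const std::string& name)
{
    return name.substr(name.find_last_of(':') + 1);
}

// Drop namespace declarations and strip prefixes from the remaining names.
inline std::vector<Attribute> StripAttributeNamespaces(const std::vector<Attribute>& attrs)
{
    std::vector<Attribute> result;
    for (const auto& attr : attrs) {
        if (IsNamespaceDeclaration(attr.first))
            continue; // the declaration is meaningless once prefixes are gone
        result.emplace_back(LocalName(attr.first), attr.second);
    }
    return result;
}
```

In the TinyXML version, the same idea would presumably mean collecting the names for which IsNamespaceDeclaration is true and removing them with TiXmlElement::RemoveAttribute after the loop, rather than calling SetName on them. I have not tested that variant.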

Second answer:

I would employ an XSLT transform here:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output omit-xml-declaration="yes" indent="yes" />

    <xsl:template match="*">
        <xsl:element name="{name()}" namespace=""><xsl:apply-templates select="node()|@*"/></xsl:element>
    </xsl:template>
    <xsl:template match="@*">
        <xsl:attribute name="{name()}" namespace=""><xsl:value-of select="."/></xsl:attribute>
    </xsl:template>
</xsl:stylesheet>

Note that on xsl:element and xsl:attribute, namespace="" clears the namespace. You can also specify a different namespace URI there instead.

With input.xml like

<?xml version="1.0"?>
<ns:abcd xmlns:ns="http://bla/bla">
  <ns:defg attr="value">
    <ns:hijk>
    </ns:hijk>
  </ns:defg>
</ns:abcd>

xsltproc xform.xsl input.xml prints:

<abcd>
<defg attr="value">
    <hijk>
    </hijk>
</defg>
</abcd>