Remove all nodes in a specified namespace from XML


I have an XML document that contains some content in a namespace. Here is an example:

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:test="urn:my-test-urn">
    <Item name="Item one">
        <test:AlternativeName>Another name</test:AlternativeName>
        <Price test:Currency="GBP">124.00</Price>
    </Item>
</root>

I want to remove all of the content that is within the test namespace - not just remove the namespace prefix from the tags, but actually remove all nodes (elements and attributes) from the document that (in this example) are in the test namespace. My required output is:

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:test="urn:my-test-urn">
    <Item name="Item one">
        <Price>124.00</Price>
    </Item>
</root>

I'm not overly concerned if the namespace declaration is still present; for now I'd be happy with removing just the content within the specified namespace. Note that there may be multiple namespaces in the document to be modified, so I'd like to be able to specify which one should have its content removed.

I've tried doing it using .Descendants().Where(e => e.Name.Namespace == "test"), but that only returns an IEnumerable<XElement>, so it doesn't help me find the attributes. If I use .DescendantNodes() instead, I can't see a way of querying the namespace, as that doesn't seem to be a property on XNode.

I can iterate through each element and then through each attribute on the element checking each one's Name.Namespace but that seems inelegant and hard to read.

Is there a way of achieving this using LINQ to Xml?


There are 2 best solutions below

BEST ANSWER

Iterating through the elements and then through the attributes is not too hard to read:

var xml = @"<?xml version='1.0' encoding='UTF-8'?>
<root xmlns:test='urn:my-test-urn'>
    <Item name='Item one'>
        <test:AlternativeName>Another name</test:AlternativeName>
        <Price test:Currency='GBP'>124.00</Price>
    </Item>
</root>";
var doc = XDocument.Parse(xml);
XNamespace test = "urn:my-test-urn";

//get all elements in specific namespace and remove
doc.Descendants()
   .Where(o => o.Name.Namespace == test)
   .Remove();
//get all attributes in specific namespace and remove
doc.Descendants()
   .Attributes()
   .Where(o => o.Name.Namespace == test)
   .Remove();

//print result
Console.WriteLine(doc.ToString());

Output:

<root xmlns:test="urn:my-test-urn">
  <Item name="Item one">
    <Price>124.00</Price>
  </Item>
</root>
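
Since the question asks for a way to choose which namespace gets stripped, the two queries above could be wrapped in a small helper. This is only a minimal sketch: RemoveNamespaceContent and XmlNamespaceExtensions are hypothetical names, not part of LINQ to XML, and the second namespace (urn:other) is added here purely to show that unrelated content is left alone.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

var doc = XDocument.Parse(
    @"<root xmlns:test='urn:my-test-urn' xmlns:other='urn:other'>
        <Item name='Item one'>
            <test:AlternativeName>Another name</test:AlternativeName>
            <Price test:Currency='GBP' other:Unit='each'>124.00</Price>
        </Item>
    </root>");

// Only content in urn:my-test-urn is removed; other:Unit is kept.
doc.RemoveNamespaceContent("urn:my-test-urn");
Console.WriteLine(doc);

// Hypothetical helper (an assumption for illustration, not a library API).
static class XmlNamespaceExtensions
{
    public static void RemoveNamespaceContent(this XContainer container, XNamespace ns)
    {
        // Remove matching elements first, so their attributes are never visited.
        container.Descendants()
                 .Where(e => e.Name.Namespace == ns)
                 .Remove();

        // Then remove matching attributes from the elements that remain.
        container.Descendants()
                 .Attributes()
                 .Where(a => a.Name.Namespace == ns)
                 .Remove();
    }
}
```

The namespace is matched by its URI, not by the prefix used in the document, so passing "test" would match nothing. When the helper is called on an XElement rather than on the XDocument, only that element's descendants are inspected, not the element itself.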
1
On

Give this a try. I had to pull the namespace from the root element and then run two separate LINQ queries:

  1. Removes elements with the namespace
  2. Removes attributes with the namespace

Code:

string xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
    "<root xmlns:test=\"urn:my-test-urn\">" +
    "<Item name=\"Item one\">" +
    "<test:AlternativeName>Another name</test:AlternativeName>" +
    "<Price test:Currency=\"GBP\">124.00</Price>" +
    "</Item>" +
    "</root>";

XDocument xDocument = XDocument.Parse(xml);
if (xDocument.Root != null)
{
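    // Takes the first namespace declaration on the root; null if there is none, in which case nothing is removed
    string namespaceValue = xDocument.Root.Attributes().FirstOrDefault(a => a.IsNamespaceDeclaration)?.Value;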

    // Removes elements with the namespace
    xDocument.Root.Descendants().Where(d => d.Name.Namespace == namespaceValue).Remove();

    // Removes attributes with the namespace
    xDocument.Root.Descendants().ToList().ForEach(d => d.Attributes().Where(a => a.Name.Namespace == namespaceValue).Remove());

    Console.WriteLine(xDocument.ToString());
}

Results:

<root xmlns:test="urn:my-test-urn">
  <Item name="Item one">
    <Price>124.00</Price>
  </Item>
</root>

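If you also want to remove the namespace declaration from the root element, add this line inside the if statement, after namespaceValue has been read. Note that it removes every namespace declaration on the root, not just the one for test: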

xDocument.Root.Attributes().Where(a => a.IsNamespaceDeclaration).Remove();

Results:

<root>
  <Item name="Item one">
    <Price>124.00</Price>
  </Item>
</root>
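
If the document declares several namespaces, the declaration can be dropped more selectively by matching its value against the chosen URI. A rough sketch, assuming the content in that namespace has already been removed (otherwise LINQ to XML will generally write a declaration for it again when serializing); the urn:other namespace is made up for the example:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

var doc = XDocument.Parse(
    "<root xmlns:test='urn:my-test-urn' xmlns:other='urn:other'>" +
    "<other:Item>kept</other:Item>" +
    "</root>");

XNamespace test = "urn:my-test-urn";

// A declaration such as xmlns:test='...' is an attribute whose value is the namespace URI.
// Declarations may sit on any element, so search the whole document, not just the root.
doc.Descendants()
   .Attributes()
   .Where(a => a.IsNamespaceDeclaration && a.Value == test.NamespaceName)
   .Remove();

Console.WriteLine(doc);
```

The declaration for urn:other is untouched, so elements and attributes that use the other prefix still serialize with it.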