I have an XML document, and I am trying to separate its nodes from each other. I want the root node alone, then the second node alone, and then a list of the nodes that exist within the second node. The problem is that when I remove the child nodes from the second node or the root node, my list becomes empty. I don't understand why this happens, especially given the odd behavior below.
using System;
using System.Xml;

class Program
{
    static void Main(string[] args)
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(@"C:\Users\Username\Desktop\Diagram.xml");
        XmlNode rootNode = doc.DocumentElement;
        XmlNode secondNode = doc.SelectSingleNode(rootNode.Name + "/root");
        XmlNodeList nodelist = doc.SelectNodes("//root/mxCell");
        Console.WriteLine("-----------------------------------------");
        Console.WriteLine(RemoveChildren(rootNode).OuterXml);
        Console.WriteLine("-----------------------------------------");
        Console.WriteLine(RemoveChildren(secondNode).OuterXml);
        Console.WriteLine("-----------------------------------------");
        //Console.WriteLine(rootNode.OuterXml);
        Console.WriteLine(nodelist.Count); //Becomes 0
        if (nodelist != null && nodelist.Count > 0)
        {
            foreach (XmlNode n in nodelist)
            {
                Console.WriteLine(n.OuterXml);
            }
        }
        Console.ReadLine();
    }

    // Detaches every child of n and returns n itself.
    private static XmlNode RemoveChildren(XmlNode n)
    {
        while (n.FirstChild != null)
        {
            n.RemoveChild(n.FirstChild);
        }
        return n;
    }
}
If I run this code, nodelist.Count becomes 0. Why does the node list become empty, while I can still access the second node?
However, if I add the foreach loop right after doc.SelectNodes("//root/mxCell"); the count stays at 4, like this:
using System;
using System.Xml;

class Program
{
    static void Main(string[] args)
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(@"C:\Users\Username\Desktop\Diagram.xml");
        XmlNode rootNode = doc.DocumentElement;
        XmlNode secondNode = doc.SelectSingleNode(rootNode.Name + "/root");
        XmlNodeList nodelist = doc.SelectNodes("//root/mxCell");
        // Added code here
        if (nodelist != null && nodelist.Count > 0)
        {
            foreach (XmlNode n in nodelist)
            {
                Console.WriteLine(n.OuterXml);
            }
        }
        // End of added code
        Console.WriteLine("-----------------------------------------");
        Console.WriteLine(RemoveChildren(rootNode).OuterXml);
        Console.WriteLine("-----------------------------------------");
        Console.WriteLine(RemoveChildren(secondNode).OuterXml);
        Console.WriteLine("-----------------------------------------");
        //Console.WriteLine(rootNode.OuterXml);
        Console.WriteLine(nodelist.Count); //Becomes 4
        if (nodelist != null && nodelist.Count > 0)
        {
            foreach (XmlNode n in nodelist)
            {
                Console.WriteLine(n.OuterXml);
            }
        }
        Console.ReadLine();
    }

    // Detaches every child of n and returns n itself.
    private static XmlNode RemoveChildren(XmlNode n)
    {
        while (n.FirstChild != null)
        {
            n.RemoveChild(n.FirstChild);
        }
        return n;
    }
}
The count is now 4.
Here is the XML used:
<mxGraphModel dx="1086" dy="596" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="850" pageHeight="1100" math="0" shadow="0">
  <root>
    <mxCell id="0"/>
    <mxCell id="1" parent="0"/>
    <mxCell id="YJb7HCrh72y2aGPrfETQ-1" value="" style="endArrow=classic;html=1;" parent="1" edge="1">
      <mxGeometry width="50" height="50" relative="1" as="geometry">
        <mxPoint x="130" y="310" as="sourcePoint"/>
        <mxPoint x="180" y="260" as="targetPoint"/>
      </mxGeometry>
    </mxCell>
    <mxCell id="YJb7HCrh72y2aGPrfETQ-2" value="" style="endArrow=classic;html=1;" parent="1" edge="1">
      <mxGeometry width="50" height="50" relative="1" as="geometry">
        <mxPoint x="290" y="270" as="sourcePoint"/>
        <mxPoint x="340" y="220" as="targetPoint"/>
      </mxGeometry>
    </mxCell>
  </root>
</mxGraphModel>


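The behavior you are observing can be demonstrated more simply. A unit test that selects the nodes, reads nodelist.Count (getting 4), removes the children of the root node, and then asserts that nodelist.Count is still 4 will succeed. The same test without the initial read of nodelist.Count will fail, because the count after the removal is 0.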
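The comparison can be sketched as a small console program. This is only an illustration, not the original test code: the class name `NodeListDemo`, the method `CountAfterRemoval`, and the trimmed-down inline XML are assumptions made for this sketch, while `RemoveChildren` and the XPath expression are taken from the question.

```csharp
using System;
using System.Xml;

public static class NodeListDemo
{
    // Trimmed-down stand-in for the question's Diagram.xml
    // (assumption: four mxCell elements under <root> are enough to show the effect).
    const string Xml =
        "<mxGraphModel><root>" +
        "<mxCell id=\"0\"/><mxCell id=\"1\" parent=\"0\"/>" +
        "<mxCell id=\"a\" parent=\"1\"/><mxCell id=\"b\" parent=\"1\"/>" +
        "</root></mxGraphModel>";

    // Same helper as in the question: detaches every child of n.
    static XmlNode RemoveChildren(XmlNode n)
    {
        while (n.FirstChild != null)
        {
            n.RemoveChild(n.FirstChild);
        }
        return n;
    }

    // Returns nodelist.Count as observed AFTER the document has been modified.
    public static int CountAfterRemoval(bool readCountFirst)
    {
        var doc = new XmlDocument();
        doc.LoadXml(Xml);
        XmlNodeList nodelist = doc.SelectNodes("//root/mxCell");

        if (readCountFirst)
        {
            // Reading Count is the only difference between the two runs.
            Console.WriteLine("Count before removal: " + nodelist.Count);
        }

        RemoveChildren(doc.DocumentElement);
        return nodelist.Count;
    }

    public static void Main()
    {
        Console.WriteLine("Count read first:     " + CountAfterRemoval(readCountFirst: true));
        Console.WriteLine("Count not read first: " + CountAfterRemoval(readCountFirst: false));
    }
}
```

The first call corresponds to the run in the question that reported 4, and the second to the run that reported 0. Neither value is guaranteed by the documentation, so other runtimes could behave differently.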
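Why might merely reading nodelist.Count before removing some nodes cause an inconsistency in the contents of nodelist after the nodes are removed?

As it turns out, there is no specified behavior for the XmlNodeList returned by SelectNodes() in this situation. The documentation remarks for XmlNode.SelectNodes() warn that the returned list is only valid while the underlying document remains unchanged, and that if the document changes, unexpected results may be returned without any exception being thrown.

Such "unexpected results" are what you are observing. Once RemoveChildren() is called, the contents of nodelist are not specified or guaranteed by .NET. (In fact, it appears that the XmlNodeList returned uses a lazy evaluation mechanism: as soon as the node list is counted or iterated through, but not before, the XPath query is evaluated once and only once, and the results are cached and subsequently reused.) The second node remains accessible because SelectSingleNode() hands back the node itself rather than a deferred list, and your variable keeps referencing it after it has been detached.

This documented restriction on node lists returned by XPath queries is in contrast to the general documentation for XmlNodeList, which states that changes to the children of the node the list was created from are immediately reflected in the list. That remark appears to apply only to XmlNodeList child lists returned by methods of XmlNode DOM objects.

To avoid the inconsistency, you can explicitly materialize the XmlNodeList to a List<XmlNode> immediately after evaluation, for example with doc.SelectNodes("//root/mxCell").Cast<XmlNode>().ToList().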
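A minimal sketch of that snapshot approach follows. The class name `SnapshotDemo`, the method `SnapshotThenRemove`, and the shortened inline XML are hypothetical and exist only for this example; the node selection and `RemoveChildren` mirror the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

public static class SnapshotDemo
{
    // Shortened stand-in for the question's Diagram.xml (assumption for this sketch).
    const string Xml =
        "<mxGraphModel><root>" +
        "<mxCell id=\"0\"/><mxCell id=\"1\" parent=\"0\"/>" +
        "<mxCell id=\"a\" parent=\"1\"/><mxCell id=\"b\" parent=\"1\"/>" +
        "</root></mxGraphModel>";

    // Same helper as in the question: detaches every child of n.
    static XmlNode RemoveChildren(XmlNode n)
    {
        while (n.FirstChild != null)
        {
            n.RemoveChild(n.FirstChild);
        }
        return n;
    }

    public static List<XmlNode> SnapshotThenRemove()
    {
        var doc = new XmlDocument();
        doc.LoadXml(Xml);
        XmlNode rootNode = doc.DocumentElement;
        XmlNode secondNode = doc.SelectSingleNode(rootNode.Name + "/root");

        // Cast<XmlNode>() enumerates the XmlNodeList and ToList() copies the
        // node references immediately, before the document is modified.
        List<XmlNode> cells = doc.SelectNodes("//root/mxCell").Cast<XmlNode>().ToList();

        RemoveChildren(rootNode);
        RemoveChildren(secondNode);

        // The list is an ordinary in-memory copy, so it is unaffected by the removals.
        return cells;
    }

    public static void Main()
    {
        List<XmlNode> cells = SnapshotThenRemove();
        Console.WriteLine(cells.Count);
        foreach (XmlNode cell in cells)
        {
            Console.WriteLine(cell.OuterXml);
        }
    }
}
```

Because the list holds plain references to the XmlNode objects, each mxCell keeps its own attributes and child elements even after it has been detached from its parent.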
Switching to LINQ to XML would be another option, as it's generally easier to work with than the older XmlDocument API.
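As a rough illustration of the LINQ to XML route, the sketch below takes the same snapshot with XDocument. The class name `LinqToXmlDemo`, the method `SnapshotCells`, and the shortened inline XML are assumptions made for this example. Note that Elements() is itself a deferred query, so ToList() is still used to copy the results before the tree is changed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LinqToXmlDemo
{
    // Shortened stand-in for the question's Diagram.xml (assumption for this sketch).
    const string Xml =
        "<mxGraphModel><root>" +
        "<mxCell id=\"0\"/><mxCell id=\"1\" parent=\"0\"/>" +
        "<mxCell id=\"a\" parent=\"1\"/><mxCell id=\"b\" parent=\"1\"/>" +
        "</root></mxGraphModel>";

    public static List<XElement> SnapshotCells(XDocument doc)
    {
        XElement model = doc.Root
            ?? throw new InvalidOperationException("The document has no root element.");
        XElement root = model.Element("root")
            ?? throw new InvalidOperationException("No <root> element found.");

        // Copy the mxCell elements into a list before modifying the tree.
        List<XElement> cells = root.Elements("mxCell").ToList();

        // Rough equivalents of RemoveChildren(secondNode) and RemoveChildren(rootNode).
        root.RemoveNodes();
        model.RemoveNodes();

        return cells;
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(Xml);
        List<XElement> cells = SnapshotCells(doc);

        Console.WriteLine(cells.Count);
        foreach (XElement cell in cells)
        {
            Console.WriteLine(cell);
        }
        Console.WriteLine(doc.Root); // the now-empty mxGraphModel element
    }
}
```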