I have a while loop going through an XML file, and one of the nodes, "url", sometimes contains invalid values. I put a try-catch statement around it to catch any invalid values. The problem is that whenever an invalid value is read, the while loop is killed and the program continues on outside of that loop. I need the while loop to continue reading through the rest of the XML file after an invalid value is found.
Here is my code:
XmlTextReader reader = new XmlTextReader(fileName);
int tempInt;
while (reader.Read())
{
    switch (reader.Name)
    {
        case "url":
            try
            {
                reader.Read();
                if (!reader.Value.Equals("\r\n"))
                {
                    urlList.Add(reader.Value);
                }
            }
            catch
            {
                invalidUrls.Add(urlList.Count);
            }
            break;
    }
}
I chose not to include the rest of the switch statement as it is not relevant. Here is a sample of my XML:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<visited_links_list>
  <item>
    <url>http://www.grcc.edu/error.cfm</url>
    <title>Grand Rapids Community College</title>
    <hits>20</hits>
    <modified_date>10/16/2012 12:22:37 PM</modified_date>
    <expiration_date>11/11/2012 12:22:38 PM</expiration_date>
    <user_name>testuser</user_name>
    <subfolder></subfolder>
    <low_folder>No</low_folder>
    <file_position>834816</file_position>
  </item>
</visited_links_list>
The exception I get throughout the code is similar to the following:
"' ', hexadecimal value 0x05, is an invalid character. Line 3887, position 13."
Observation:
You're calling reader.Read() twice for each entry: once in while(), and once within the case. Do you really mean to skip records? This will cause an exception if there are an odd number of entries in the source XML (since reader.Read() advances the pointer within the XML stream to the next item), but that exception will not be caught because it happens outside of your try...catch.
Beyond that:
Edit
At the risk of writing a novel-length answer...
The error you're receiving, "' ', hexadecimal value 0x05, is an invalid character. Line 3887, position 13.", indicates that your source XML is malformed, and somehow wound up with a ^E (ASCII 0x05) at the specified position. I'd have a look at that line. If you're getting this file from a vendor or a service, you should have them fix their code. Correcting that, and any other malformed content within your XML, should correct the issue that you're seeing.
Once that is fixed, your original code should work. However, using XmlTextReader for this isn't the most robust of solutions, and involves building some code that Visual Studio will happily generate for you.
In VS2012 (I don't have VS2010 installed any more, but it should be the same process):
1. Add a sample of the XML to your solution.
2. In the properties for that file, set the Custom Tool to "MSDataSetGenerator" (without the quotes).
3. The IDE should generate a .designer.cs file, containing a serializable class with a field for each item in the XML. (If not, right-click on the XML file in the solution explorer and select "Run Custom Tool".)
4. Use code like the following to load XML with the same schema as your sample at runtime:
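The loading code itself is not shown here, so what follows is only a minimal sketch of the idea and not the original snippet. It uses a plain (untyped) System.Data.DataSet, whose ReadXml method infers the tables from the data. A typed DataSet generated by the custom tool should inherit the same ReadXml method, but its class and property names depend on your sample, so none are assumed here. The class name VisitedLinksLoader and the method names are hypothetical.

```csharp
using System;
using System.Data;
using System.IO;

// Hypothetical helper; the names are illustrative, not generated by Visual Studio.
static class VisitedLinksLoader
{
    // Loads XML into an untyped DataSet. The schema is inferred from the data,
    // so each repeating <item> element becomes a row in a table named "item".
    public static DataSet Load(TextReader xml)
    {
        var ds = new DataSet();
        ds.ReadXml(xml);
        return ds;
    }

    // Writes the "url" column of every row in the "item" table.
    public static void PrintUrls(DataSet ds, TextWriter output)
    {
        foreach (DataRow row in ds.Tables["item"].Rows)
        {
            output.WriteLine(row["url"]);
        }
    }
}
```

A call might look like VisitedLinksLoader.Load(new StreamReader(fileName)). Note that ReadXml uses an XML reader internally, so it will still throw on the 0x05 character until the source file is corrected.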
You can also do this with the XSD.exe tool included with the Windows SDK.
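If the file can't be fixed at the source, one possible workaround (not part of the advice above, and only a sketch) is to strip characters that XML 1.0 does not allow before parsing, and then read the URLs with a single Read() call per loop iteration. The class UrlExtractor and its methods are hypothetical names, and the character check lets surrogates through as a simplification.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper; a sketch under the assumptions stated above.
static class UrlExtractor
{
    // True for characters the XML 1.0 Char production allows.
    // Surrogate halves are accepted without pairing checks (simplification).
    static bool IsAllowed(char c)
    {
        return c == '\t' || c == '\n' || c == '\r'
            || (c >= 0x20 && c <= 0xD7FF)
            || char.IsSurrogate(c)
            || (c >= 0xE000 && c <= 0xFFFD);
    }

    // Returns a copy of the input with disallowed characters (such as 0x05) removed.
    public static string StripInvalidXmlChars(string raw)
    {
        var sb = new StringBuilder(raw.Length);
        foreach (char c in raw)
        {
            if (IsAllowed(c)) sb.Append(c);
        }
        return sb.ToString();
    }

    // Collects the text of every <url> element, calling Read() once per iteration.
    public static List<string> ReadUrls(string xml)
    {
        var urls = new List<string>();
        string current = null; // name of the element whose content is being read
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        current = reader.IsEmptyElement ? null : reader.Name;
                        break;
                    case XmlNodeType.EndElement:
                        current = null;
                        break;
                    case XmlNodeType.Text:
                        if (current == "url") urls.Add(reader.Value);
                        break;
                }
            }
        }
        return urls;
    }
}
```

To use it, read the file into a string (for example with File.ReadAllText and the file's ISO-8859-1 encoding), pass it through StripInvalidXmlChars, and hand the result to ReadUrls. Silently dropping characters can alter the affected URLs, so it is worth logging which entries changed.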