I have a C# API that returns an XElement object. This XElement object is constructed via code that looks like this:
string invalidXML = "a \v\f\0";
XElement fe = new XElement("Data", invalidXML);
Console.WriteLine(fe);
At first I assumed that passing an invalid XML character to the XElement constructor above throws a System.ArgumentException.
As it turns out, the XElement constructor does not throw when it is given a string containing invalid XML characters. The exception only appears when you try to print the XElement, for example via Console.WriteLine(fe), and it comes from the XmlWriter:
System.ArgumentException: '', hexadecimal value 0x0B, is an invalid character.
at System.Xml.XmlEncodedRawTextWriter.InvalidXmlChar(Int32 ch, Char* pDst, Boolean entitize)
at System.Xml.XmlEncodedRawTextWriter.WriteElementTextBlock(Char* pSrc, Char* pSrcEnd)
at System.Xml.XmlEncodedRawTextWriter.WriteString(String text)
at System.Xml.XmlEncodedRawTextWriterIndent.WriteString(String text)
at System.Xml.XmlWellFormedWriter.WriteString(String text)
at System.Xml.Linq.ElementWriter.WriteElement(XElement e)
at System.Xml.Linq.XElement.WriteTo(XmlWriter writer)
at System.Xml.Linq.XNode.GetXmlString(SaveOptions o)
at System.Xml.Linq.XNode.ToString()
at System.IO.TextWriter.WriteLine(Object value)
at System.IO.TextWriter.SyncTextWriter.WriteLine(Object value)
at System.Console.WriteLine(Object value)
at TestLoggingForUNIT.Program.Main(String[] args) in C:\Users\shivanshu\source\repos\TestLoggingForUNIT\TestLoggingForUNIT\Program.cs:line 29
To me it seems like XElement itself does not do any validation. Only when the element is printed or serialized does .NET internally call the XmlWriter, and that is when the exception is thrown.
My question is: does XElement always guarantee that an exception will be thrown if an invalid XML character is passed?
In other words, do I need to check the string that I am passing for invalid XML characters myself, using something like XmlConvert.IsXmlChar(char)?
I looked at the link below but could not find a satisfactory answer to my question-
It is the XmlWriter that is verifying that valid characters are being written. In the official documentation, the relevant XmlWriter configuration is described in the Data Conformance section:
Yes, with the CheckCharacters flag set to true (the default), it will guarantee an exception is thrown when it encounters an illegal character.
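If you would rather detect the problem before the element is built, instead of at serialization time, one option is to validate the string up front. The following is a minimal sketch, not a definitive implementation: the class name XmlTextCheck and the method name IsValidXmlText are hypothetical helpers introduced for illustration, while XmlConvert.VerifyXmlChars is the framework method that throws an XmlException on the first invalid character.

```csharp
using System.Xml;

// Hypothetical helper for illustration; not part of the original API.
public static class XmlTextCheck
{
    // Returns true when every character (including surrogate pairs)
    // in the text is a valid XML character, false otherwise.
    public static bool IsValidXmlText(string text)
    {
        try
        {
            // VerifyXmlChars returns the string unchanged when it is valid
            // and throws XmlException when it finds an invalid character.
            XmlConvert.VerifyXmlChars(text);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

With this in place, a call such as XmlTextCheck.IsValidXmlText("a \v\f\0") can be made before new XElement("Data", invalidXML), so the caller decides how to handle bad input rather than discovering it when the element is printed.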
If you want to allow writing invalid characters, the CheckCharacters flag can be set to false in the XmlWriterSettings for your XmlWriter, which would prevent the exception from being thrown. Normally, the XmlWriter will encode reserved characters as character entities (e.g. < to &lt;). Additionally, with the flag set to false, the XmlWriter will escape illegal characters as numeric character entities (e.g. \f to &#xC;) to produce text that conforms to the XML specification.
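As a rough sketch of the CheckCharacters = false approach described above: the class name LenientXml and the method name ToLenientString are hypothetical and exist only for this example. The code serializes the element through an explicitly configured XmlWriter instead of relying on XElement.ToString().

```csharp
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of the original API.
public static class LenientXml
{
    // Serializes the element with character checking turned off, so
    // characters that are invalid in XML are written as numeric
    // character entities instead of causing an ArgumentException.
    public static string ToLenientString(XElement element)
    {
        var settings = new XmlWriterSettings
        {
            CheckCharacters = false,   // do not throw on invalid characters
            OmitXmlDeclaration = true  // keep the output to the element itself
        };

        var sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            element.WriteTo(writer);
        }
        return sb.ToString();
    }
}
```

Calling LenientXml.ToLenientString(fe) on the element from the question should return the markup without throwing. Keep in mind that a consumer reading this output back with character checking enabled may still reject references such as &#xC;, so this is mainly useful for logging or diagnostics.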