Skipping HTML Content in Tag Attributes


I am using a SAX parser to parse the following piece of data, in which the "Description" attribute contains HTML content. However, I get the error: "The value of attribute "Description" associated with an element type "null" must not contain the '<' character".

How can I make the SAX parser ignore this tag during XML processing?

<Thread ThreadID="22" Title="google"
                    Description="<a href="http://google.com/">http://google.com/</a>"
                    DisplayName="Sam" LoginID="hjaja" UserEmailID="abx@ers"
                    UserSapCode="12345"
                    IsAnonymous="Yes" CreatedDate="2015-04-29T21:56:04.943" ReplyCount="0"
                    ViewCount="0" PopularityPoints="0" LastUpdatedBy="" LastPostDate="" />

Thanks in advance.


There are 2 best solutions below

0
On BEST ANSWER

I really think that you should take a look at this post (HTML code inside XML) to see how other people recommend tackling this kind of problem.

0
On

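No XML parser can parse this data, because it does not comply with the XML format: an attribute value must not contain a literal '<', and an unescaped '"' ends a double-quoted value. Please refer to the XML specification.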

There are two ways you can solve this:

  1. Change the source format

Change the source so that it produces well-formed XML. You can include HTML by escaping the special characters as follows:

"   &quot;
'   &apos;
<   &lt;
>   &gt;
&   &amp;

  2. Change the target algorithm

The second option is to write your own parsing algorithm for your case.
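
If the source really cannot be changed, one possible form of this second option is to repair the text before handing it to the SAX parser. The sketch below is only a heuristic, not a general solution: the class name DescriptionRepair is hypothetical, and it assumes that Description is always followed directly by the DisplayName attribute, as in the sample from the question. It would also double-escape a value that already contains entities such as &amp;.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class DescriptionRepair {

    // ASSUMPTION: the Description attribute is always immediately followed by
    // DisplayName="...". The lookahead uses that to find the real closing quote,
    // because the raw value itself contains unescaped double quotes.
    private static final Pattern DESCRIPTION =
            Pattern.compile("Description=\"(.*?)\"(?=\\s+DisplayName=\")", Pattern.DOTALL);

    // Returns the input with the Description value escaped so that it is legal XML.
    static String repair(String brokenXml) {
        Matcher m = DESCRIPTION.matcher(brokenXml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Escape '&' first so the entities added afterwards are not escaped again.
            String escaped = m.group(1)
                    .replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;");
            m.appendReplacement(out,
                    Matcher.quoteReplacement("Description=\"" + escaped + "\""));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String broken = "<Thread ThreadID=\"22\" "
                + "Description=\"<a href=\"http://google.com/\">http://google.com/</a>\"\n"
                + " DisplayName=\"Sam\" />";
        System.out.println(repair(broken));
    }
}
```

The repaired string can then be passed to the SAX parser as usual. Any change in attribute order breaks the heuristic, which is one reason fixing the source is normally preferable.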

Usually, the answer is the first one.
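
For the first option, here is a minimal sketch in Java (assuming the question uses the standard javax.xml.parsers SAX API). The class EscapedAttributeDemo and its two methods are hypothetical names made up for illustration. The producer escapes the HTML before writing it into the attribute, and the SAX parser then hands back the original HTML string, because it resolves the predefined entities itself.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; the names are illustrative only.
class EscapedAttributeDemo {

    // Replaces the five special characters with their predefined XML entities.
    static String escapeXml(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Parses the XML with SAX and returns the Description attribute
    // of the first Thread element, or null if there is none.
    static String readDescription(String xml) throws Exception {
        final String[] result = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("Thread".equals(qName) && result[0] == null) {
                    result[0] = attrs.getValue("Description");
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result[0];
    }

    public static void main(String[] args) throws Exception {
        String html = "<a href=\"http://google.com/\">http://google.com/</a>";
        String xml = "<Thread ThreadID=\"22\" Title=\"google\" Description=\""
                + escapeXml(html) + "\" />";
        // The parser unescapes the entities, so this prints the original HTML.
        System.out.println(readDescription(xml));
    }
}
```

In practice an XML writer library would normally do this escaping for you; the hand-written escapeXml is shown only to make the mapping explicit.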