How to deserialize a json string containing encoded HTML into JObject?

1.7k Views Asked by At

I have a json string containing encoded HTML as below which I get after doing a Shopify Liquid escape. I am trying to decode the internal HTML and deserialize this string into a JObject.

{
    "testId": 494254,
    "languageIdentifier": "en_us",
    "overview":"<p style="margin: 0px;"><span>Overview'ff' Test</span></p>",  
    "responsibilities":"<p style="margin: 0px;"><span>Responsibilities</span></p>",
    "qualifications":"<p style="margin: 0px;"><span>Qualifications</span></p>",
    "guidance":"z_used_Guidance Test",
    "additionalDetailsForInternalCandidates":"<p style="margin: 0px;"><span>Additional Details</span></p>",
    "requisitionNotes":"Requisition Notes"
}

The actual html is:

{
    "testId": 494254,
    "languageIdentifier": "en_us",
    "overview": "<p style=\"margin: 0px;\"><span>Overview'ff' Test</span></p>"
    "responsibilities": "<p style=\"margin: 0px;\"><span>Responsibilities</span></p>"
    "qualifications": "<p style=\"margin: 0px;\"><span>Qualifications</span></p>"
    "additionalDetailsForInternalCandidates":"<p style=\"margin: 0px;\"><span>Additional Details</span></p>",
    "guidance":"z_used_Guidance Test",
    "requisitionNotes":"Requisition Notes"
}

However, when I try to decode and deserialize, it is failing. My code is:

string test = "{\r\n\t\"testId\": 494254,\r\n\t\"languageIdentifier\": \"en_us\",\r\n\t\"overview\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Overview&#39;ff&#39; Test&lt;/span&gt;&lt;/p&gt;\",\t\r\n\t\"responsibilities\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Responsibilities&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"qualifications\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Qualifications&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"guidance\":\"z_used_Guidance Test\",\r\n\t\"additionalDetailsForInternalCandidates\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Additional Details&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"requisitionNotes\":\"Requisition Notes\"}";
var jsonString = HttpUtility.HtmlDecode(test);
var objectjson = JsonConvert.DeserializeObject<JObject>(jsonString);

However, I get this error:

'After parsing a value an unexpected character was encountered: m. Path 'overview', line 4, position 23.'

Can anyone help me in decoding an encoded HTML and deserialize it into JObject? I want it as a decoded html string like the original input

{
    "testId": 494254,
    "languageIdentifier": "en_us",
    "overview": "<p style=\"margin: 0px;\"><span>Overview'ff' Test</span></p>",
    "responsibilities": "<p style=\"margin: 0px;\"><span>Responsibilities</span></p>",
    "qualifications": "<p style=\"margin: 0px;\"><span>Qualifications</span></p>",
    "additionalDetailsForInternalCandidates":"<p style=\"margin: 0px;\"><span>Additional Details</span></p>",
    "guidance":"z_used_Guidance Test",
    "requisitionNotes":"Requisition Notes"
}

Thanks in advance.

2

There are 2 best solutions below

0
Heinzi On BEST ANSWER

You assume that you have an HTML-encoded JSON file. That is not the case. What you have is a JSON file containing HTML-encoded values. That's something different.

It means that you first need to parse the JSON and then HTML-decode the value.

string jsonString = @"{
    ""testId"": 494254,
    ""languageIdentifier"": ""en_us"",
    ""overview"":""&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Overview&#39;ff&#39; Test&lt;/span&gt;&lt;/p&gt;"",  
    ""responsibilities"":""&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Responsibilities&lt;/span&gt;&lt;/p&gt;"",
    ""qualifications"":""&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Qualifications&lt;/span&gt;&lt;/p&gt;"",
    ""guidance"":""z_used_Guidance Test"",
    ""additionalDetailsForInternalCandidates"":""&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Additional Details&lt;/span&gt;&lt;/p&gt;"",
    ""requisitionNotes"":""Requisition Notes""
}";

var objectjson = JsonConvert.DeserializeObject<JObject>(jsonString);
var htmlEncodedValue = objectjson.Value<string>("overview");
var decodedValue = HttpUtility.HtmlDecode(htmlEncodedValue);
0
Vivek Chhatrala On

You have to do as per below code in C#

  1. libraries

    using Newtonsoft.Json;
    using Newtonsoft.Json.Linq;
    
  2. code

    string test = "{\r\n\t\"testId\": 494254,\r\n\t\"languageIdentifier\": \"en_us\",\r\n\t\"overview\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Overview Test&lt;/span&gt;&lt;/p&gt;\",\t\r\n\t\"responsibilities\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Responsibilities&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"qualifications\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Qualifications&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"guidance\":\"z_used_Guidance Test\",\r\n\t\"additionalDetailsForInternalCandidates\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Additional Details&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"requisitionNotes\":\"Requisition Notes\"}";
    
    JToken objectData = JToken.Parse(test);
    var testId = objectData["testId"];
    var languageIdentifier = objectData["languageIdentifier"];
    
  3. If you want to do it from html instead of c# then do as per below.

    <!DOCTYPE html>
    <html>
    <body>
    <h2>Create Object from JSON String</h2>
    <p id="demo"></p>
    
    <script>
    var txt = "{\r\n\t\"testId\": 494254,\r\n\t\"languageIdentifier\": \"en_us\",\r\n\t\"overview\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Overview Test&lt;/span&gt;&lt;/p&gt;\",\t\r\n\t\"responsibilities\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Responsibilities&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"qualifications\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Qualifications&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"guidance\":\"z_used_Guidance Test\",\r\n\t\"additionalDetailsForInternalCandidates\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Additional Details&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"requisitionNotes\":\"Requisition Notes\"}"
    var obj = JSON.parse(txt);
    document.getElementById("demo").innerHTML = "TestID : " + obj.testId + " and languageIdentifier : " + obj.languageIdentifier;
    </script>
    </body>
    </html>