I'm getting JSON from a web service with encoded characters: `\u201c`, etc. Parsing it works perfectly: double quotes inside text values have the encoded character value, while the structural double quotes are not encoded, so the parser sees the right JSON structure. The problem is that after I write it to a file and read it back, the JSON is spoiled. I no longer have `\u201c`, but `"` characters inside the text content.
- If I encode it with utf-8, `"` is changed to the File Separator (28) character and `-` is changed to Device Control 3 (0x13), which results in a parsing exception.
- If I encode it with ascii, `"` is changed to the `?` character.
- If I encode it with iso-8859-1, `"` stays decoded as `"`.
Is there any way to preserve the unencoded characters after writing and reading?
SAMPLE:
I'm using Newtonsoft.Json.Linq

    Encoding encoding = Encoding.GetEncoding("ISO-8859-1");
    webResponse = (HttpWebResponse)webRequest.GetResponse();
    using (StreamReader streamReader = new StreamReader(webResponse.GetResponseStream(), encoding))
    {
        responseString = streamReader.ReadToEnd();
    }
    JToken json = JObject.Parse(responseString);
    using (StreamWriter stream = new StreamWriter(path, true, encoding))
    {
        stream.Write(json.ToString());
    }
    string spoiledJsonString = File.ReadAllText(path, encoding);
    JToken sureNotToBeCreated = JObject.Parse(spoiledJsonString); // EXCEPTION

If I write a test program, I can recreate your problem: the quote has been escaped.
If I change the encoding to `Encoding.UTF8`, it works successfully.
As supported here, ISO-8859-1 is not a Unicode charset, so it is a bad choice for encoding Unicode.
As supported here, JSON text is Unicode.
So we can deduce that ISO-8859-1 is a bad choice for encoding JSON strings.
The program runs without warning, so I suspect you have some issue other than UTF-8.
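The difference between the two encodings can be sketched without a web service or a JSON library. This is only a minimal illustration, not the test program referred to above: the class name `EncodingRoundTrip` and the method `RoundTrip` are hypothetical, and the sketch round-trips through a byte array instead of a file, on the assumption that `StreamWriter` and `File.ReadAllText` perform the same encode and decode step.

```csharp
using System;
using System.Text;

public static class EncodingRoundTrip
{
    // Hypothetical helper: encode text to bytes and decode it back
    // with the same encoding, mimicking a write followed by a read.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        // A JSON document whose string value contains U+201C and U+201D.
        string json = "{\"text\":\"\u201Cquoted\u201D\"}";

        string viaUtf8 = RoundTrip(json, Encoding.UTF8);
        string viaLatin1 = RoundTrip(json, Encoding.GetEncoding("ISO-8859-1"));

        // UTF-8 can represent every Unicode character, so the text survives.
        Console.WriteLine("UTF-8 preserved the text:      " + (viaUtf8 == json));

        // ISO-8859-1 has no code point for U+201C, so the encoder must substitute something.
        Console.WriteLine("ISO-8859-1 preserved the text: " + (viaLatin1 == json));
        Console.WriteLine("ISO-8859-1 result: " + viaLatin1);
    }
}
```

If the ISO-8859-1 encoder substitutes a look-alike plain `"` for the curly quote, that character would end the JSON string early, which would be consistent with the parsing exception in the question. Which substitute a given runtime picks is worth checking by printing the result rather than assuming it.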