I am trying to download a web page using async/await and HttpClient, but I only get a string full of special characters. My code looks like this:
static async void DownloadPageAsync(string url)
{
    HttpClient client = new HttpClient();
    client.DefaultRequestHeaders.TryAddWithoutValidation("Accept", "text/html,application/xhtml+xml,application/xml");
    client.DefaultRequestHeaders.TryAddWithoutValidation("Accept-Encoding", "gzip, deflate");
    client.DefaultRequestHeaders.TryAddWithoutValidation("User-Agent", "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0");
    client.DefaultRequestHeaders.TryAddWithoutValidation("Accept-Charset", "ISO-8859-1");
    HttpResponseMessage response = await client.GetAsync(url);
    response.EnsureSuccessStatusCode();
    var responseStream = await response.Content.ReadAsStreamAsync();
    var streamReader = new StreamReader(responseStream);
    var str = streamReader.ReadToEnd();
}
and the URL is
url = @"http://www.nseindia.com/live_market/dynaContent/live_watch/live_index_watch.htm";
When I used
client.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0)");
in place of those four DefaultRequestHeaders lines, I got a 403 error, even though this is the NSE site and is freely accessible to everyone. How can I get the correct response? Regards,
Srivastava
By adding the Accept-Encoding header with the value "gzip, deflate", you tell the server that it may compress the response using gzip or deflate. The response you receive is therefore actually compressed, which explains the kind of response text you get.
If you want plain text, you shouldn’t add the Accept-Encoding header, so the server won’t compress the response. With that line removed, you get a normal HTML response text.
Alternatively, you can of course keep that header in and decompress the response using GZipStream after receiving it. That would work like this:
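A minimal sketch of the GZipStream approach, under a few assumptions: the class name GzipDownload and the helper ReadGzipText are made up for illustration, only gzip is requested so the server cannot answer with deflate, and the text is read with StreamReader's default encoding (UTF-8), which may not match what the site actually sends. The method returns a Task<string> rather than being async void so the caller can await the result.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Net.Http;
using System.Threading.Tasks;

static class GzipDownload
{
    // Hypothetical helper: decompresses a gzip stream and returns its text.
    public static string ReadGzipText(Stream compressed)
    {
        using (var gzip = new GZipStream(compressed, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip))
        {
            return reader.ReadToEnd();
        }
    }

    public static async Task<string> DownloadPageAsync(string url)
    {
        using (var client = new HttpClient())
        {
            // Only gzip is offered here, so deflate does not need handling.
            client.DefaultRequestHeaders.TryAddWithoutValidation("Accept-Encoding", "gzip");

            using (HttpResponseMessage response = await client.GetAsync(url))
            {
                response.EnsureSuccessStatusCode();
                using (Stream body = await response.Content.ReadAsStreamAsync())
                {
                    // Assumption: the server really replied with
                    // Content-Encoding: gzip. If it did not compress the
                    // body, GZipStream throws when reading.
                    return ReadGzipText(body);
                }
            }
        }
    }
}
```

A caller would then write something like `string html = await GzipDownload.DownloadPageAsync(url);`.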
Ideally, you should check the value of
response.Content.Headers.GetValues("Content-Encoding")
to make sure that the encoding is gzip. Since you also accepted deflate as a possible encoding, you could then use DeflateStream to decode that; or don’t decode anything in case the Content-Encoding header is missing.
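The check described above could be sketched roughly as follows. The names ContentDecoding, WrapForEncoding and ReadBodyAsync are hypothetical. The sketch reads the typed Content.Headers.ContentEncoding collection instead of calling GetValues, because GetValues throws when the header is absent, while the collection is simply empty. It also assumes the server applied at most one encoding.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

static class ContentDecoding
{
    // Wraps the raw body in the decompressor named by contentEncoding,
    // or returns it unchanged when no encoding was applied.
    public static Stream WrapForEncoding(Stream body, string contentEncoding)
    {
        if (string.IsNullOrEmpty(contentEncoding))
            return body;
        if (string.Equals(contentEncoding, "gzip", StringComparison.OrdinalIgnoreCase))
            return new GZipStream(body, CompressionMode.Decompress);
        if (string.Equals(contentEncoding, "deflate", StringComparison.OrdinalIgnoreCase))
            // Caveat: DeflateStream expects raw deflate data; servers that
            // send zlib-wrapped data for "deflate" may need other handling.
            return new DeflateStream(body, CompressionMode.Decompress);
        throw new NotSupportedException("Unexpected Content-Encoding: " + contentEncoding);
    }

    public static async Task<string> ReadBodyAsync(HttpResponseMessage response)
    {
        // Null when the Content-Encoding header is missing.
        string encoding = response.Content.Headers.ContentEncoding.FirstOrDefault();

        Stream raw = await response.Content.ReadAsStreamAsync();
        using (Stream decoded = WrapForEncoding(raw, encoding))
        using (var reader = new StreamReader(decoded))
        {
            return await reader.ReadToEndAsync();
        }
    }
}
```

With this in place, the download method would call `await ContentDecoding.ReadBodyAsync(response)` after EnsureSuccessStatusCode() and can keep sending "gzip, deflate" in Accept-Encoding.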