Consider the following program, with all of the HttpRequestMessage, HttpResponseMessage, and HttpClient instances disposed properly. It always ends up with about 50 MB of memory retained at the end, even after a collection. Add a zero to the number of requests, and the unreclaimed memory doubles.
class Program
{
    static void Main(string[] args)
    {
        var client = new HttpClient
        {
            BaseAddress = new Uri("http://localhost:5000/")
        };
        var t = Task.Run(async () =>
        {
            var resps = new List<Task<HttpResponseMessage>>();
            var postProcessing = new List<Task>();
            for (int i = 0; i < 10000; i++)
            {
                Console.WriteLine("Firing..");
                var req = new HttpRequestMessage(HttpMethod.Get,
                    "test/delay/5");
                var tsk = client.SendAsync(req);
                resps.Add(tsk);
                postProcessing.Add(tsk.ContinueWith(async ts =>
                {
                    req.Dispose();
                    var resp = ts.Result;
                    var content = await resp.Content.ReadAsStringAsync();
                    resp.Dispose();
                    Console.WriteLine(content);
                }));
            }
            await Task.WhenAll(resps);
            resps.Clear();
            Console.WriteLine("All requests done.");
            await Task.WhenAll(postProcessing);
            postProcessing.Clear();
            Console.WriteLine("All postprocessing done.");
        });
        t.Wait();
        Console.Clear();
        var t2 = Task.Run(async () =>
        {
            var resps = new List<Task<HttpResponseMessage>>();
            var postProcessing = new List<Task>();
            for (int i = 0; i < 10000; i++)
            {
                Console.WriteLine("Firing..");
                var req = new HttpRequestMessage(HttpMethod.Get,
                    "test/delay/5");
                var tsk = client.SendAsync(req);
                resps.Add(tsk);
                postProcessing.Add(tsk.ContinueWith(async ts =>
                {
                    var resp = ts.Result;
                    var content = await resp.Content.ReadAsStringAsync();
                    Console.WriteLine(content);
                }));
            }
            await Task.WhenAll(resps);
            resps.Clear();
            Console.WriteLine("All requests done.");
            await Task.WhenAll(postProcessing);
            postProcessing.Clear();
            Console.WriteLine("All postprocessing done.");
        });
        t2.Wait();
        Console.Clear();
        client.Dispose();
        GC.Collect();
        Console.WriteLine("Done");
        Console.ReadLine();
    }
}
On a quick investigation with a memory profiler, it seems that the objects taking up the memory are all of the type Node<Object> inside mscorlib.
My initial thought was that it was some internal dictionary or stack, since those are the types that use Node as an internal structure, but I was unable to turn up any results for a generic Node<T> in the reference source, since this is actually a Node<object> type.
Is this a bug, or some kind of expected optimization? (I wouldn't consider memory retained in proportion to the workload to be an optimization in any way.) And, purely academically, what is Node<Object>?
Any help in understanding this would be much appreciated. Thanks :)
Update: To extrapolate the results to a much larger test set, I optimized it slightly by throttling it.
Here's the changed program. It now stays consistently at 60-70 MB for a set of 1 million requests. I'm still baffled as to what those Node<object>s really are, and why such a high number of irreclaimable objects is allowed to persist.
The logical conclusion from the difference between these two results leads me to guess that this may not really be an issue with HttpClient or WebRequest, but rather something rooted directly in async — since the real variable between the two tests is the number of incomplete async tasks that exist at a given point in time. This is merely speculation from a quick inspection.
static void Main(string[] args)
{
    Console.WriteLine("Ready to start.");
    Console.ReadLine();
    var client = new HttpClient
    {
        BaseAddress = new Uri("http://localhost:5000/")
    };
    var t = Task.Run(async () =>
    {
        var resps = new List<Task<HttpResponseMessage>>();
        var postProcessing = new List<Task>();
        for (int i = 0; i < 1000000; i++)
        {
            //Console.WriteLine("Firing..");
            var req = new HttpRequestMessage(HttpMethod.Get, "test/delay/5");
            var tsk = client.SendAsync(req);
            resps.Add(tsk);
            var n = i;
            postProcessing.Add(tsk.ContinueWith(async ts =>
            {
                var resp = ts.Result;
                var content = await resp.Content.ReadAsStringAsync();
                if (n % 1000 == 0)
                {
                    Console.WriteLine("Requests processed: " + n);
                }
                //Console.WriteLine(content);
            }));
            if (n % 20000 == 0)
            {
                await Task.WhenAll(resps);
                resps.Clear();
            }
        }
        await Task.WhenAll(resps);
        resps.Clear();
        Console.WriteLine("All requests done.");
        await Task.WhenAll(postProcessing);
        postProcessing.Clear();
        Console.WriteLine("All postprocessing done.");
    });
    t.Wait();
    Console.Clear();
    client.Dispose();
    GC.Collect();
    Console.WriteLine("Done");
    Console.ReadLine();
}
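As an aside, the manual batching above (awaiting Task.WhenAll every 20,000 requests) can also be expressed with a SemaphoreSlim that caps the number of in-flight requests. This is only a sketch under the same assumptions as the test program (the localhost base address and the "test/delay/5" route come from the question); the cap of 1000 is an arbitrary illustrative value:

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

class ThrottledClient
{
    static async Task Main()
    {
        var client = new HttpClient { BaseAddress = new Uri("http://localhost:5000/") };
        var gate = new SemaphoreSlim(1000); // at most 1000 requests in flight
        var tasks = new List<Task>();
        for (int i = 0; i < 1000000; i++)
        {
            await gate.WaitAsync(); // the loop pauses once the cap is reached
            tasks.Add(ProcessAsync(client, gate));
        }
        await Task.WhenAll(tasks);
        client.Dispose();
        Console.WriteLine("All requests done.");
    }

    static async Task ProcessAsync(HttpClient client, SemaphoreSlim gate)
    {
        try
        {
            using (var req = new HttpRequestMessage(HttpMethod.Get, "test/delay/5"))
            using (var resp = await client.SendAsync(req))
            {
                await resp.Content.ReadAsStringAsync();
            }
        }
        finally
        {
            gate.Release(); // free a slot for the next request
        }
    }
}
```

The effect is the same as the batched version — bounding the number of incomplete async operations alive at once — but the bound is enforced continuously rather than in waves.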




Let’s investigate the problem with all the tools we have at hand.
First, let’s take a look at what those objects are. To do that, I put the given code into Visual Studio and created a simple console application. Side by side, I ran a simple HTTP server on Node.js to serve the requests.
I ran the client to completion, attached WinDBG to it, inspected the managed heap, and got these results:
The !dumpheap command dumps all objects in the managed heap. That could include objects that should be freed but haven’t been yet, because the GC has not kicked in. In our case that should be rare, because we called GC.Collect() right before the printout and nothing else runs after it.
Worth noticing is the specific line above. That should be the Node object you are referring to in the question.
Next, let’s look at the individual objects of that type. We grab the MT value of that object and invoke !dumpheap again, like this; it will filter out only the objects we are interested in:
Now, grabbing a random one from the list, we ask the debugger why this object is still on the heap by invoking the !gcroot command, as follows:
Now it is quite obvious that we have a cache, and that the cache maintains a stack, with the stack implemented as a linked list. If we dig further, we will see in the reference source how that list is used. To do that, let’s first inspect the cache object itself, using !DumpObj:
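For reference, the SOS commands used in these steps look like the following (the MT and object addresses are placeholders for values read off your own heap dump):

```
!dumpheap -stat            ; summarize the heap by type, one row per method table (MT)
!dumpheap -mt <MT>         ; list only the instances of the chosen type
!gcroot <object address>   ; walk the chain of references keeping the object alive
!DumpObj <object address>  ; inspect the fields of a single object
```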
Now we see something interesting: the stack is actually used as a free list for the cache. The source code tells us how the free list is used, in particular in the Free() method shown below:
http://referencesource.microsoft.com/#mscorlib/parent/parent/parent/parent/InternalApis/NDP_Common/inc/PinnableBufferCache.cs
So that is it: when the caller is done with a buffer, it returns it to the cache; the cache puts it on the free list; and the free list is then used to serve subsequent allocations.
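The mechanism can be sketched in a few lines. This is an illustrative simplification, not the actual PinnableBufferCache source — the class and member names below are made up — but it shows why every returned buffer produces one of the Node<object> instances seen in the profiler:

```csharp
using System.Collections.Concurrent;

// Sketch of a buffer cache whose free list is a ConcurrentStack.
// ConcurrentStack<T> is a lock-free linked list internally, and every
// Push allocates one internal Node<object> -- exactly the type that
// shows up in the memory profiler.
class BufferCacheSketch
{
    readonly ConcurrentStack<object> freeList = new ConcurrentStack<object>();

    public object Allocate(int size)
    {
        object buffer;
        // Serve from the free list when a returned buffer is available...
        if (freeList.TryPop(out buffer))
            return buffer;
        // ...otherwise allocate a fresh one.
        return new byte[size];
    }

    public void Free(object buffer)
    {
        // Returning a buffer pushes it back onto the stack; the node
        // created here stays reachable as long as the cache does.
        freeList.Push(buffer);
    }
}
```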
Last but not least, let’s understand why the cache itself is not freed when we are done with all those HTTP requests. By adding a breakpoint on mscorlib.dll!System.Collections.Concurrent.ConcurrentStack.Push(), we see the following call stack (this could be just one of the cache’s use cases, but it is representative):
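In SOS, a managed breakpoint like the one described can be set with !bpmd; the exact syntax for naming a generic instantiation varies by SOS version, so treat this as a sketch:

```
!bpmd mscorlib.dll System.Collections.Concurrent.ConcurrentStack`1.Push
g            ; continue execution until the breakpoint hits
!clrstack    ; show the managed call stack at the hit
```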
At WriteHeadersCallback, we are done writing the headers, so we return the buffer to the cache. At this point the buffer is pushed back onto the free list, and therefore a new stack node is allocated. The key thing to notice is that the cache object is a static member of HttpWebRequest:
http://referencesource.microsoft.com/#System/net/System/Net/HttpWebRequest.cs
So there we go: the cache is shared across all requests and is not released when all the requests are done.
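In other words, the lifetime issue boils down to a static root. A minimal sketch (the field name below is illustrative, not the actual HttpWebRequest member):

```csharp
using System.Collections.Concurrent;

class HttpWebRequestSketch
{
    // A static field is a GC root for the lifetime of the AppDomain,
    // so everything reachable from the cache -- including every
    // Node<object> on the free list -- remains reachable even after
    // all individual requests have completed and been disposed.
    static readonly ConcurrentStack<object> writeBufferCache =
        new ConcurrentStack<object>();
}
```

This is why GC.Collect() at the end of the program reclaims nothing here: the nodes are not leaked in the dangling sense, they are simply still rooted.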