I know that all arrays in .NET are limited to 2 GB. Under this premise, I try not to allocate more than n = ((2^31) - 1) / 8 doubles in an array. Nevertheless, even that number of elements doesn't seem to be valid. Does anyone know how I can determine, at run time, the maximum number of elements given sizeof(T)?
I know that any quantity approaching that number is a lot of elements but, for all intents and purposes, let's say I need it.
Note: I'm in a 64-bit environment, my application targets AnyCPU, and there are at least 3,100 MB of free RAM.
Update: Thank you all for your contributions, and sorry I was so quiet. I apologise for the inconvenience. I haven't been able to rephrase my question, but I can add that what I am looking for is something that solves this:
```cpp
template <class T>
array<T>^ allocateAnUsableArrayWithTheMostElementsPossible() {
    return gcnew array<T>( ... );
}
```
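A minimal sketch of one way to fill that in, assuming a binary-search probe over `gcnew` is acceptable: it homes in on the largest length that currently allocates without throwing `OutOfMemoryException`, so the result depends on free memory and fragmentation at the moment of the call. The function name is the one from the question; the strategy is not an official API.

```cpp
using namespace System;

// Sketch only: probes the largest length that currently allocates,
// by binary search over gcnew. Not a documented limit of the CLR.
template <class T>
array<T>^ allocateAnUsableArrayWithTheMostElementsPossible() {
    __int64 lo = 0;                      // largest length known to succeed
    __int64 hi = Int32::MaxValue;        // lengths above this have failed
    while (lo < hi) {
        __int64 mid = (lo + hi + 1) / 2; // round up so the loop terminates
        try {
            array<T>^ probe = gcnew array<T>((int)mid);
            probe = nullptr;             // let the GC reclaim the probe
            lo = mid;
        }
        catch (OutOfMemoryException^) {
            hi = mid - 1;
        }
    }
    // This final allocation can still fail if the heap state changed
    // between the last successful probe and now.
    return gcnew array<T>((int)lo);
}
```

Calling, e.g., `allocateAnUsableArrayWithTheMostElementsPossible<double>()` then returns the biggest `double` array the runtime will hand out at that moment.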
The results in my own answer are somewhat satisfactory, but not good enough. Furthermore, I haven't tested it on another machine (it's kind of hard to find another machine with more than 4 GB). Besides, I have been doing some research on my own and it seems there's no cheap way to calculate this at run time. Anyhow, that was just a plus; no user of what I am trying to accomplish can expect to use the feature I am trying to implement without having the capacity.
So, in other words, I just want to understand why the maximum number of elements of an array doesn't add up to 2 GB, ceteris paribus. A top maximum is all I need for now.
Update: answer COMPLETELY rewritten. The original answer contained methods to find the largest possible addressable array on any system by divide and conquer; see the history of this answer if you're interested. The new answer attempts to explain the 56-byte gap.
In his own answer, AZ explained that the maximum array size is limited to less than the 2GB cap and, with some trial and error (or another method?), found the following, in summary: element types of 1 to 8 bytes leave a 56-byte gap below the cap, 16-byte types a 48-byte gap, and 32-byte types a 32-byte gap.

I'm not entirely sure about the 16-byte and 32-byte situations. The total available size for the array might be different if it's an array of structs or of a built-in type. I'll focus on type sizes of 1 to 8 bytes (of which I'm not that sure either, see the conclusion).
Data layout of an array
To understand why the CLR does not allow exactly `2GB / IntPtr.Size` elements, we need to know how an array is structured. A good starting point is this SO article but, unfortunately, some of the information seems false, or at least incomplete. This in-depth article on how the .NET CLR creates runtime objects proved invaluable, as well as this Arrays Undocumented article on CodeProject. Taking all the information in these articles together, it comes down to the following layout for an array on 32-bit systems: a syncblock, the array's type handle, its length, and then the element data.
Each header part is one system `DWORD` in size; on 64-bit Windows, each part is one QWORD instead. The layout looks slightly different when it's an array of objects (i.e., strings or class instances): an extra type handle, pointing at the type of the objects in the array, is added.
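Written out as a pseudo-struct (illustrative only: the field names are made up and this is not actual CLR source), the header in front of the element data looks roughly like this:

```cpp
#include <cstddef>

// Each slot is one DWORD on 32-bit systems and one QWORD on 64-bit.
struct SzArrayHeaderSketch {
    void*  syncBlock;           // sits just before the object reference
    void*  arrayTypeHandle;     // what the object reference points at
    size_t numElements;         // the array's Length
    // void* elementTypeHandle; // extra slot, only for arrays of objects
    // ...element data follows here
};
```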
Looking further, we find that a built-in type, or actually any struct type, gets its own specific type handler (all `uint` arrays share the same one, but an `int` has a different type handler for the array than a `uint` or a `byte`). All arrays of objects share the same type handler, but have an extra field that points to the type handler of the objects.

A note on struct types: padding may not always be applied, which may make it hard to predict the actual size of a struct.
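As a quick illustration of that type identity (this compares `Type` objects, which is only a proxy for the distinct type handles, not the handles themselves):

```cpp
using namespace System;

int main() {
    // Distinct element types give distinct array types:
    Console::WriteLine(array<int>::typeid == array<unsigned int>::typeid);    // False
    // Every int[] shares one and the same array type:
    Console::WriteLine(array<int>::typeid == Int32::typeid->MakeArrayType()); // True
    return 0;
}
```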
Still not 56 bytes...
To count towards the 56 bytes of AZ's answer, I have to make a few assumptions. I assume that the syncblock is placed before the address the variable points at, which makes it look like it's not part of the object; but in fact I believe it is, and it counts towards the internal 2GB limit. Adding all these up for a 64-bit system, we get 8 bytes (syncblock) + 8 bytes (type handle) + 8 bytes (length) = 24 bytes.
Not 56 yet. Perhaps someone can have a look with the Memory view while debugging, to check what the layout of an array looks like under 64-bit Windows.
My guess is something along these lines (take your pick, mix and match):
- 2GB will never be possible, as that is one byte into the next segment. The largest block should be `2GB - sizeof(int)`. But this is silly, as memory indexes should start at zero, not one;
- any object larger than 85,000 bytes will be put on the LOH (large object heap). This may include an extra pointer, or even a 16-byte struct holding LOH information. Perhaps this counts towards the limit;
- aligning: assuming the objectref does not count (it is in another memory segment anyway), the total gap is 32 bytes. It's very well possible that the system prefers 32-byte boundaries. Take a new look at the memory layout: if the starting point needs to be on a 32-byte boundary, and it needs room for the syncblock before it, the syncblock will end up at the end of the first 32-byte block. Something like this:

      XXXXXXXXXXXXXXXXXXXXXXXXSSSSSSSS | type handle | length | data...

  where `XXX..` stands for skipped bytes and `SSS..` for the syncblock;
- multi-dimensional arrays: if you create your arrays dynamically with `Array.CreateInstance` with 1 or more dimensions, a single-dimension array will be created with two extra DWORDs containing the size and the lower bound of the dimension (even if you have only one dimension, but only if the lower bound is specified as non-zero; see the snippet after this list). I find this highly unlikely, as you would probably have mentioned it if this were the case in your code. But it would bring the total to 56 bytes of overhead ;).
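For completeness, here is the case that last bullet refers to, as a hypothetical snippet: a one-dimensional array created with a non-zero lower bound gets a different runtime type than a normal vector array.

```cpp
using namespace System;

int main() {
    array<int>^ lengths     = gcnew array<int> { 10 };
    array<int>^ lowerBounds = gcnew array<int> { 1 };  // non-zero lower bound
    Array^ a = Array::CreateInstance(Int32::typeid, lengths, lowerBounds);
    Console::WriteLine(a->GetType()); // System.Int32[*], not System.Int32[]
    return 0;
}
```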
Conclusion

From all I gathered during this little research, I think that `Overhead + Aligning - Objectref` is the most likely and most fitting conclusion. However, a "real" CLR guru might be able to shed some extra light on this peculiar subject. None of these conclusions explains why 16-byte and 32-byte datatypes have a 48-byte and a 32-byte gap, respectively.
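If that conclusion holds, the tally for `double` works out as follows (a sketch following the assumptions above, not a documented limit):

```cpp
// 8 (syncblock) + 8 (type handle) + 8 (length) = 24 bytes of header,
// plus up to 32 bytes to land on a 32-byte boundary = 56 bytes of gap.
const __int64 twoGB = 2147483648LL;                                 // 2^31 bytes
const int maxDoubleElements = (int)((twoGB - 56) / sizeof(double)); // 268435449
```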
Thanks for a challenging subject; I learned something along the way. Perhaps some people can take their downvote off when they find this new answer more related to the question (which I originally misunderstood; apologies for the clutter this may have caused).