I'm trying to allocate memory for hundreds of thousands of objects up front and initialize them later from an array of bytes. My goal is to avoid a separate heap allocation for each object, which is why I am using C# structs.
Union:
[StructLayout(LayoutKind.Explicit)]
struct HeaderUnion
{
    [FieldOffset(0)]
    public unsafe fixed char Data[8];
    [FieldOffset(0)]
    public HeaderSeq HeaderSeq;
}
HeaderSeq:
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
struct HeaderSeq
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 2)]
    public string FirstName;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 2)]
    public string LastName;
}
In the program I want to write:
var bytes = File.ReadAllBytes("image.dat");
var headerUnions = new HeaderUnion[100000];
var size = Marshal.SizeOf<HeaderSeq>();
for (int i = 0; i < 100000; i++)
{
    var positionIndex = i * size;
    unsafe
    {
        fixed (char* charPtr = headerUnions[i].Data)
        {
            Marshal.Copy(bytes, positionIndex, (IntPtr)charPtr, size);
        }
    }
}
However, it gives me the runtime error:
Unhandled exception. System.TypeLoadException: Could not load type 'HeaderUnion' from assembly 'MyProject.Console, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null' because it contains an object field at offset 0 that is incorrectly aligned or overlapped by a non-object field.
If I redefine HeaderSeq to only contain single-char fields instead of string or char[], it works fine.
If I do not include HeaderSeq in HeaderUnion as a field, it also works, but then I have to rewrite my code to use Marshal.PtrToStructure:
// It does not suit me, because it allocates a struct on each cycle.
// Instead, I want all the memory to be pre-allocated already in the fixed buffers.
var headerSeq = Marshal.PtrToStructure<HeaderSeq>((IntPtr) charPtr);
I could preallocate arrays of char on the managed heap and store only indexes into them in HeaderSeq, but that approach seems less elegant to me.
Is my goal achievable at all in C#? How should I define my structs?
There is no way to do this with managed types (string); you're going to have to use a fixed char buffer or char* instead (char[] is itself a managed reference type, so it hits the same error, as the question notes).
On an unrelated note, I'm not sure why, but the Data field takes up 16 bytes rather than 8 (according to sizeof on .NET 6 x86). Changing it to a byte array fixes that. There's no practical difference and it would have you allocate less memory.
EDIT: After looking into the second bit a little more, it seems C#'s chars are 16-bit (i.e. Unicode), but automatically get marshalled as 8-bit by default, which confused me. If you plan on using Unicode, chars are desired.
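As a rough illustration of the "no managed types" approach, here is a minimal sketch that replaces the string fields with fixed char buffers, so the struct holds no object references and a whole array of them can be filled from the byte array in one copy. The names Header, HeaderLoader, Load and ReadFirstName are hypothetical, not from the question. It assumes the file stores each record as four UTF-16 little-endian chars with no padding, and that the project is compiled with unsafe code enabled (AllowUnsafeBlocks).

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical replacement for HeaderSeq: only unmanaged fields,
// so the struct is blittable and needs no marshalling attributes.
[StructLayout(LayoutKind.Sequential)]
unsafe struct Header
{
    public fixed char FirstName[2];
    public fixed char LastName[2];
}

static class HeaderLoader
{
    // Allocates the array once and copies the raw bytes into it in one pass.
    public static Header[] Load(byte[] bytes, int count)
    {
        var headers = new Header[count];

        // View the struct array as a span of bytes; this is only valid
        // because Header contains no object references.
        Span<byte> dest = MemoryMarshal.AsBytes(headers.AsSpan());
        if (bytes.Length < dest.Length)
            throw new ArgumentException("Not enough input bytes.", nameof(bytes));

        bytes.AsSpan(0, dest.Length).CopyTo(dest);
        return headers;
    }

    // Builds a string from the fixed buffer. This does allocate, so call it
    // only when a string is actually needed.
    public static unsafe string ReadFirstName(ref Header h)
    {
        fixed (char* p = h.FirstName)
        {
            return new string(p, 0, 2);
        }
    }
}
```

With this layout the only allocation is the Header[] array itself; individual characters can be read straight from the fixed buffers, and a string is created only on demand.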