We have an application that continuously reads a huge amount of data from the network. We identified GC as the biggest bottleneck (even the cumulative effect of gen0 collections; we used ETW tracing to support our findings), so we are trying to pool memory to keep collections from kicking in.

We can preallocate a huge byte array and read from the network into it continuously without allocating, and we can do the same with char arrays (to avoid allocations during conversion with the Encoding class). However, there doesn't seem to be a way to convert to basic types (int, decimal, ...) without reinventing the wheel (that is, reimplementing what the BCL does in its TryParse methods) or creating garbage (converting char[] to throwaway strings).

So here are my questions:

  • Is it possible to somehow inject a char array into a string, or otherwise force the string to allocate its memory from a reusable pool? I looked into the decompiled internals of string and it seems impossible, but I would welcome any suggestions.

OR

  • Is it possible to use standard conversion functions to convert to basic types from char[] (or another textual form that is not System.String)? Again, I looked into the decompiled code of System.Number: the underlying functions appear to take char*, so it should be possible to call them via reflection. DateTime conversions, however, still use strings.

Any suggestions are welcome.

There is 1 best solution below

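If unsafe code is a viable option for your application, you can overwrite the contents and length of an existing string. This lets you keep a pool of preallocated, reusable strings and so avoid garbage collection. Only do this to strings you allocated yourself and never share: the runtime assumes strings are immutable, so mutating a literal or interned string changes it everywhere it is used.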

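A C# string is laid out in memory like this (this is the layout of the .NET 2.0 through 3.5 runtimes; from .NET 4.0 onward there is no Capacity field, so check your target runtime before relying on it):

    int Capacity;
    int Length;
    char FirstCharacter;
    // remaining characters follow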

The character data is null-terminated (for easier interop with unmanaged C/C++ code), and the current length and maximum capacity are stored alongside it so that buffer overruns can be avoided.
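To see the header for yourself before writing to it, a read-only probe along these lines may help. It is a minimal sketch: `StringLayoutProbe` and `ReadLengthField` are made-up names, it assumes the length field sits immediately before the first character as in the layout above, and it has to be compiled with unsafe code enabled.

```csharp
using System;

static class StringLayoutProbe
{
    // Returns the int stored immediately before the first character of s.
    // Assumption: that slot holds the string's length, as in the layout described above.
    public static unsafe int ReadLengthField(string s)
    {
        if (s == null)
        {
            throw new ArgumentNullException("s");
        }

        // Pin the string so the GC cannot move it while we hold a raw pointer.
        fixed (char* ps = s)
        {
            // Reinterpret the character pointer as an int pointer and step back one int.
            int* psi = (int*)ps;
            return psi[-1];
        }
    }
}
```

If `StringLayoutProbe.ReadLengthField(s)` does not equal `s.Length` on your runtime, the layout assumption does not hold there and the write-based trick should not be used.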

Here is how to inject new contents into an existing string without allocating any new memory:

    static unsafe void RecycleString(string s, char[] newcontents)
    {
        // Pin the string so the GC cannot move it, and get a pointer to the character data.
        fixed (char* ps = s)
        {
            // View the same memory as ints to reach the fields stored before the characters.
            // Assumes the layout shown above: psi[-2] is Capacity, psi[-1] is Length.
            int* psi = (int*)ps;
            int capacity = psi[-2];

            // Don't overrun the buffer! Capacity must also leave room for the null terminator.
            // Fail loudly rather than silently leaving the old contents in place.
            if (capacity <= newcontents.Length)
            {
                throw new System.ArgumentException("New contents do not fit in the string.", "newcontents");
            }

            for (int i = 0; i < newcontents.Length; ++i)
            {
                ps[i] = newcontents[i];
            }

            // Add null terminator and update length accordingly.
            ps[newcontents.Length] = '\0';
            psi[-1] = newcontents.Length;
        }
    }

With that in place, you can recycle and re-parse the same string to your heart's content. Here's a simple example to demonstrate:

    private static void ReusableStringTest()
    {
        char[] intFromWire = new char[] { '9', '0', '0', '0' };
        char[] floatFromWire = new char[] { '3', '.', '1', '4', '1', '5' };

        string reusableBuffer = new string('\0', 128);

        RecycleString(reusableBuffer, intFromWire);
        int i = Int32.Parse(reusableBuffer);
        Console.WriteLine("Parsed integer {0}", i);

        RecycleString(reusableBuffer, floatFromWire);
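        // Parse with the invariant culture so '.' is always the decimal separator.
        float f = Single.Parse(reusableBuffer, System.Globalization.CultureInfo.InvariantCulture);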
        Console.WriteLine("Parsed float {0}", f);
    }

The generated output is as one would hope:

    Parsed integer 9000
    Parsed float 3.1415
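The pool of reusable strings mentioned at the start could look something like this. It is only a sketch: `ReusableStringPool` and its members are hypothetical names, not BCL types, it is not thread-safe, and it only hands out the preallocated buffers (filling them is still done with RecycleString).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: preallocates a fixed number of strings up front so that
// renting and returning them on the hot path does not allocate.
sealed class ReusableStringPool
{
    private readonly Stack<string> _free;

    public ReusableStringPool(int count, int capacity)
    {
        // Size the stack up front so Push never has to grow it,
        // as long as no more than count strings are ever returned.
        _free = new Stack<string>(count);
        for (int i = 0; i < count; ++i)
        {
            // Each call creates a distinct string instance filled with '\0'.
            _free.Push(new string('\0', capacity));
        }
    }

    // Number of strings currently available to rent.
    public int Available
    {
        get { return _free.Count; }
    }

    // Takes a preallocated string out of the pool.
    public string Rent()
    {
        if (_free.Count == 0)
        {
            throw new InvalidOperationException("Pool exhausted.");
        }
        return _free.Pop();
    }

    // Puts a string back so a later Rent call can reuse it.
    public void Return(string s)
    {
        if (s == null)
        {
            throw new ArgumentNullException("s");
        }
        _free.Push(s);
    }
}
```

Rent a string, fill it with RecycleString, parse it, then return it. Throwing when the pool is empty is a design choice for the sketch; falling back to a fresh allocation would also work, at the cost of the occasional piece of garbage.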

And if unsafe code makes you nervous, just remember all those years we spent programming in C and C++, when everything was unsafe!
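For completeness, here is a different technique from the one above, for the second question (parsing straight from a char[]). It assumes a runtime that ships span-based Parse overloads (.NET Core 2.1 or later), which may not match your target; `SpanParsingSketch` and its method names are made up for illustration.

```csharp
using System;
using System.Globalization;

static class SpanParsingSketch
{
    // Parses an integer from a slice of a reusable char buffer without creating a string.
    // Assumption: the runtime provides int.Parse(ReadOnlySpan<char>, ...).
    public static int ParseInt(char[] buffer, int offset, int count)
    {
        // A span is a view over the existing array; no characters are copied.
        ReadOnlySpan<char> slice = new ReadOnlySpan<char>(buffer, offset, count);
        return int.Parse(slice, NumberStyles.Integer, CultureInfo.InvariantCulture);
    }

    // Same idea for floats; the invariant culture keeps '.' as the decimal separator.
    public static float ParseFloat(char[] buffer, int offset, int count)
    {
        ReadOnlySpan<char> slice = new ReadOnlySpan<char>(buffer, offset, count);
        return float.Parse(slice, NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

With the same wire data as in the demo, `SpanParsingSketch.ParseInt(intFromWire, 0, intFromWire.Length)` gives 9000 without touching string internals.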