In C# there are now multiple ways to create a temporary collection.

You can just new an array and hope the garbage collector doesn't mind.

// Just create a new array
Span<int> array = new int[]{ 1, 2 };

You can use stackalloc to create a temporary collection on the stack, reducing the load on the garbage collector. When the result is assigned to a Span<T>, this does not require an unsafe context.

// Create a new array on the stack
Span<int> stack = stackalloc int[2];
stack[0] = 1;
stack[1] = 2;
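
As a side note, here is a minimal sketch, assuming C# 7.3 or later: stackalloc compiles in safe code when its result is assigned directly to a Span<T>, and it accepts an initializer, so the two element assignments can be folded into the declaration. The class and method names are made up for illustration.

```csharp
using System;

public static class StackallocSketch   // hypothetical name, for illustration only
{
    // No 'unsafe' modifier: stackalloc is allowed in safe code when the
    // result is assigned to a Span<T> or ReadOnlySpan<T>.
    public static int Sum()
    {
        // The initializer fills the stack buffer in the same statement.
        Span<int> stack = stackalloc int[] { 1, 2 };

        int sum = 0;
        foreach (int value in stack)
            sum += value;
        return sum;
    }
}
```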

With C# 12 you can also use a collection expression.

Span<int> span = [1, 2];
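
For context, a small sketch assuming C# 12 on .NET 8: a collection expression has no type of its own and takes the type of its target, so the same literal can initialize several kinds of collection. How the compiler lowers each case is its own choice and is not shown here. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class CollectionExpressionSketch   // hypothetical name
{
    public static int Demo()
    {
        int[] array = [1, 2];                 // target type: array
        List<int> list = [1, 2];              // target type: List<int>
        Span<int> span = [1, 2];              // target type: Span<int>
        ReadOnlySpan<int> readOnly = [1, 2];  // target type: ReadOnlySpan<int>

        // The spread element '..' copies the elements of another collection.
        int[] combined = [.. array, .. list, 3];

        return array.Length + list.Count + span.Length
             + readOnly.Length + combined.Length;
    }
}
```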

I'm a bit confused about how the new collection expression works, especially now that Visual Studio suggests converting the stackalloc to a collection expression. Does that mean they are equivalent?

In general I don't just new up an array in a performance-critical section; I tend to use stackalloc instead. The new syntax for collection expressions is extremely convenient, but I can't really figure out how it works under the hood.

The C# specification suggests that, when assigned to a Span, the collection expression might become a stackalloc, but I'm unsure if or how this is currently implemented. I've tried looking at the IL in a SharpLab gist. I'm not really used to looking at IL, but the collection expression and the stackalloc definitely do not have the same IL representation. Does this mean that collection expressions are currently not translated to stackallocs?

// stackalloc
ldc.i4.8
conv.u
localloc
ldc.i4.2
newobj instance void valuetype [System.Runtime]System.Span`1<int32>::.ctor(void*, int32)
stloc.s 4
ldloc.s 4
stloc.1
ldloca.s 1
ldc.i4.0
call instance !0& valuetype [System.Runtime]System.Span`1<int32>::get_Item(int32)
ldc.i4.1
stind.i4
ldloca.s 1
ldc.i4.1
call instance !0& valuetype [System.Runtime]System.Span`1<int32>::get_Item(int32)
ldc.i4.2
stind.i4

// collection expressions
ldloca.s 3
initobj valuetype '<>y__InlineArray2`1'<int32>
ldloca.s 3
ldc.i4.0
call !!1& '<PrivateImplementationDetails>'::InlineArrayElementRef<valuetype '<>y__InlineArray2`1'<int32>, int32>(!!0&, int32)
ldc.i4.1
stind.i4
ldloca.s 3
ldc.i4.1
call !!1& '<PrivateImplementationDetails>'::InlineArrayElementRef<valuetype '<>y__InlineArray2`1'<int32>, int32>(!!0&, int32)
ldc.i4.2
stind.i4
ldloca.s 3
ldc.i4.2
call valuetype [System.Runtime]System.Span`1<!!1> '<PrivateImplementationDetails>'::InlineArrayAsSpan<valuetype '<>y__InlineArray2`1'<int32>, int32>(!!0&, int32)
stloc.2

Does anybody have a better understanding of how the new collection expression works under the hood and what the performance characteristics are, especially if it generates garbage?

Best answer, by Guru Stron:

"I'm not really used to looking at IL."

You don't need to in this case; check out the C# decompilation instead.


The following is an implementation detail and is subject to change.


The decompilation looks something like the following:

<>y__InlineArray2<int> buffer = default(<>y__InlineArray2<int>);
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray2<int>, int>(ref buffer, 0) = 1;
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray2<int>, int>(ref buffer, 1) = 2;
Span<int> span = <PrivateImplementationDetails>.InlineArrayAsSpan<<>y__InlineArray2<int>, int>(ref buffer, 2);

[CompilerGenerated]
internal sealed class <PrivateImplementationDetails>
{
    internal static Span<TElement> InlineArrayAsSpan<TBuffer, TElement>(ref TBuffer buffer, int length)
    {
        return MemoryMarshal.CreateSpan(ref Unsafe.As<TBuffer, TElement>(ref buffer), length);
    }

    internal static ref TElement InlineArrayElementRef<TBuffer, TElement>(ref TBuffer buffer, int index)
    {
        return ref Unsafe.Add(ref Unsafe.As<TBuffer, TElement>(ref buffer), index);
    }
}

[StructLayout(LayoutKind.Auto)]
[InlineArray(2)]
internal struct <>y__InlineArray2<T>
{
    [CompilerGenerated]
    private T _element0;
}

The main point of interest here is the generated <>y__InlineArray2<T> marked with the InlineArrayAttribute. This is a new feature introduced with C# 12 - inline arrays.

Inline arrays are used by the runtime team and other library authors to improve performance in your apps. Inline arrays enable a developer to create an array of fixed size in a struct type. A struct with an inline buffer should provide performance characteristics similar to an unsafe fixed size buffer.
You likely won't declare your own inline arrays, but you use them transparently when they're exposed as System.Span<T> or System.ReadOnlySpan<T> objects from runtime APIs.

Basically, it is a special struct type whose elements are stored inline, so a local of that type lives on the stack and should not result in extra heap allocations.
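
To make the generated type less mysterious, here is a hand-written sketch of the same idea, assuming C# 12 on .NET 8. The struct name TwoInts and the helper class are made up for illustration; the compiler's own type is generic and has an unspeakable name.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical hand-written counterpart of the compiler-generated
// <>y__InlineArray2<T>; the name TwoInts is made up for this sketch.
[InlineArray(2)]
public struct TwoInts
{
    private int _element0;   // the runtime lays this field out 2 times in a row
}

public static class InlineArraySketch
{
    public static int Max()
    {
        TwoInts buffer = default;   // a plain struct local, no heap object
        buffer[0] = 1;              // C# 12 allows indexing an inline array
        buffer[1] = 2;

        // C# 12 converts an inline array variable to a span over its storage.
        Span<int> span = buffer;
        return Math.Max(span[0], span[1]);
    }
}
```

The span points into the local struct, so it must not outlive the method, which is the same restriction that applies to a stackalloc'ed span.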

Also, when talking about performance, the most important thing is to actually measure it. For example, I put together the following small synthetic benchmark (I have not yet checked how meaningful it is):

using System;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class CollectionBench
{
    // Heap-allocated array wrapped in a span.
    [Benchmark]
    public int NewArray()
    {
        Span<int> array = new int[] { 1, 2 };

        return Math.Max(array[0], array[1]);
    }

    // Stack-allocated buffer; no 'unsafe' needed when the target is a Span<int>.
    [Benchmark]
    public int Stackalloc()
    {
        Span<int> stack = stackalloc int[2];
        stack[0] = 1;
        stack[1] = 2;

        return Math.Max(stack[0], stack[1]);
    }

    // Collection expression targeting a span.
    [Benchmark]
    public int CollectionExpr()
    {
        Span<int> span = [1, 2];

        return Math.Max(span[0], span[1]);
    }
}

On my machine, BenchmarkDotNet reports:

Method          Mean       Error      StdDev     Gen0    Allocated
NewArray        3.4909 ns  0.0931 ns  0.1243 ns  0.0038  32 B
Stackalloc      0.9410 ns  0.0105 ns  0.0144 ns  -       -
CollectionExpr  0.2421 ns  0.0219 ns  0.0321 ns  -       -

As you can see, no heap allocations were made for the collection expression. But be sure to measure against your actual code, data and hardware.
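
If pulling in BenchmarkDotNet is too heavy for a quick check, a rough sketch like the following can show whether a code path allocates on the heap. It assumes C# 12 on .NET 8 and uses GC.GetAllocatedBytesForCurrentThread; the class and method names are hypothetical. A result of 0 would indicate that no managed allocations happened on this thread during the loop, but that outcome depends on the compiler and runtime version, so treat it as a sanity check rather than a proper benchmark.

```csharp
using System;

public static class AllocationSketch   // hypothetical name
{
    private static int MaxFromSpan()
    {
        Span<int> span = [1, 2];
        return Math.Max(span[0], span[1]);
    }

    // Returns the number of bytes the current thread allocated on the
    // managed heap while running 'iterations' calls of MaxFromSpan.
    public static long MeasureSpanAllocations(int iterations)
    {
        MaxFromSpan();   // warm-up, so first-call JIT work is less likely to be counted

        long before = GC.GetAllocatedBytesForCurrentThread();
        int sink = 0;
        for (int i = 0; i < iterations; i++)
            sink += MaxFromSpan();
        long after = GC.GetAllocatedBytesForCurrentThread();

        // Use the results so the loop cannot be discarded as dead code.
        if (sink != 2 * iterations)
            throw new InvalidOperationException("unexpected result");

        return after - before;
    }
}
```

Calling AllocationSketch.MeasureSpanAllocations(1000) and printing the result is enough to try it out.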