What is the most efficient way to create a temporary collection? Differences between stackalloc and collection expressions?

Question

136 Views Asked by Roy T. At 06 March 2024 at 09:56

In C# there are now multiple ways to create a temporary collection.

You can just new an array and hope the garbage collector doesn't mind.

// Just create a new array
Span<int> array = new int[]{ 1, 2 };

You can use stackalloc to create a temporary collection on the stack, reducing the load on the garbage collector. Since C# 7.2 this no longer requires an unsafe context when the result is assigned to a Span<T>.

// Create a new array on the stack
Span<int> stack = stackalloc int[2];
stack[0] = 1;
stack[1] = 2;

But now, with C# 12, you can also use a collection expression.

Span<int> span = [1, 2];

I'm a bit confused about how the new collection expression works, especially now that Visual Studio suggests converting the stackalloc to a collection expression. Does that mean they are equivalent?

In general I don't just new an array in a performance-critical section; I tend to use stackalloc instead. The new syntax for collection expressions is extremely convenient, but I can't really figure out how it works under the hood.

The C# specification suggests that when assigned to a Span, the collection expression might become a stackalloc, but I'm unsure if or how this is currently implemented. I've tried looking at the IL representation in SharpLab. I'm not really used to looking at IL, but the collection expression and stackalloc definitely do not have the same IL representation. Does this mean the compiler currently does not translate collection expressions to stackallocs?

// stackalloc
ldc.i4.8
conv.u
localloc
ldc.i4.2
newobj instance void valuetype [System.Runtime]System.Span`1<int32>::.ctor(void*, int32)
stloc.s 4
ldloc.s 4
stloc.1
ldloca.s 1
ldc.i4.0
call instance !0& valuetype [System.Runtime]System.Span`1<int32>::get_Item(int32)
ldc.i4.1
stind.i4
ldloca.s 1
ldc.i4.1
call instance !0& valuetype [System.Runtime]System.Span`1<int32>::get_Item(int32)
ldc.i4.2
stind.i4

// collection expressions
ldloca.s 3
initobj valuetype '<>y__InlineArray2`1'<int32>
ldloca.s 3
ldc.i4.0
call !!1& '<PrivateImplementationDetails>'::InlineArrayElementRef<valuetype '<>y__InlineArray2`1'<int32>, int32>(!!0&, int32)
ldc.i4.1
stind.i4
ldloca.s 3
ldc.i4.1
call !!1& '<PrivateImplementationDetails>'::InlineArrayElementRef<valuetype '<>y__InlineArray2`1'<int32>, int32>(!!0&, int32)
ldc.i4.2
stind.i4
ldloca.s 3
ldc.i4.2
call valuetype [System.Runtime]System.Span`1<!!1> '<PrivateImplementationDetails>'::InlineArrayAsSpan<valuetype '<>y__InlineArray2`1'<int32>, int32>(!!0&, int32)
stloc.2

Does anybody have a better understanding of how the new collection expression works under the hood and what the performance characteristics are, especially if it generates garbage?

**Guru Stron** · Accepted Answer · 2024-03-06T10:10:26.960000

"I'm not really used to looking at IL."

And you don't need to (in this case); check out the C# decompilation instead.

The following is an implementation detail and is subject to change.

The decompilation looks something like the following:

<>y__InlineArray2<int> buffer = default(<>y__InlineArray2<int>);
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray2<int>, int>(ref buffer, 0) = 1;
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray2<int>, int>(ref buffer, 1) = 2;
Span<int> span = <PrivateImplementationDetails>.InlineArrayAsSpan<<>y__InlineArray2<int>, int>(ref buffer, 2);

[CompilerGenerated]
internal sealed class <PrivateImplementationDetails>
{
    internal static Span<TElement> InlineArrayAsSpan<TBuffer, TElement>(ref TBuffer buffer, int length)
    {
        return MemoryMarshal.CreateSpan(ref Unsafe.As<TBuffer, TElement>(ref buffer), length);
    }

    internal static ref TElement InlineArrayElementRef<TBuffer, TElement>(ref TBuffer buffer, int index)
    {
        return ref Unsafe.Add(ref Unsafe.As<TBuffer, TElement>(ref buffer), index);
    }
}

[StructLayout(LayoutKind.Auto)]
[InlineArray(2)]
internal struct <>y__InlineArray2<T>
{
    [CompilerGenerated]
    private T _element0;
}

The main point of interest here is the generated <>y__InlineArray2<T> struct marked with the InlineArrayAttribute. It relies on inline arrays, a feature introduced with C# 12.

The documentation describes them as follows:

"Inline arrays are used by the runtime team and other library authors to improve performance in your apps. Inline arrays enable a developer to create an array of fixed size in a struct type. A struct with an inline buffer should provide performance characteristics similar to an unsafe fixed size buffer. You likely won't declare your own inline arrays, but you use them transparently when they're exposed as System.Span<T> or System.ReadOnlySpan<T> objects from runtime APIs."

Basically it is a special struct type. Declared as a local, it lives on the stack, so it should not result in extra heap allocations.
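
To make the generated code a bit easier to read, here is a rough hand-written counterpart. It is only a sketch, assuming .NET 8 and C# 12; the names TwoInts and InlineArrayDemo are made up for illustration and are not what the compiler emits.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical hand-written stand-in for the compiler-generated <>y__InlineArray2<T>.
[InlineArray(2)]
public struct TwoInts
{
    // Single declared field; the runtime lays the struct out as 2 consecutive ints.
    private int _element0;
}

public static class InlineArrayDemo
{
    public static int MaxOfTwo()
    {
        TwoInts buffer = default; // a struct local, no heap allocation involved
        buffer[0] = 1;            // C# 12 allows element access on inline arrays
        buffer[1] = 2;

        // Same idea as the generated InlineArrayAsSpan helper:
        // reinterpret the struct as its first element and wrap it in a span.
        Span<int> span = MemoryMarshal.CreateSpan(ref Unsafe.As<TwoInts, int>(ref buffer), 2);
        return Math.Max(span[0], span[1]);
    }
}
```

The span must not outlive the method, because it points into the local buffer. That is the same restriction a stackalloc'ed span has.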

Also, when talking about performance, the most important thing is to actually measure it. For example, I made the following small synthetic benchmark (I will try to check how meaningful it is a bit later):

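using System;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser] // adds the Gen0 and Allocated columns to the results
public class CollectionBench
{
    [Benchmark]
    public int NewArray()
    {
        // Heap-allocated array wrapped in a span
        Span<int> array = new int[] { 1, 2 };

        return Math.Max(array[0], array[1]);
    }

    // `unsafe` is kept from the measured run (it needs AllowUnsafeBlocks);
    // it is not required when the stackalloc result is assigned to a Span<int>.
    [Benchmark]
    public unsafe int Stackalloc()
    {
        Span<int> stack = stackalloc int[2];
        stack[0] = 1;
        stack[1] = 2;

        return Math.Max(stack[0], stack[1]);
    }

    [Benchmark]
    public int CollectionExpr()
    {
        // Collection expression targeting Span<int>
        Span<int> span = [1, 2];

        return Math.Max(span[0], span[1]);
    }
}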

Which gives "on my machine" (via BenchmarkDotNet):

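| Method         | Mean      | Error     | StdDev    | Gen0   | Allocated |
|--------------- |----------:|----------:|----------:|-------:|----------:|
| NewArray       | 3.4909 ns | 0.0931 ns | 0.1243 ns | 0.0038 |      32 B |
| Stackalloc     | 0.9410 ns | 0.0105 ns | 0.0144 ns |      - |         - |
| CollectionExpr | 0.2421 ns | 0.0219 ns | 0.0321 ns |      - |         - |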

As you can see, no heap allocations were made for the collection expression. Timings this close to zero suggest the JIT may have optimized much of the work away, so be sure to measure against your actual code, data, and hardware.
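
If you only care about the "does it generate garbage" part, a quick check without BenchmarkDotNet is also possible. The following is a minimal sketch, assuming .NET 8 and C# 12; AllocationCheck and its methods are names I made up, and GC.GetAllocatedBytesForCurrentThread is the only runtime API it relies on.

```csharp
using System;

public static class AllocationCheck
{
    // Returns the number of bytes allocated on the current thread by one call to `action`.
    public static long AllocatedBytes(Func<int> action)
    {
        action(); // warm-up call, so one-time JIT work is not part of the measurement
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static int NewArray()
    {
        Span<int> array = new int[] { 1, 2 };
        return Math.Max(array[0], array[1]);
    }

    public static int CollectionExpr()
    {
        Span<int> span = [1, 2];
        return Math.Max(span[0], span[1]);
    }
}
```

Calling AllocationCheck.AllocatedBytes(AllocationCheck.CollectionExpr) should report 0 when the collection expression is backed by an inline array. The value for NewArray depends on the runtime, since newer JITs may be able to keep small non-escaping arrays off the heap, so treat it as something to observe rather than assume.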
