I recently wanted to write code where an instance A of struct S1 returns an instance B of struct S2 that needs some values from instance A. A trivial implementation of S2 would store copies of the needed values from S1 inside S2. Depending on the case, that could produce a fairly large struct. My idea was to store a reference to S1 inside S2 instead. For that to work S2 needs to be a ref struct, but I am OK with this.
Consider the following code:
public readonly record struct TopLevelStruct(int A, int B) {
//This works. In IL this does get passed by ref into the implicit operator
public RefStruct Ref => this;
public RefStruct RefByMethod() => this;
//Compiler error:
//Cannot use a result of 'RefStruct.RefStruct(ref readonly TopLevelStruct)' in this context
//because it may expose variables referenced by parameter 'tls' outside of their declaration scope
public RefStruct RefWithConstructor => new(in this);
//Cannot use a result of 'RefStruct.RefStruct(ref readonly TopLevelStruct)' in this context
//because it may expose variables referenced by parameter 'tls' outside of their declaration scope
public RefStruct RefWithConstructorByMethod() => new(in this);
}
public readonly ref struct RefStruct(ref readonly TopLevelStruct tls) {
private readonly ref readonly TopLevelStruct _tls = ref tls;
public int Sum => _tls.A + _tls.B;
public static implicit operator RefStruct(in TopLevelStruct tls) => new(in tls);
}
public static class TopLevelStructExtensions {
//This works, even though RefWithConstructorByMethod doesn't, while both are arguably the same
//(i.e. both functions have a TopLevelStruct reference as a parameter:
//implicit in one because it is a member function, explicit in the other because it is an extension method)
public static RefStruct GetRef(this ref readonly TopLevelStruct tls) => new(in tls);
public static RefStruct GetRefByOperator(this ref readonly TopLevelStruct tls) => tls;
}
For some reason, returning a ref struct constructed from this by reference gives a compiler error (see code). I wonder what the exact reason for this error is. I don't seem to grasp it entirely.
Even though the above does not work as a member method, it does work as an extension method. Why does the compiler consider the member method implementation an error but the extension method valid?
What is even more interesting is that a member method is able to construct the ref struct from the this reference via an implicit conversion. Why can't this "[...] expose variables referenced by parameter 'tls' outside of their declaration scope"?
That being said, I am aware that with the above implementation the following holds true:
TopLevelStruct tls = new(1, 2);
var refStruct = tls.Ref;
Console.WriteLine(refStruct.Sum); //Output: 3
tls = new(3, 4);
Console.WriteLine(refStruct.Sum); //Output: 7
In the end what I am wondering is if the compiler errors should happen in the cases where they don't or if they shouldn't happen where they do. Otherwise why do they happen where they do and don't happen where they don't?
Edit 1: Since I posted the question I realized that using a conversion operator to construct the ref struct is probably something the compiler should prohibit, or at least warn us about. However, I do not know whether it is something the compiler would be able to detect with reasonable effort.
Now consider the following code:
ReadOnlySpan<int> span1 = GetSpan();
Span<int> buffer = stackalloc int[20];
for(int i = 0; i < buffer.Length; i++) buffer[i] = 420;
System.Console.WriteLine(span1[0]); // Output: 420 (because the stack unrolled after GetSpan() call)
Top top = new(42);
ReadOnlySpan<int> span2 = GetSpanSafe(in top);
buffer = stackalloc int[20];
for(int i = 0; i < buffer.Length; i++) buffer[i] = 420;
System.Console.WriteLine(span2[0]); // Output: 42 (top is still on the stack)
ReadOnlySpan<int> GetSpan() => new Top(42).Span;
ReadOnlySpan<int> GetSpanSafe(ref readonly Top top) => Top.ToSpan(in top);
//Similar struct to the one in the above code
readonly record struct Top(int I)
{
private readonly int _i = I;
// public ReadOnlySpan<int> Span => new(in _i);
public ReadOnlySpan<int> Span => (ReadOnlySpan<int>)this;
public static explicit operator ReadOnlySpan<int>(in Top top) => new(in top._i);
public static ReadOnlySpan<int> ToSpan(ref readonly Top i) => new(in i._i);
}
In the first part we get the wrong output because our ref struct points to a temporary that has since been popped from the stack. In the second part, where we use a static version of our getter, everything works as expected.
From what I understand, the use of a ref readonly parameter in the static version restricts the argument so that it cannot be a temporary. I am not sure how this would behave if the Top struct resided on the heap and was then garbage collected. I also do not know whether ref locals and ref fields are managed or unmanaged, nor how to verify whether a given piece of heap memory has been freed.
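On the managed-or-unmanaged part of that question: ref locals and ref fields are managed pointers that the garbage collector tracks, so a ref into a heap object keeps that object alive. The following is only a minimal sketch of that behaviour; Holder and HeapRefDemo are hypothetical names introduced for illustration and are not part of the code above.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical heap-allocated holder, used only for this illustration.
public sealed class Holder
{
    public int Value;
}

public static class HeapRefDemo
{
    // Returns a managed pointer into a heap object. No ordinary object
    // reference to the Holder survives this method, only the ref.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static ref int CreateAndGetRef()
    {
        var holder = new Holder { Value = 42 };
        return ref holder.Value;
    }

    public static int Run()
    {
        ref int value = ref CreateAndGetRef();

        // Force collections; the GC sees the interior pointer held by
        // 'value' and keeps the Holder instance alive.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        return value; // still reads the field of the live Holder
    }
}
```

Calling HeapRefDemo.Run() returns 42. This only illustrates the heap case; it says nothing about temporaries on the stack, which is where the dangling reference in the first part comes from.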
Unfortunately this does not give us the more readable syntax of a property. I wonder if it would make sense to introduce something to the language that would restrict member methods to being called on actual variables (that is, prohibit calling them on temporaries). Maybe the scoped modifier would be suitable. I am not sure whether the receiving variable would need to be a scoped local in that case.
There are, as the message suggests, some huge problems here with lifetimes and escape analysis - the compiler is trying hard to make sure you don't end up with an invalid pointer (managed or unmanaged), and in the general case: there are some problematic scenarios - for example:
is valid (although it does generate a warning); if we stored this managed pointer, where does it go? This example is fairly trivial (the compiler generates a hidden local, so... meh), but generalizing: it is ... complex.
The language does allow the scoped keyword to be used, but it doesn't quite do what you want; it allows a managed pointer to be passed in for access, but doesn't allow that managed pointer to escape. Meaning: if you declared the tls parameter as scoped, you could read values from tls, but you can't store the reference. This means that your _tls initializer fails, but everything else becomes happy.
Fundamentally, ref struct and managed pointer rules are complex. Some things just aren't possible, and proving whether they should be (i.e. if they cannot lead to a problem) is incredibly hard.
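To make the scoped point concrete, here is a minimal sketch rather than a definitive implementation. It reuses TopLevelStruct and RefStruct from the question; ScopedDemo, SumScoped and Wrap are hypothetical names. It also assumes a static method with a scoped in parameter instead of the ref readonly constructor parameter, purely to keep the example small.

```csharp
public readonly record struct TopLevelStruct(int A, int B);

public readonly ref struct RefStruct(ref readonly TopLevelStruct tls)
{
    private readonly ref readonly TopLevelStruct _tls = ref tls;
    public int Sum => _tls.A + _tls.B;
}

public static class ScopedDemo
{
    // 'scoped' promises that the reference does not escape this method.
    // Reading through it is fine.
    public static int SumScoped(scoped in TopLevelStruct tls) => tls.A + tls.B;

    // Rejected by the compiler if uncommented: the returned RefStruct would
    // capture a reference that was declared as non-escaping.
    // public static RefStruct Wrap(scoped in TopLevelStruct tls) => new(in tls);

    public static int Run()
    {
        TopLevelStruct tls = new(1, 2);
        return SumScoped(in tls); // 1 + 2
    }
}
```

ScopedDemo.Run() returns 3. The commented-out Wrap method is the same situation as storing the reference in _tls: access is allowed, capture is not.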