Why does this very simple C# method produce such illogical CIL code?


I've been digging into IL recently, and I noticed some odd behavior of the C# compiler. The following method is a complete, very simple and verifiable application that immediately exits with exit code 1:

static int Main(string[] args)
{
    return 1;
}

When I compile this with Visual Studio Community 2015, the following IL code is generated (comments added):

.method private hidebysig static int32 Main(string[] args) cil managed
{
  .entrypoint
  .maxstack  1
  .locals init ([0] int32 V_0)     // Local variable init
  IL_0000:  nop                    // Do nothing
  IL_0001:  ldc.i4.1               // Push '1' to stack
  IL_0002:  stloc.0                // Pop stack to local variable 0
  IL_0003:  br.s       IL_0005     // Jump to next instruction
  IL_0005:  ldloc.0                // Load local variable 0 onto stack
  IL_0006:  ret                    // Return
}

If I were to handwrite this method, seemingly the same result could be achieved with the following IL:

.method static int32 Main()
{
  .entrypoint
  ldc.i4.1               // Push '1' to stack
  ret                    // Return
}

Are there underlying reasons that I'm not aware of that make this the expected behaviour?

Or is it just that the assembled IL is further optimized down the line, so the C# compiler does not have to worry about optimization?

Jon Skeet (best answer)

The output you've shown is for a debug build. With a release build (or, more precisely, whenever optimizations are turned on, e.g. with the /optimize+ compiler option) the C# compiler generates essentially the same IL you'd have written by hand.

I strongly suspect that this is all to make the debugger's work easier: it makes it simpler to set a breakpoint, and to see the return value before it's returned.

Moral: when you want to run optimized code, make sure you're not asking the compiler to generate code that's aimed at debugging :)

nstosic

What I'm about to write isn't really .NET specific but general, and I don't know which optimizations .NET recognizes and uses when generating CIL. The parser (and therefore the syntax tree it builds) recognizes a return statement with the following grammar rule:

returnStatement ::= RETURN expr ;

where returnStatement and expr are non-terminals and RETURN is a terminal (the return token), so when visiting the node for the constant 1 the compiler treats it like any other expression. To further illustrate what I mean, the code for:

return 1 + 1;

would look something like this for a (virtual) machine using an expression stack:

push const_1 // Pushes numerical value '1' to expression stack
push const_1 // Pushes numerical value '1' to expression stack
add          // result = pop() + pop(); push(result)
return       // pops the value on the top of the stack and returns it as the function result
exit         
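
To make the pseudo-instructions above concrete, here is a minimal C# sketch of such an expression-stack machine. The type and opcode names (StackMachineSketch, Op, Run) are invented for illustration and are not part of .NET or of any real compiler; the trailing exit instruction is left out because return already ends the run.

```csharp
using System;
using System.Collections.Generic;

static class StackMachineSketch
{
    // Hypothetical opcodes mirroring the pseudo-instructions above.
    public enum Op { PushConst1, Add, Return }

    // Executes the program and returns the value popped by Return.
    public static int Run(IEnumerable<Op> program)
    {
        var stack = new Stack<int>();
        foreach (var op in program)
        {
            switch (op)
            {
                case Op.PushConst1:
                    stack.Push(1);            // push const_1
                    break;
                case Op.Add:
                    int right = stack.Pop();  // result = pop() + pop()
                    int left = stack.Pop();
                    stack.Push(left + right); // push(result)
                    break;
                case Op.Return:
                    return stack.Pop();       // top of stack is the function result
            }
        }
        throw new InvalidOperationException("Program ended without a return.");
    }

    static void Main()
    {
        // return 1 + 1;
        var program = new[] { Op.PushConst1, Op.PushConst1, Op.Add, Op.Return };
        Console.WriteLine(Run(program)); // prints 2
    }
}
```

The constant in a plain "return 1;" goes through exactly the same path: one push followed by the return.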

Eric Lippert

Jon's answer is of course correct; this answer is to follow up on this comment:

@EricLippert the local makes perfect sense, but is there any rationale for that br.s instruction, or is it just there out of convenience in the emitter code? I guess that if the compiler wanted to insert a breakpoint placeholder there, it could just emit a nop...

The reason for the seemingly senseless branch becomes more sensible if you look at a more complicated program fragment:

public int M(bool b) {
    if (b) 
      return 1; 
    else 
      return 2;
}

The unoptimized IL is

    IL_0000: nop
    IL_0001: ldarg.1
    IL_0002: stloc.0
    IL_0003: ldloc.0
    IL_0004: brfalse.s IL_000a
    IL_0006: ldc.i4.1
    IL_0007: stloc.1
    IL_0008: br.s IL_000e
    IL_000a: ldc.i4.2
    IL_000b: stloc.1
    IL_000c: br.s IL_000e
    IL_000e: ldloc.1
    IL_000f: ret

Notice that there are two return statements but only one ret instruction. In unoptimized IL, the pattern for codegen'ing a simple return statement is:

  • stuff the value you're going to return into a local variable slot
  • branch/leave to the end of the method
  • at the end of the method, read the value out of the slot and return

That is, the unoptimized code uses single-point-of-return form.
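
As a rough illustration, the same single-point-of-return shape can be written by hand in C# with a local and goto. This is only an analogue of the IL above, not what the compiler literally does; the class name, the local name result and the label end are made up for the sketch.

```csharp
using System;

static class SinglePointOfReturnSketch
{
    // Hand-written analogue of the unoptimized IL for M(bool b).
    public static int M(bool b)
    {
        int result;          // plays the role of local slot 1 in the IL
        if (b)
        {
            result = 1;      // ldc.i4.1 / stloc.1
            goto end;        // br.s IL_000e
        }
        else
        {
            result = 2;      // ldc.i4.2 / stloc.1
            goto end;        // br.s IL_000e, which is the very next instruction
        }
    end:
        return result;       // ldloc.1 / ret, the only exit point
    }

    static void Main()
    {
        Console.WriteLine(M(true));  // prints 1
        Console.WriteLine(M(false)); // prints 2
    }
}
```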

In both this case and the simple case shown by the original poster, that pattern causes a "branch to next" situation to be generated. The "remove any branch to next" optimizer does not run when generating unoptimized code, so it remains.
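
To show what a "remove any branch to next" pass does, here is a toy C# sketch. It is not the C# compiler's actual implementation: the Instr type and RemoveBranchToNext method are hypothetical, only br.s is handled, and the sketch assumes nothing else branches to the instruction being removed.

```csharp
using System;
using System.Collections.Generic;

static class BranchToNextSketch
{
    // Hypothetical, simplified instruction model (not the compiler's real data structure).
    public sealed class Instr
    {
        public readonly string Label;   // e.g. "IL_0003"
        public readonly string OpCode;  // e.g. "br.s"
        public readonly string Target;  // branch target label, or null

        public Instr(string label, string opCode, string target = null)
        {
            Label = label;
            OpCode = opCode;
            Target = target;
        }
    }

    // Drops every unconditional branch whose target is the instruction right after it.
    public static List<Instr> RemoveBranchToNext(IList<Instr> code)
    {
        var kept = new List<Instr>();
        for (int i = 0; i < code.Count; i++)
        {
            bool branchToNext = code[i].OpCode == "br.s"
                && i + 1 < code.Count
                && code[i].Target == code[i + 1].Label;
            if (!branchToNext)
                kept.Add(code[i]);
        }
        return kept;
    }

    static void Main()
    {
        // The debug-build IL for Main from the question.
        var code = new List<Instr>
        {
            new Instr("IL_0000", "nop"),
            new Instr("IL_0001", "ldc.i4.1"),
            new Instr("IL_0002", "stloc.0"),
            new Instr("IL_0003", "br.s", "IL_0005"),
            new Instr("IL_0005", "ldloc.0"),
            new Instr("IL_0006", "ret"),
        };
        var opcodes = RemoveBranchToNext(code).ConvertAll(i => i.OpCode);
        Console.WriteLine(string.Join(" ", opcodes)); // prints: nop ldc.i4.1 stloc.0 ldloc.0 ret
    }
}
```

This pass alone still leaves the nop and the store/load pair in place; an optimized build removes those as well, which is how it ends up with just ldc.i4.1 followed by ret.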