How much of this `inline for` will be abbreviated at comptime in Zig?


I'm trying to write something that iterates over the fields of a struct and performs a different action depending on the type of each field:

const std = @import("std");

const S = struct {
    letter: u8,
    number: u64,
};

pub fn main() void {
    var result: i64 = 0;
    var s = S{ .letter = 'a', .number = 77 };
    inline for ([_][]const u8{ "letter", "number" }) |fname| {
        var v = @field(s, fname);
        switch (@TypeOf(v)) {
            u8 => result += 1,
            u64 => result += 2,
            else => unreachable,
        }
    }
    std.debug.print("result={}\n", .{result});
}

My question is: will the field differentiation logic (the switch) happen at comptime? That is, will the compiled program be similar to

result += 1;
result += 2;

-- or will it be more like

const type1 = u8;
switch (type1) {
  u8 => result += 1,
  u64 => result += 2,
  else => unreachable,
}
const type2 = u64;
switch (type2) {
  u8 => result += 1,
  u64 => result += 2,
  else => unreachable,
}

I was assuming these switches were unnecessary, since everything required to compute them is known at compile time, but apparently the switch is being evaluated: if I declare S.number to be u32, for example, execution panics at the unreachable.

If that is the case, how can I rewrite this to make it more performant?


There are 2 best solutions below

Answer 1:

My question is: will the field differentiation logic (the switch) happen at comptime? That is, will the compiled program be similar to

result += 1;
result += 2;

Yes, but not because of the inline for. It's because the value you're switching on, a type, is comptime-only. That means the entire switch is replaced by whichever branch was hit at comptime; the inline for just lets you do this more than once.

So if we deconstruct it step by step (as the compiler would do) it might look something like this:

Step 1:

inline for ([_][]const u8{ "letter", "number" }) |fname| {
  var v = @field(s, fname);
  switch (@TypeOf(v)) {
    u8 => result += 1,
    u64 => result += 2,
    else => unreachable,
  }
}

Step 2:

{
  // fname gets replaced by the actual value
  var v = @field(s, "letter");
  switch (@TypeOf(v)) {
    u8 => result += 1,
    u64 => result += 2,
    else => unreachable,
  }
}
{
  var v = @field(s, "number");
  switch (@TypeOf(v)) {
    u8 => result += 1,
    u64 => result += 2,
    else => unreachable,
  }
}

Step 3:

{
  // v is removed in non-debug build modes
  // because it isn't referenced in the final result

  // the switch is replaced with whatever branch was hit
  result += 1;
}
{
  result += 2;
}

You can see this when you inspect the generated assembly code. Even in debug mode, it's equivalent to:

var v1 = @field(s, "letter");
result += 1;
var v2 = @field(s, "number");
result += 2;
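One way to convince yourself that a switch on a type is resolved during compilation is to use its result where only a comptime-known value is allowed, such as an array length. The following is a minimal sketch, not part of the original code: the helper name weight and the type Buf are made up for illustration. It also uses @compileError instead of unreachable in the else branch, so an unsupported type is reported at build time rather than as a runtime panic.

```zig
const std = @import("std");

// Hypothetical helper: maps a type to a number.
// T is comptime-only, so the switch is resolved by the compiler.
fn weight(comptime T: type) usize {
    return switch (T) {
        u8 => 1,
        u64 => 2,
        // Only analyzed if this branch is actually selected.
        else => @compileError("unsupported type: " ++ @typeName(T)),
    };
}

// An array length must be comptime-known, so this declaration only
// compiles because weight() is evaluated during compilation.
const Buf = [weight(u8) + weight(u64)]u8;

test "switch on a type is resolved at comptime" {
    try std.testing.expectEqual(@as(usize, 3), @sizeOf(Buf));
}
```

Running this with zig test should pass; calling weight(u32) anywhere would fail the build with the message from @compileError.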

Also, the recommended way to iterate over a type's fields is to use either @typeInfo(T).Struct.fields or std.meta.fields, like this:

inline for (@typeInfo(S).Struct.fields) |field| {
  switch (field.type) {
    // ...
  }
}
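Filled in for the struct from the question, that approach might look like the sketch below. It assumes a Zig version where the field descriptor exposes field.type (0.11 or newer), and it uses std.meta.fields so the code does not depend on the exact spelling of the @typeInfo tag, which newer releases write as .@"struct". The function name score is hypothetical.

```zig
const std = @import("std");

const S = struct {
    letter: u8,
    number: u64,
};

// Hypothetical function: adds a per-field amount based on each field's type.
fn score(comptime T: type) i64 {
    var result: i64 = 0;
    // Unrolled once per field; field.type is comptime-known in each copy.
    inline for (std.meta.fields(T)) |field| {
        switch (field.type) {
            u8 => result += 1,
            u64 => result += 2,
            // Reject unsupported field types at build time.
            else => @compileError("unsupported field type: " ++ @typeName(field.type)),
        }
    }
    return result;
}

test "score is derived from the field types" {
    try std.testing.expectEqual(@as(i64, 3), score(S));
}
```

Because the loop works on the type rather than on an instance, no value of S has to be constructed, and there is no unused local like v.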
Answer 2:

inline for is unrolled and the code for each iteration is inlined into the resulting code for compilation. In other words, this

inline for (...) {
    // some code ...
}

becomes

// some code ...
// some code ...

If that is the case, how can I rewrite this to make it more performant?

This doesn't really affect performance, because in release mode the compiler will optimize this code. For example, I used godbolt to check the output assembly for similar code, and the compiler reduced it all down to:

example.main:
        push    rax
        mov     rdi, rsp
        mov     qword ptr [rsp], 3
        call    "log.scoped(.default).err__anon_3359"
        pop     rax
        ret