Can speculative execution of modern CPUs cross loop iterations?

Asked by Changbin Du on 29 June 2025 at 19:54

Consider the loop below (https://godbolt.org/z/z4Wz1aanK), which has no loop-carried dependence. Will a modern CPU speculatively execute the next iteration while the previous one is still in flight? If so, is loop unrolling still necessary here?

void bar(void)
{
    for (int i = 0; i < 1024; i++)
        out[i] = foo(src[i]);
}

The result of compilation:

bar():
       pushq   %rbx
       xorl    %ebx, %ebx
.L2:
       movl    src(%rbx), %edi
       addq    $4, %rbx
       call    foo(int)
       movl    %eax, out-4(%rbx)
       cmpq    $4096, %rbx
       jne     .L2
       popq    %rbx
       ret
src:
       .zero   400
out:
       .zero   400

Update 1: I am now sure that speculative execution can cross loop iterations. The remaining question is how far ahead it can run, given the dependency chain introduced by the loop counter i.


julaine On 24 July 2023 at 07:27

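Yes, this loop will likely benefit from branch prediction and speculative execution. The loop branch (jne .L2) is taken on every iteration except the last, so it is easy to predict, and the CPU can start on later iterations before earlier ones have finished.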

Unrolling loops by hand is generally considered an obsolete optimization; see for example https://www.intel.com/content/www/us/en/developer/articles/technical/avoid-manual-loop-unrolling.html

Speculative execution does not change the observable behaviour of your program. It does not even require compiler support, since it is something the CPU does on its own whenever it encounters a conditional jump. Whether your iterations are predicted correctly depends on what happens inside foo, and possibly on the data in src. If foo contains many conditionals, or if its conditionals follow hard-to-predict patterns, mispredictions will become more frequent and the loop will run slower.
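To make the point about conditionals inside foo concrete, here is a minimal sketch. The body of foo is not shown in the question, so both functions below are hypothetical stand-ins: the first has a branch whose direction depends on the input value, the second computes the same result with a bit mask instead of a jump. Note that a compiler may well turn the first version into branch-free code on its own, so treat this as an illustration of the idea rather than a prediction of what gets generated.

```c
/* Hypothetical foo with a data-dependent branch. The branch direction
   follows the low bit of x, so prediction quality depends on the
   pattern of values stored in src[]. Assumes x is small enough that
   x * 3 + 1 does not overflow. */
int foo_branchy(int x)
{
    if (x & 1)
        return x * 3 + 1;
    return x / 2;
}

/* Same result without a conditional jump in the source: compute both
   candidates and select one with a mask. Same overflow assumption. */
int foo_branchless(int x)
{
    int odd  = x * 3 + 1;
    int even = x / 2;
    int mask = -(x & 1);  /* all ones if x is odd, zero if x is even */
    return (odd & mask) | (even & ~mask);
}
```

If src holds values whose parity is effectively random, the branchy version gives the predictor nothing to learn, while the mask-based version has no branch to mispredict. If the parity follows a regular pattern, the difference may well disappear.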

Other optimizations may still appear in the generated code if the compiler considers them beneficial: loop unrolling, for example, or SIMD operations. To see what the compiler actually does with your code, try https://godbolt.org/
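For reference, this is roughly what unrolling the loop by hand would look like, next to the plain version. It is only a sketch: foo here is a placeholder body (the question does not show the real one), the function names and the array/length parameters are my own, and the unroll factor of 4 is arbitrary.

```c
#include <stddef.h>

/* Placeholder for the question's foo; the real body is not shown there. */
static int foo(int x)
{
    return x + 1;
}

/* Straightforward loop, equivalent in shape to bar() in the question. */
void bar_simple(int *out, const int *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = foo(src[i]);
}

/* Hand-unrolled by 4: one counter update and one loop branch per four
   elements instead of per element. */
void bar_unrolled4(int *out, const int *src, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        out[i]     = foo(src[i]);
        out[i + 1] = foo(src[i + 1]);
        out[i + 2] = foo(src[i + 2]);
        out[i + 3] = foo(src[i + 3]);
    }
    for (; i < n; i++)  /* leftover elements when n is not a multiple of 4 */
        out[i] = foo(src[i]);
}
```

Both functions produce the same output. The unrolled one only reduces loop overhead, which is the part the CPU already hides well when the loop branch is predictable. That is why it is usually better to write the simple form and let the compiler decide.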
