A pragmatic loop unrolling technique

Question

A pragmatic loop unrolling technique

559 Views Asked by Benjamin At 06 June 2025 at 15:25

I'm finding a pragmatic loop unrolling technique example.

I think Duff's device is a one of nice tip.
But Duff's device's destination is never increased. It could be useful for embeded programmer who copies data to serial device, not general programmers.

Could you give me a nice and useful example?
If you have ever used it in your real code, it will be better.

Original Q&A

There are 3 best solutions below

**John Zwinck** · Answer 1

John Zwinck On 03 April 2011 at 01:33

The most pragmatic technique would be to learn and love your compiler's optimization options, and occasionally inspect the generated assembly by hand if you encounter hotspots in profiling.

**holtavolt** · Answer 2

(For the benefit of others, background on Duff's device can be found here and here)

I've encountered it in image processing optimizations, especially to handle border conditions where fewer pixels than a complete tile or kernel are to be copied (and this can avoid the test at each coordinate.)

**Potatoswatter** · Answer 3

I'm not sure what you mean by "destination is never increased."

Manual loop unrolling is rather uncommon. Embedded microprocessors today are fast enough that such optimization is unnecessary (and would waste valuable program memory).

I use a variation of Duff's device in a linear solver kernel. There must be one back_step for each fwd_step, and they are performed in groups of four.

Note that the forward and backward-going loops are implemented by gotos. When the if in fwd_step is skipped, execution jumps into the middle of the backward loop. So it's really a kind of double Duff's device.

This isn't any kind of "pragmatic" technique, it's just the best way I could find to express some very convoluted flow control.

switch ( entry ) {

#define fwd_step( index )                                                                                         \
                                                                                                     \
case (index):                                                                                                    \
    if ( -- count ) {                                                                                               \
        ...

startf:
    fwd_step( 0 )
    fwd_step( 1 )
    fwd_step( 2 )
    fwd_step( 3 )

        stream = stream_back;
        goto startf;


#define back_step( index )                                                            \
        .... \
    }                                                                                    \

startb:
    stream -= block_size;

    back_step( 3 )
    if ( ! -- countb ) break;
    back_step( 2 )
    if ( ! -- countb ) break;
    back_step( 1 )
    if ( ! -- countb ) break;
    back_step( 0 )
        if ( -- countb ) goto startb;
} // end switch

A pragmatic loop unrolling technique

There are 3 best solutions below

Related Questions in C++

Related Questions in C

Related Questions in OPTIMIZATION

Related Questions in LOOPS

Related Questions in DUFFS-DEVICE

Trending Questions

Popular # Hahtags

Popular Questions