I'm looking for a pragmatic example of a loop unrolling technique.
I think Duff's device is a nice trick.
But the destination pointer in Duff's device is never incremented. That is useful for an embedded programmer who copies data to a serial device, but not for general programmers.
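For reference, this is roughly the form of the device I mean. It is only a sketch: the name `duff_send` and the guard for a zero count are my own additions for illustration, and `to` is assumed to point at a single output register, which is why it is never incremented.

```c
#include <stddef.h>

/* Sketch of Duff's device in its usual form. `to` stands for one
   memory-mapped output register, so it is deliberately NOT incremented.
   (duff_send is a hypothetical name used for illustration.) */
void duff_send(volatile short *to, const short *from, size_t count)
{
    if (count == 0)
        return;                      /* the classic version assumes count > 0 */

    size_t n = (count + 7) / 8;      /* passes through the unrolled body */

    /* The switch jumps into the middle of the loop body to handle the
       count % 8 leftover items first; every later pass does all 8. */
    switch (count % 8) {
    case 0: do { *to = *from++;
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
            } while (--n > 0);
    }
}
```

Writing `*to++` instead of `*to` would turn this into an ordinary memory-to-memory copy, but for that case `memcpy` is the obvious choice.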
Could you give me a nice and useful example?
It would be even better if it is something you have used in real code.
The most pragmatic technique would be to learn and love your compiler's optimization options, and occasionally inspect the generated assembly by hand if you encounter hotspots in profiling.
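As a rough illustration of that advice, here is a minimal sketch that puts a plain loop next to a hand-unrolled version of it. The function names `sum_plain` and `sum_unrolled4` are hypothetical, and the unroll factor of 4 is an arbitrary choice for the example, not a recommendation.

```c
#include <stddef.h>

/* Plain loop: the form to write first and hand to the optimizer. */
long sum_plain(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Hand-unrolled by 4 with separate accumulators, followed by a
   cleanup loop for the 0 to 3 elements that are left over. */
long sum_unrolled4(const int *a, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    long total = s0 + s1 + s2 + s3;
    for (; i < n; i++)               /* remainder */
        total += a[i];
    return total;
}
```

An optimizing compiler may well unroll or vectorize `sum_plain` on its own, depending on the compiler and the options used (GCC, for example, has `-funroll-loops`). Compiling both versions with `-S` and comparing the generated assembly shows whether the manual version gains anything for a particular hotspot.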