I was recently faced with a given problem:
There are 8 elements in the vector, each is represented by int8_t.
Implement an algorithm in x86_64 that will add two vectors (uint64_t type).
Adding elements should be done with saturation arithmetic taken into account.
E.g.:
80 + 60 = 127
(−40) + (−100) = −128
The biggest challenge turns out to be the restrictions imposed:
- No conditional instructions except ret; no jumps, cmove, set, etc.
- The solution cannot be longer than 48 instructions (there exists a solution shorter than 37 instructions)
I can't think of any solution that fits these restrictions. Could anyone give me some hints? Examples in C are welcome.
I can use only "standard", transfer, arithmetic, logical instructions and standard registers:
- mov cbw/cwde/cdqe cwd/cdq/cqo movzx movsx
- add sub imul mul idiv div inc dec neg
- and or xor not sar sarx shr shrx shl shlx ror rol
- lea ret
I wrote it in C++ like this:
Then I went over to https://godbolt.org/ to see what various compilers generate. clang-16 gives 33 instructions:
You can try the various other options.