I'm doing a binary search with bsearch(array, arrays, NUM_ARRAYS, 16, compare_func)
and
int compare(const void *p1, const void *p2)
{
return memcmp(p1, p2, 16); // unsigned char arrays[][16]
}
Since it is 16 bytes, it would fit in a single 128-bit register.
How to modify this C function to force the comparison to be done with a 128-bit register CPU instruction? It should be much faster.
Linked questions: Comparison of 128 bit unsigned integers in x86-32 assembly, What are the 128-bit to 512-bit registers used for? but it doesn't answer directly.
If numbers are stored in big endian order and the pointers are aligned on 16 byte boundaries, the comparison as unsigned 128 bit values will produce the same result as
memcmp
. Whether it will be more efficient will depend on the compiler and optimisation settings.Here is a modified function:
The problem is your target system likely uses little endian order (eg: x86 CPUs). If your goal is to find the array in an array of arrays, you could still use this trick as long as the array is sorted using the same comparison.
Using
bsearch
requires a function pointer that returns a signed value equal to0
for elements that compare equal, is negative if the element pointed to byp1
is less than the one pointed to byp2
and a positive value otherwise. Another problem with this approach is type punning and alignment issues which produce undefined behavior.It would be safer and more efficient to write a binary search function that operates on an array of unions and uses a single comparison per iteration to locate the matching entry. This array must be sorted and sorting it can be performed using
qsort()
with thecompare128()
function.Here is an example:
On platforms without 128-bit integer support, you can use this: