Is it beneficial to use glibc's strlen()/strcmp() or roll your own based on SSE4.2?


According to "Schema Validation with Intel® Streaming SIMD Extensions 4 (Intel® SSE4)" (Intel, 2008), Intel added instructions to assist in character searches and comparisons on two operands of 16 bytes at a time. I wrote some basic strlen() and strcmp() functions in C, but they seem slower than glibc's.

I would like to experiment with inline assembly to see how my project behaves when reading and writing XML.

I've read (on here) that using SIMD for things like strlen() is rife with potential problems (memory alignment), so I'm a little concerned about using it in production code.

Answer by Pascal Getreuer:

glibc's implementations will be hard to beat. These functions are carefully optimized and include pieces handwritten in assembly. Here is glibc's x86_64 implementation of strcmp, using AVX2 instructions. Be warned, it is 800 lines:
https://github.com/lattera/glibc/blob/master/sysdeps/x86_64/multiarch/strcmp-avx2.S

For more detail, see also Peter Cordes' fantastic explanation of glibc's implementation.
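
To make the alignment concern from the question concrete, here is a minimal sketch of an SSE4.2-based strlen using intrinsics rather than inline assembly. It is not glibc's method and is unlikely to be faster. The function name `sse42_strlen_sketch` and the macro `HAVE_SSE42_SKETCH` are made up for illustration. It assumes GCC or Clang on x86 (for the `target` attribute) and a CPU that actually supports SSE4.2; on other platforms it falls back to a plain byte loop. The idea is to walk byte by byte until the pointer is 16-byte aligned, then use only aligned 16-byte loads, which cannot straddle a page boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC or Clang on x86, where the target attribute lets a single
 * function use SSE4.2 without compiling the whole file with -msse4.2. */
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <nmmintrin.h> /* SSE4.2 intrinsics, including _mm_cmpistri */
#define HAVE_SSE42_SKETCH 1
#endif

/* Hypothetical name; a sketch, not a drop-in replacement for strlen(). */
#ifdef HAVE_SSE42_SKETCH
__attribute__((target("sse4.2")))
#endif
size_t sse42_strlen_sketch(const char *s)
{
    assert(s != NULL);
    const char *p = s;

    /* Scalar prologue: advance one byte at a time to a 16-byte boundary. */
    while (((uintptr_t)p & 15) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        ++p;
    }

#ifdef HAVE_SSE42_SKETCH
    const __m128i zero = _mm_setzero_si128();
    for (;;) {
        /* Aligned load: stays within one page, so it cannot fault on an
         * unmapped neighbouring page. It may still read bytes past the
         * terminator, which the C standard does not permit and which
         * tools such as sanitizers may report. */
        __m128i chunk = _mm_load_si128((const __m128i *)p);

        /* With an all-zero first operand and EQUAL_EACH, the result is the
         * index of the first null byte in chunk, or 16 if there is none. */
        int idx = _mm_cmpistri(zero, chunk,
                               _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_EACH |
                               _SIDD_LEAST_SIGNIFICANT);
        if (idx < 16)
            return (size_t)(p - s) + (size_t)idx;
        p += 16;
    }
#else
    /* Portable fallback when SSE4.2 intrinsics are unavailable. */
    while (*p != '\0')
        ++p;
    return (size_t)(p - s);
#endif
}
```

Calling `sse42_strlen_sketch("hello")` should return 5, the same as strlen(). Before using something like this in production, benchmark it against glibc on your real XML workload and add a runtime CPU feature check, since the sketch does not verify SSE4.2 support itself.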