I am working on some AVX-512 code and was looking for an equivalent of _mm256_sign_epi8, but I could not find one. Is there an equivalent instruction, or some other way to do this in AVX-512 with similar or lower CPI/latency? Thanks.
AVX2 example:
z = _mm256_sign_epi8(x,y)
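For each byte, the sign of the element of y decides what happens to the matching element of x: it is negated if y is negative, set to zero if y is zero, and left unchanged if y is positive.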
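To pin down the behaviour being asked about, here is a scalar model of a single byte lane. The helper name `sign_epi8_lane` is made up for illustration, and the narrowing cast assumes the usual two's-complement wraparound (as GCC and Clang provide), which matches the instruction leaving -128 as -128 when negated.

```c
#include <stdint.h>

/* Scalar model of one byte lane of _mm256_sign_epi8(x, y).
   Hypothetical helper, for illustration only. */
static int8_t sign_epi8_lane(int8_t x, int8_t y)
{
    if (y < 0)
        return (int8_t)(-x);  /* negate; -(-128) wraps back to -128 */
    if (y == 0)
        return 0;             /* zero the lane */
    return x;                 /* y > 0: keep x unchanged */
}
```

Any AVX-512 replacement has to reproduce these three cases in all 64 byte lanes.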
There is no direct equivalent of _mm256_sign_epi8 in AVX-512.
Quoting https://lemire.me/blog/2024/01/11/implementing-the-missing-sign-instruction-in-avx-512/, one possible replacement is:
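A minimal sketch of such a replacement, using mask registers, could look like the following. It is not necessarily the exact code from the linked post. It assumes AVX-512BW is available, and the names `sign_epi8_avx512` and `sign_epi8_avx512_buf` are made up for illustration. The `target` attribute is a GCC/Clang extension used here only so the snippet compiles without `-mavx512bw`; drop it if you already build with that flag.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: AVX-512BW emulation of the AVX2 sign operation.
   Per byte: -a if b < 0, 0 if b == 0, a if b > 0. */
__attribute__((target("avx512f,avx512bw")))
static inline __m512i sign_epi8_avx512(__m512i a, __m512i b)
{
    const __m512i zero = _mm512_setzero_si512();
    /* One mask bit per byte: set where b is negative (sign bit set). */
    __mmask64 b_neg = _mm512_movepi8_mask(b);
    /* One mask bit per byte: set where b is non-zero. */
    __mmask64 b_nonzero = _mm512_test_epi8_mask(b, b);
    /* Lanes with b < 0 become 0 - a; the other lanes keep a. */
    __m512i negated = _mm512_mask_sub_epi8(a, b_neg, zero, a);
    /* Lanes with b == 0 are forced to zero. */
    return _mm512_maskz_mov_epi8(b_nonzero, negated);
}

/* Hypothetical wrapper for 64 bytes held in memory (unaligned is fine). */
__attribute__((target("avx512f,avx512bw")))
void sign_epi8_avx512_buf(const int8_t *x, const int8_t *y, int8_t *out)
{
    __m512i a = _mm512_loadu_si512((const void *)x);
    __m512i b = _mm512_loadu_si512((const void *)y);
    _mm512_storeu_si512((void *)out, sign_epi8_avx512(a, b));
}
```

This takes several instructions (two mask computations and two masked operations) where AVX2 needs one, so it should not be expected to match the cost of the single AVX2 instruction. Measure on your target CPU before relying on it in a hot loop.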