The Intel intrinsic functions have the subtype of the vector built into their names. For example, `_mm_set1_ps` is a `ps`, which is packed single-precision, a.k.a. a float. Although the meaning of most of them is clear, their "full name", like packed single-precision, isn't always clear from the function descriptions. I have created the following table. Unfortunately, some entries are missing. What are their values? Additional questions are below the table.
| abbreviation | full name | C/++ equivalent |
|---|---|---|
| ps | packed single-precision | float |
| ph | packed half-precision | None** |
| pd | packed double-precision | double |
| pch | packed half-precision complex | None** |
| pi8 | ??? | int8_t |
| pi16 | ??? | int16_t |
| pi32 | ??? | int32_t |
| epi8 | ??? | int8_t |
| epi16 | ??? | int16_t |
| epi32 | ??? | int32_t |
| epi64 | ??? | int64_t |
| epi64x | ??? | int64_t |
Additional questions:
- Have I missed any?
- What is the difference between `epiX` and `piX`?
- Why does no `pi64` exist?
- What is the difference between `epi64` and `epi64x`?
** I have found this, but there seems to be no standard way to represent a half precision (complex) value in C/++. Please correct me if this has changed in any way.
The missing versions are at least `si128` and `si64`, used in bitwise operations, and `[e]pu{8,16,32,64}` for unsigned operations.

`epi` and `pi` differ in the `e`, which probably means extended: `epi` targets a 128-bit XMM register, while `pi` targets the 64-bit MMX registers.

`pi64` does not exist because the original MMX instruction set was limited to 32-bit elements; `si64` is still available.
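To make the signed/unsigned and bitwise suffixes concrete, here is a minimal sketch, assuming an x86 target with SSE2. The wrapper names (`byte_at`, `adds_epi8_lane0`, `adds_epu8_lane0`, `and_si128_lane0`) are hypothetical helpers made up for this illustration; only the `_mm_*` calls are real intrinsics.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 */
#include <stdint.h>

/* Hypothetical helper: read one byte lane back out of a 128-bit vector. */
static uint8_t byte_at(__m128i v, int lane) {
    uint8_t out[16];
    assert(lane >= 0 && lane < 16);
    _mm_storeu_si128((__m128i *)out, v);
    return out[lane];
}

/* epi8: the 16 bytes are treated as signed, so the sum saturates at 127 / -128. */
int8_t adds_epi8_lane0(int8_t a, int8_t b) {
    __m128i r = _mm_adds_epi8(_mm_set1_epi8((char)a), _mm_set1_epi8((char)b));
    return (int8_t)byte_at(r, 0);
}

/* epu8: the same 16 bytes are treated as unsigned, so the sum saturates at 255. */
uint8_t adds_epu8_lane0(uint8_t a, uint8_t b) {
    __m128i r = _mm_adds_epu8(_mm_set1_epi8((char)a), _mm_set1_epi8((char)b));
    return byte_at(r, 0);
}

/* si128: the register is one 128-bit blob; bitwise AND has no element size. */
uint8_t and_si128_lane0(uint8_t a, uint8_t b) {
    __m128i r = _mm_and_si128(_mm_set1_epi8((char)a), _mm_set1_epi8((char)b));
    return byte_at(r, 0);
}
```

With these helpers, `adds_epi8_lane0(127, 1)` stays at 127 while `adds_epu8_lane0(127, 1)` yields 128: the bits in the register are identical, and only the suffix decides how they are interpreted.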
The main reason for using `epi64x` instead of `epi64` has to do with the lack of function overloading in C. There was a need to provide set/conversion functions both for `__m128i _mm_set1_epi64(__m64)`, which moves from MMX to XMM, and for `__m128i _mm_set1_epi64x(int64_t)`, which works with integers.

Additionally, it seems that in the rest of the cases the `64x` suffix is reserved for modes requiring a 64-bit architecture, as in `movq` between a register and the low half of an `__m128i`, which could be emulated by multiple instructions, and for something like `__int64 _mm_cvtsd_si64x(__m128d a)`, which converts a double to a 64-bit register target (not to memory directly).

I would speculate that `si64` and `si128` mean scalar integer of width 64/128. Notice that there exists `_mm_add_si64` (which is not an original SSE intrinsic, but an SSE2 intrinsic extending the original MMX instruction set and using MMX registers). It's `si64`, not `pi64`, because only one element of the same size as the whole register is involved.

Lastly, `piN` means packed integer of element size N targeting MMX (`__m64`), and `epiN` means packed integer of element size N targeting XMM (`__m128i`).
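As a rough illustration of what the element size in `epiN` changes, the sketch below adds the same two `__m128i` values once as 64-bit lanes and once as 32-bit lanes. It assumes an x86 target with SSE2 and a compiler that provides `_mm_set1_epi64x`; the names `qword_at`, `add_as_epi64` and `add_as_epi32` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 */
#include <stdint.h>

/* Hypothetical helper: read one 64-bit lane back out of a 128-bit vector. */
static int64_t qword_at(__m128i v, int lane) {
    int64_t out[2];
    assert(lane == 0 || lane == 1);
    _mm_storeu_si128((__m128i *)out, v);
    return out[lane];
}

/* _mm_set1_epi64x takes a plain 64-bit integer (no __m64 involved).
   epi64: two 64-bit lanes, so a carry can cross bit 32 within a lane. */
int64_t add_as_epi64(int64_t a, int64_t b) {
    __m128i r = _mm_add_epi64(_mm_set1_epi64x(a), _mm_set1_epi64x(b));
    return qword_at(r, 0);
}

/* epi32: the same bits viewed as four 32-bit lanes, so each lane wraps
   on its own and nothing carries into the neighbouring lane. */
int64_t add_as_epi32(int64_t a, int64_t b) {
    __m128i r = _mm_add_epi32(_mm_set1_epi64x(a), _mm_set1_epi64x(b));
    return qword_at(r, 0);
}
```

Calling `add_as_epi64(0xFFFFFFFF, 1)` gives `0x100000000`, whereas `add_as_epi32(0xFFFFFFFF, 1)` gives 0, because the low 32-bit lane wraps around and the carry is dropped.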