I see many instructions with shorthand names such as "_mm_and_si128". I want to know what the "mm" means.
In SIMD (SSE2), many instructions are named like "_mm_set_epi8", "_mm_cmpgt_epi8" and so on. What do "mm" and "epi" mean?
Asked by dongwang
See What are the names and meanings of the intrinsic vector element types, like epi64x or pi32? for the element types.
The `_mm_` function naming very likely stands for MMX or Multi Media, or the `mm0-7` register naming in assembly. Intel started this naming scheme for C intrinsics with the first SIMD extension they introduced for x86, MMX, which used 64-bit vectors (in registers `mm0-7`, or the C intrinsic type `__m64`).

Officially, and apparently as a legal defense that lets them trademark MMX, it's not an initialism for something longer. But unofficially it's widely thought of as Multi-Media eXtensions.
SSE2 added 128-bit versions of those integer-SIMD instructions, using the XMM0-7 registers (XMM0-15 in 64-bit mode) introduced with SSE1, which mostly added single-precision floating point in those registers, and some new integer instructions on MMX registers. (SSE2 also added scalar and packed-double in XMM regs.) See the tag wiki, https://stackoverflow.com/tags/sse/info, for more history.
Intel continued their naming pattern, like `_mm_add_epi8` as the SSE2 128-bit version of MMX `_mm_add_pi8`, not changing the intro to `_xmm_add` or anything like that. As discussed in What are the names and meanings of the intrinsic vector element types, like epi64x or pi32?, the `e` for Extended is what indicates that it's a vector wider than 64-bit of packed i8 or u8 or whatever.

The `__m64` / `__m128i` type names don't seem to stand for anything, but the similar naming to `_mm_` and `_mm256` function names is clearly for association.
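To make the naming concrete, here is a minimal sketch in C of how the pieces read in practice. The helper names `add_16_bytes` and `greater_mask_16_bytes` are hypothetical, made up for illustration; the intrinsics themselves are standard SSE2 ones from `<emmintrin.h>`. It assumes an x86 target with SSE2 enabled (the default on x86-64).

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i and the _mm_*_epi8 family */
#include <stdint.h>

/* Hypothetical helper: add two arrays of 16 signed bytes, lane by lane.
 * _mm_      : the common prefix for 64-bit and 128-bit intrinsics
 * add       : the operation
 * epi8      : 128-bit vector of packed 8-bit integers (16 lanes)
 * si128     : the whole 128-bit register, with no element boundaries */
void add_16_bytes(const int8_t *a, const int8_t *b, int8_t *out)
{
    __m128i va  = _mm_loadu_si128((const __m128i *)a);  /* unaligned 128-bit load */
    __m128i vb  = _mm_loadu_si128((const __m128i *)b);
    __m128i sum = _mm_add_epi8(va, vb);                 /* 16 wrapping 8-bit adds */
    _mm_storeu_si128((__m128i *)out, sum);              /* unaligned 128-bit store */
}

/* Hypothetical helper: return a 16-bit mask with bit i set
 * where a[i] > b[i] (signed compare). */
int greater_mask_16_bytes(const int8_t *a, const int8_t *b)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i gt = _mm_cmpgt_epi8(va, vb);  /* each lane: 0xFF if a > b, else 0x00 */
    return _mm_movemask_epi8(gt);         /* collect the top bit of each byte */
}
```

Calling `add_16_bytes` on two 16-element `int8_t` arrays writes the 16 lane-wise sums to `out`. Note that the load and store use `si128` because they move the full register without caring about element width, while the add and compare use `epi8` because the lane width matters there.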