How can I divide 16 8-bit integers by 4 (or shift them 2 to the right) using SSE intrinsics?
Divide 8-bit integers by 4 (or shift) using SSE
Asked by miho
Unfortunately, SSE has no shift instructions for 8-bit elements. If the elements are 8-bit unsigned, you can use a 16-bit shift and then mask out the unwanted high bits of each byte, i.e. the bits that were shifted in from the neighbouring byte.
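A minimal sketch of the unsigned case, assuming SSE2 is available. The function name `shift_right_2_epu8` is made up for illustration and is not a real intrinsic. A 16-bit logical shift by 2 moves the low two bits of each odd byte into the top two bits of the even byte below it, so ANDing every byte with `0x3F` clears them:

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper (name chosen for this example): shifts each of the
// 16 unsigned 8-bit lanes right by 2, i.e. divides each lane by 4.
inline __m128i shift_right_2_epu8(__m128i v)
{
    // Shift as eight 16-bit lanes; bits leak across byte boundaries.
    __m128i shifted = _mm_srli_epi16(v, 2);
    // Keep only the low 6 bits of every byte to discard the leaked bits.
    return _mm_and_si128(shifted, _mm_set1_epi8(0x3F));
}
```

For a different shift count `n`, the mask would be `0xFF >> n` in every byte.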
For 8-bit signed elements it's a little fiddlier, but still possible. It may well be easier to unpack to 16 bits, do the shifts, and then pack back to 8 bits.
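One way to sketch the unpack/shift/pack approach for signed elements, again assuming only SSE2. The name `shift_right_2_epi8` is hypothetical. Interleaving the vector with itself puts each byte in the high half of a 16-bit lane, so an arithmetic shift by 8 + 2 sign-extends and shifts in one step:

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper (name chosen for this example): arithmetic shift
// right by 2 of each of the 16 signed 8-bit lanes.
inline __m128i shift_right_2_epi8(__m128i v)
{
    // Duplicate every byte into both halves of a 16-bit lane.
    __m128i lo = _mm_unpacklo_epi8(v, v);  // bytes 0..7
    __m128i hi = _mm_unpackhi_epi8(v, v);  // bytes 8..15
    // Shifting by 8 would recover the sign-extended byte; 2 more bits
    // perform the actual shift. The duplicated low byte is discarded.
    lo = _mm_srai_epi16(lo, 8 + 2);
    hi = _mm_srai_epi16(hi, 8 + 2);
    // Results lie in [-32, 31], so the signed saturation never triggers.
    return _mm_packs_epi16(lo, hi);
}
```

Note that an arithmetic shift rounds toward negative infinity, whereas integer division in C++ truncates toward zero, so the two differ for negative values that are not multiples of 4 (for example, -5 shifts to -2, but -5 / 4 is -1).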