Is there a way to add 32-bit signed words with saturation using MMX/SSE assembler instructions? I can find 8-bit and 16-bit versions, but no 32-bit ones.
Add 32-bit words with saturation
2.9k Views Asked by LooPer At
There are 2 best solutions below
Michiel
Saturated unsigned subtraction is easy, because for `a -= b` we can do
    asm (
        "pmaxud %1, %0\n\t"  // a = max(a, b)
        "psubd %1, %0"       // a -= b, which can no longer wrap below zero
        : "+x" (a)
        : "xm" (b)
    );
with SSE4.1 (`pmaxud` is not available in earlier SSE versions).
I was looking for unsigned addition, but possibly the only way is to transform it into a saturated unsigned subtraction, perform that, and transform back. The same goes for the signed variants.
EDIT: for unsigned addition, this transformation gives min(a, ~b) + b, which of course works. With signed addition and subtraction there are two saturation boundaries, which makes things more complicated.
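As a rough illustration of the min(a, ~b) + b idea, here is a minimal sketch using C intrinsics instead of inline asm. The names `sat_add_epu32`, `sat_add_u32x4` and `TARGET_SSE41` are made up for this example, and it assumes GCC or Clang on an x86 CPU with SSE4.1 (needed for `pminud`, i.e. `_mm_min_epu32`).

```c
#include <assert.h>
#include <smmintrin.h>  /* SSE4.1 intrinsics (also pulls in SSE2) */
#include <stdint.h>

/* Assumption: GCC/Clang. The target attribute enables SSE4.1 for these
   functions even when the file is not compiled with -msse4.1. */
#if defined(__GNUC__)
#define TARGET_SSE41 __attribute__((target("sse4.1")))
#else
#define TARGET_SSE41
#endif

/* Hypothetical helper: lane-wise unsigned saturating add, min(a, ~b) + b. */
TARGET_SSE41
static __m128i sat_add_epu32(__m128i a, __m128i b)
{
    /* ~b equals UINT32_MAX - b, the largest value that can be added to b
       without wrapping. */
    __m128i not_b   = _mm_xor_si128(b, _mm_set1_epi32(-1));
    __m128i clamped = _mm_min_epu32(a, not_b);   /* pminud */
    return _mm_add_epi32(clamped, b);            /* paddd, cannot wrap now */
}

/* Hypothetical array wrapper so the helper is easy to call and check. */
TARGET_SSE41
void sat_add_u32x4(const uint32_t a[4], const uint32_t b[4], uint32_t out[4])
{
    assert(a && b && out);
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, sat_add_epu32(va, vb));
}
```

Clamping a to ~b first means the following plain add tops out at 0xFFFFFFFF instead of wrapping, mirroring how the `pmaxud`/`psubd` pair bottoms out at zero.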
You can emulate saturated signed adds by performing the following steps:
For unsigned values it is even simpler; see this Stack Overflow posting.
In SSE2, the above maps to a sequence of parallel compares and AND/ANDN operations. No single operation is available in hardware, unfortunately.
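One possible way to write this with SSE2 intrinsics is sketched below. It is not necessarily the exact sequence meant above, just one arrangement of compares and AND/ANDN that yields a signed saturating add; the names `sat_add_epi32` and `sat_add_i32x4` are invented for the example.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: lane-wise signed saturating add of four int32 values. */
static __m128i sat_add_epi32(__m128i a, __m128i b)
{
    const __m128i zero    = _mm_setzero_si128();
    const __m128i int_max = _mm_set1_epi32(INT32_MAX);

    __m128i sum = _mm_add_epi32(a, b);            /* wrapping add (paddd) */

    /* Overflow occurred exactly in lanes where a and b share a sign that
       sum does not have: the sign bit of (a ^ sum) & (b ^ sum) is set. */
    __m128i ovf_bits = _mm_and_si128(_mm_xor_si128(a, sum),
                                     _mm_xor_si128(b, sum));
    __m128i ovf_mask = _mm_cmpgt_epi32(zero, ovf_bits);  /* all-ones if overflow */

    /* Saturation value per lane: INT32_MAX if a >= 0, INT32_MIN if a < 0
       (0x7FFFFFFF ^ 0xFFFFFFFF == 0x80000000). */
    __m128i a_neg = _mm_cmpgt_epi32(zero, a);
    __m128i sat   = _mm_xor_si128(int_max, a_neg);

    /* Blend: take sat where overflow happened, sum elsewhere (pand/pandn/por). */
    return _mm_or_si128(_mm_and_si128(ovf_mask, sat),
                        _mm_andnot_si128(ovf_mask, sum));
}

/* Hypothetical array wrapper so the helper is easy to call and check. */
void sat_add_i32x4(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
    assert(a && b && out);
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, sat_add_epi32(va, vb));
}
```

The two saturation boundaries are handled by choosing the clamp value from the sign of a: when overflow occurs, both operands have the same sign, so a negative a means the result must clamp to INT32_MIN and a non-negative a means INT32_MAX.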