__device__ __half2 __h2div ( const __half2 a, const __half2 b )
Description:
Divides half2 input vector a by input vector b in round-to-nearest mode.
__device__ __half2 __hmul2 ( const __half2 a, const __half2 b )
Description:
Performs half2 vector multiplication of inputs a and b, in round-to-nearest-even mode.
Can someone explain to me exactly what operations are performed by each of these?
 
                        
Both are elementwise operations. A __half2 is a vector type, meaning it has multiple elements (2) of a simpler type, namely half (i.e. a 16-bit floating-point quantity). These vector types are basically structures where the individual elements are accessed using the structure references .x, .y, .z, and .w, for vector types up to 4 elements.

If we have two items (a, b) that are each of __half2 type, the division operation:

    __half2 result = __h2div(a, b);

will create a result where the first element of result is equal to the first element of a divided by the first element of b, and likewise for the second element. This means when complete, the following statements should "approximately" be correct:

    result.x == a.x / b.x
    result.y == a.y / b.y
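To make the elementwise division concrete, here is a minimal sketch rather than a definitive implementation. The kernel name divide_pair and the host helper run_divide_pair are hypothetical names chosen for illustration. It assumes a GPU and compile target that support half arithmetic (compute capability 5.3 or newer), and it converts to and from float only so the host can inspect the two lanes.

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Hypothetical kernel: packs two float pairs into __half2 values, divides them
// elementwise with __h2div, and writes both lanes of the result back as floats.
__global__ void divide_pair(float ax, float ay, float bx, float by, float *out)
{
    __half2 a = __floats2half2_rn(ax, ay);  // a.x = ax, a.y = ay
    __half2 b = __floats2half2_rn(bx, by);  // b.x = bx, b.y = by
    __half2 result = __h2div(a, b);         // result.x = a.x / b.x, result.y = a.y / b.y
    out[0] = __low2float(result);           // first element (.x)
    out[1] = __high2float(result);          // second element (.y)
}

// Hypothetical host helper: launches one thread and copies the two results back.
// Returns false if any CUDA call fails.
bool run_divide_pair(float ax, float ay, float bx, float by, float out[2])
{
    float *d_out = nullptr;
    if (cudaMalloc(&d_out, 2 * sizeof(float)) != cudaSuccess) {
        return false;
    }
    divide_pair<<<1, 1>>>(ax, ay, bx, by, d_out);
    // The blocking copy also surfaces errors from the kernel launch.
    cudaError_t err = cudaMemcpy(out, d_out, 2 * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

Calling run_divide_pair(6.0f, 1.0f, 2.0f, 4.0f, out) should leave 3 in out[0] and 0.25 in out[1], since both quotients are exactly representable in half precision.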
The multiplication operation:

    __half2 result = __hmul2(a, b);

will create a result where the first element of result is equal to the first element of a multiplied by the first element of b, and likewise for the second element. This means when complete, the following statements should "approximately" be correct:

    result.x == a.x * b.x
    result.y == a.y * b.y
("approximately" means there may be rounding differences, depending on your exact code and possibly other factors, like compile switches)
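The same elementwise behavior applies when __hmul2 is used across an array of pairs. The following is a sketch under stated assumptions: multiply_pairs and run_multiply_pairs are hypothetical names, inputs are passed as float2 and converted on the device purely for convenience, and the target GPU is assumed to support half arithmetic (compute capability 5.3 or newer).

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: each thread converts one pair of float2 inputs to
// __half2, multiplies them elementwise, and stores the product as float2.
__global__ void multiply_pairs(const float2 *a, const float2 *b, float2 *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        __half2 ha = __float22half2_rn(a[i]);
        __half2 hb = __float22half2_rn(b[i]);
        __half2 r  = __hmul2(ha, hb);   // r.x = ha.x * hb.x, r.y = ha.y * hb.y
        out[i] = __half22float2(r);
    }
}

// Hypothetical host helper: copies inputs to the device, runs the kernel,
// and copies the products back. Returns false on size mismatch or CUDA error.
bool run_multiply_pairs(const std::vector<float2> &a,
                        const std::vector<float2> &b,
                        std::vector<float2> &out)
{
    if (a.size() != b.size()) return false;
    const int n = static_cast<int>(a.size());
    out.resize(a.size());
    if (n == 0) return true;

    const size_t bytes = a.size() * sizeof(float2);
    float2 *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    bool ok = cudaMalloc(&d_a, bytes) == cudaSuccess &&
              cudaMalloc(&d_b, bytes) == cudaSuccess &&
              cudaMalloc(&d_out, bytes) == cudaSuccess;
    if (ok) {
        ok = cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
             cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    }
    if (ok) {
        const int threads = 128;
        const int blocks = (n + threads - 1) / threads;
        multiply_pairs<<<blocks, threads>>>(d_a, d_b, d_out, n);
        // The blocking copy also surfaces errors from the kernel launch.
        ok = cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);
    return ok;
}
```

With inputs such as (1.5, 2) times (2, 4), the products 3 and 8 are exactly representable in half precision, so the comparison holds exactly; for arbitrary inputs only the "approximately" form should be expected.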
Regarding rounding, it's no different than when these terms are applied in other (non-CUDA) contexts. Roughly speaking:
"Round to nearest" is what I would consider the usual form of rounding. When an arithmetic result is not exactly representable in the type, the nearest type representation will be chosen, so that the difference between the exact result and the stored value is as small as possible.
"Round to nearest even" is a modification of the above description: in the exact midpoint case, where two type representations are equally close, it chooses the one that has an even least significant digit.
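The midpoint rule can be observed with a float-to-half conversion, which the CUDA documentation describes as round-to-nearest-even for __float2half_rn. The sketch below is illustrative only: round_trip_half and round_through_half are hypothetical names, and the conversion is used as a stand-in for the rounding step at the end of an arithmetic operation. Near 1.0 the spacing between adjacent half values is 2^-10, so 1 + 2^-11 and 1 + 3 * 2^-11 are exact midpoints.

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Hypothetical kernel: rounds a float to half precision (round-to-nearest-even)
// and converts it back to float so the host can see which value was chosen.
__global__ void round_trip_half(float in, float *out)
{
    *out = __half2float(__float2half_rn(in));
}

// Hypothetical host helper: returns false if any CUDA call fails.
bool round_through_half(float value, float *rounded)
{
    float *d_out = nullptr;
    if (cudaMalloc(&d_out, sizeof(float)) != cudaSuccess) {
        return false;
    }
    round_trip_half<<<1, 1>>>(value, d_out);
    // The blocking copy also surfaces errors from the kernel launch.
    cudaError_t err = cudaMemcpy(rounded, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

Passing 1.00048828125f (that is, 1 + 2^-11) lies exactly between 1.0 and 1.0009765625; the tie goes to 1.0, whose last mantissa bit is 0. Passing 1.00146484375f (1 + 3 * 2^-11) lies exactly between 1.0009765625 and 1.001953125; the tie goes up to 1.001953125 for the same reason. A non-midpoint input such as 1.0006f simply goes to the closer neighbor, 1.0009765625.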