What are vector division and multiplication as in CUDA __half2 arithmetic?


__device__ __half2 __h2div ( const __half2 a, const __half2 b )
Description:
Divides half2 input vector a by input vector b in round-to-nearest mode.

__device__ __half2 __hmul2 ( const __half2 a, const __half2 b )
Description:
Performs half2 vector multiplication of inputs a and b, in round-to-nearest-even mode.

Can someone explain to me exactly what operations these two functions perform?

On BEST ANSWER

Both are elementwise operations. A __half2 is a vector type, meaning it has multiple elements (two) of a simpler type, namely half (i.e. a 16-bit floating-point quantity). These vector types are basically structures whose individual elements are accessed through the members .x, .y, .z, and .w, for vector types of up to 4 elements. A __half2 has only .x and .y.
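To make the "structure with two elements" picture concrete, here is a minimal host-side sketch that packs two floats into a __half2 and reads each lane back. The names Lanes and pack_and_unpack are hypothetical helpers made up for this illustration, and the sketch assumes a CUDA toolkit recent enough that the cuda_fp16.h conversion functions are callable from host code.

```cuda
#include <cuda_fp16.h>

// Hypothetical helper type for this illustration: the two lanes as floats.
struct Lanes { float x, y; };

// Pack two floats into one __half2, then read each lane back.
// __floats2half2_rn places its first argument in .x and its second in .y.
Lanes pack_and_unpack(float first, float second) {
    __half2 v = __floats2half2_rn(first, second);
    Lanes out;
    out.x = __half2float(v.x);  // first element of the vector
    out.y = __half2float(v.y);  // second element of the vector
    return out;
}
```

Calling pack_and_unpack(1.5f, -2.0f) should give back 1.5 in .x and -2.0 in .y, since both values are exactly representable in half precision.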

Suppose we have two items, a and b, each of type __half2.

The division operation:

__half2 a,b;
__half2 result = __h2div(a, b);

will create a result where the first element of result is equal to the first element of a divided by the first element of b, and likewise for the second element.

This means when complete, the following statements should "approximately" be correct:

result.x == a.x/b.x;
result.y == a.y/b.y;

The multiplication operation:

__half2 a,b;
__half2 result = __hmul2(a, b);

will create a result where the first element of result is equal to the first element of a multiplied by the first element of b, and likewise for the second element.

This means when complete, the following statements should "approximately" be correct:

result.x == a.x*b.x;
result.y == a.y*b.y;

("approximately" means there may be rounding differences, depending on your exact code and possibly other factors, like compile switches)
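As a rough sketch of the elementwise behaviour described above, the following kernel applies __h2div and __hmul2 to one pair of vectors and, for comparison, applies the scalar intrinsics __hdiv and __hmul to each lane separately. The names LanePair, Demo, elementwise_kernel, and run_elementwise_demo are hypothetical and exist only for this illustration. It assumes a GPU and build target on which half arithmetic intrinsics are available (compute capability 5.3 or higher on older toolkits).

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

struct LanePair { float x, y; };

struct Demo {
    LanePair vec_div;   // lanes of __h2div(a, b)
    LanePair vec_mul;   // lanes of __hmul2(a, b)
    LanePair lane_div;  // __hdiv applied to each lane separately
    LanePair lane_mul;  // __hmul applied to each lane separately
};

__global__ void elementwise_kernel(float ax, float ay, float bx, float by, Demo* out) {
    const __half2 a = __floats2half2_rn(ax, ay);
    const __half2 b = __floats2half2_rn(bx, by);

    // Vector forms: one call handles both lanes.
    const __half2 q = __h2div(a, b);
    const __half2 p = __hmul2(a, b);
    out->vec_div.x = __half2float(q.x);
    out->vec_div.y = __half2float(q.y);
    out->vec_mul.x = __half2float(p.x);
    out->vec_mul.y = __half2float(p.y);

    // Scalar forms: the same arithmetic, one lane at a time.
    out->lane_div.x = __half2float(__hdiv(a.x, b.x));
    out->lane_div.y = __half2float(__hdiv(a.y, b.y));
    out->lane_mul.x = __half2float(__hmul(a.x, b.x));
    out->lane_mul.y = __half2float(__hmul(a.y, b.y));
}

// Runs the kernel once and copies the results back; returns false on a CUDA error.
bool run_elementwise_demo(float ax, float ay, float bx, float by, Demo* result) {
    Demo* d_out = nullptr;
    if (cudaMalloc(&d_out, sizeof(Demo)) != cudaSuccess) return false;
    elementwise_kernel<<<1, 1>>>(ax, ay, bx, by, d_out);
    cudaError_t err = cudaMemcpy(result, d_out, sizeof(Demo), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

With a = (6, 1) and b = (3, 4), every quotient and product is exactly representable, so the vector results should be (2, 0.25) for the division and (18, 4) for the multiplication, matching the per-lane results.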

Regarding rounding, it's no different than when these terms are applied in other (non-CUDA) contexts. Roughly speaking:

"round to nearest" is what I would consider the usual form of rounding. When an arithmetic result is not exactly representable in the type, the nearest type representation will be chosen so that:

  • if the exact result is closer to the neighboring type-representable value that lies toward zero, that value will be chosen
  • if the exact result is closer to the neighboring type-representable value that lies toward positive or negative infinity (i.e. away from zero), that value will be chosen
  • if the exact result is exactly at the midpoint between the two closest type-representable values, the value farther from zero (toward positive or negative infinity) will be chosen.

"round to nearest even" modifies the above description only in the exact midpoint case: of the two closest type representations, the one with an even least significant digit is chosen (in binary, the one whose last significand bit is 0).
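The midpoint case can be seen with a small host-side sketch. It uses the float-to-half conversion __float2half_rn, which is also documented as round-to-nearest-even, rather than the arithmetic functions themselves. The helper name round_to_half_rn is hypothetical, and the sketch assumes a CUDA toolkit in which the cuda_fp16.h conversions are callable from host code.

```cuda
#include <cuda_fp16.h>

// Round a float to half precision using round-to-nearest-even,
// then widen it back to float so the result is easy to inspect.
float round_to_half_rn(float value) {
    return __half2float(__float2half_rn(value));
}
```

Half precision has an 11-bit significand, so between 2048 and 4096 the representable values are spaced 2 apart (2048, 2050, 2052, ...). That makes 2049 and 2051 exact midpoints. round_to_half_rn(2049.0f) should give 2048 and round_to_half_rn(2051.0f) should give 2052, because 2048 and 2052 are the neighbors with an even last significand bit. Under the plain "round to nearest" rule described above, where midpoints go away from zero, 2049 would instead become 2050.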