I do not understand whether or not there is function overloading in CUDA. I want to explain my problem with the following two functions, which I want to be able to use on both the GPU and the CPU, and I don't care about precision:
```cpp
__host__ __device__
float myabs( float v ) {
    return abs( v + 1 ); // I want the floating point absolute value
}

__host__ __device__
float mycos( float v ) {
    return 2.f*cos( v );
}
```
- Which version of `abs` (and, correspondingly, of `cos`) should I call, and why: `std::abs`/`abs`/`fabs`/`fabsf`/anything else, and `std::cos`/`cos`/`cosf`/`__cosf`/anything else? (Since `__cosf` is a CUDA intrinsic and `std::abs`/`std::cos` are not available in CUDA, I assume I have to use preprocessor directives inside my functions to make those choices.)
- Which headers should I include?
- Does the answer to the first two questions depend on whether I compile with fast-math flags (e.g. `-ffast-math`)?
If this is important for the answer: I am compiling with nvcc 10.2 under Ubuntu 18.04.4, but I am rather interested in a platform-independent answer.
If you are using floating point arguments, then conventionally you would use `fabsf` and `cosf`. Those are the standard CUDA Math API implementations (and they correspond to the names of the equivalent C standard library functions).

Conventionally you should include either `math.h` or `cmath`.
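For illustration, here is a minimal sketch of the two functions from the question rewritten along those lines (the function names come from the question; single precision is assumed to be acceptable, as stated there):

```cpp
#include <math.h>  // or <cmath>, as noted above

// Sketch: single-precision math calls usable in both host and device code.
__host__ __device__
float myabs( float v ) {
    return fabsf( v + 1 );  // single-precision absolute value
}

__host__ __device__
float mycos( float v ) {
    return 2.f*cosf( v );   // single-precision cosine
}
```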
As for fast-math flags: no, neither of those functions will be substituted with fast intrinsics by fast math.
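If the `__cosf` intrinsic mentioned in the question is actually wanted, it therefore has to be called explicitly, and it is only available in device code. A sketch of one way to do that (the name `mycos_fast` is made up for illustration; `__CUDA_ARCH__` is the standard macro that is defined only during device compilation passes):

```cpp
#include <math.h>

// Sketch only: explicitly opting into the reduced-precision device intrinsic.
// The host compilation pass never sees __cosf and falls back to plain cosf.
__host__ __device__
float mycos_fast( float v ) {
#ifdef __CUDA_ARCH__
    return 2.f*__cosf( v );  // device: CUDA fast-math intrinsic
#else
    return 2.f*cosf( v );    // host: standard math library
#endif
}
```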