I have a simple question about C. I am implementing half-precision software using _Float16 in C (my Mac is ARM-based), but it does not run noticeably faster than the single- or double-precision versions. I tested half, single, and double precision with very simple code that just adds numbers. Half precision is slower than single or double, and single is about the same as double.
#include <stdio.h>
#include <time.h>

// Change this typedef to switch the precision under test:
// double   - double precision
// float    - single precision
// _Float16 - half precision
typedef double FP;

int main(int argc, const char * argv[]) {
    double elapsed;
    clock_t start1, end1;
    int i;
    FP temp = 0;

    start1 = clock();
    for (i = 0; i < 100; i++) {
        temp = temp + i;
    }
    end1 = clock();

    // Convert clock ticks to seconds.
    elapsed = (double)(end1 - start1) / CLOCKS_PER_SEC;
    printf("[] %.16f\n", elapsed);
    return 0;
}
I expected half precision to be much faster than single or double precision. How can I check whether half precision is faster than float, and whether float is faster than double?
Please help me.
Here is an eminently surprising fact about floating point: single-precision (float) arithmetic is not necessarily faster than double precision.
How can this be? Floating-point arithmetic is hard, so doing it with twice the precision is at least twice as hard and must take longer, right?
Well, no. Yes, it's more work to compute with higher precision, but as long as the work is being done by dedicated hardware (by some kind of floating point unit, or FPU), everything is probably happening in parallel. Double precision may be twice as hard, and there may therefore be twice as many transistors devoted to it, but it doesn't take any longer.
In fact, if you're on a system with an FPU that supports both single- and double-precision floating point, a good rule is: always use double. The reason for this rule is that type float is often inadequately accurate. So if you always use double, you'll quite often avoid numerical inaccuracies (that would kill you, if you used float), but it won't be any slower.
Now, everything I've said so far assumes that your FPU does support the types you care about, in hardware. If there's a floating-point type that's not supported in hardware, if it has to be emulated in software, it's obviously going to be slower, often much slower. There are at least three areas where this effect manifests:
… float may be advantageous there.)
… float or double.