C23 introduced a number of floating-point types, including but not limited to:
- _Float32
- _Float32x
- _Float32_t
I am unsure of the differences, such as:
- Are they keywords, or are they type aliases, or something else?
- Are they distinct types, or can they be aliases for float?
- What is the minimum range and precision of these types?
- Are they required to be IEEE-754-compliant (or IEC 60559)?
- Is float obsoleted by _Float32 or other types?
The same questions apply to _Float64 vs double, and _Float128 vs long double.
Only _FloatN_t types (e.g. _Float32_t) are aliases from the <math.h> header. All the other types are required to be distinct, and their names are keywords. (See H.5.1 [Keywords].)

All of the types fall into one of four categories (see below). Choose between them as follows:
- float, double, and long double, if you are satisfied with the very lenient requirements of these types
  - check whether __STDC_IEC_60559_BFP__ is defined, which makes them stricter
  - use float and double only if you are okay with them being the same type 1)
- _FloatN if you need a specific IEC 60559 type with exactly N bits
- _FloatNx if you need an extended IEC 60559 type with at least N bits of precision
- _FloatN_t if you don't need IEC 60559 types, and you are not satisfied with the minimum requirements for float and double
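As a rough illustration of that decision process, here is a minimal sketch that picks a type with at least 53 bits of binary precision at compile time. The names real64, REAL64_MANT_DIG and holds_53_bit_integers are hypothetical, and __FLT64_MANT_DIG__ is a GCC-specific predefined macro that I am assuming as a hint for _Float64 support; only the two __STDC_IEC_60559_* macros come from the standard.

```c
#include <assert.h>   /* static_assert (a macro before C23) */
#include <float.h>    /* FLT_RADIX, DBL_MANT_DIG, LDBL_MANT_DIG */
#include <stdbool.h>

/* Prefer the exact IEC 60559 type, then fall back to standard types.
   Assumption: __FLT64_MANT_DIG__ is GCC-specific, not a standard macro. */
#if (defined(__STDC_IEC_60559_BFP__) && defined(__STDC_IEC_60559_TYPES__)) \
    || defined(__FLT64_MANT_DIG__)
typedef _Float64 real64;              /* exactly binary64 */
#define REAL64_MANT_DIG 53
#elif FLT_RADIX == 2 && DBL_MANT_DIG >= 53
typedef double real64;                /* lenient type that happens to be wide enough */
#define REAL64_MANT_DIG DBL_MANT_DIG
#elif FLT_RADIX == 2 && LDBL_MANT_DIG >= 53
typedef long double real64;
#define REAL64_MANT_DIG LDBL_MANT_DIG
#else
#error "no floating type with at least 53 bits of precision"
#endif

static_assert(REAL64_MANT_DIG >= 53, "real64 is too narrow");

/* 2^53 - 1 needs all 53 bits, so it survives the round trip only if
   real64 really has that much precision. */
bool holds_53_bit_integers(void) {
    real64 x = (real64)9007199254740991LL;
    return (long long)x == 9007199254740991LL;
}
```

The fallback chain fails the build instead of silently using a narrower double, which is the situation the footnote below describes.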
1) On architectures without a double-precision FPU, float and double might be the same size (e.g. Arduino). Use other types (e.g. _Float64_t over double) if you want software emulation of double-precision instead.

Standard floating types
float, double, and long double are collectively called standard floating types. Their representation is implementation-defined, but there are some requirements nonetheless:
- double must be able to represent any float, and long double must represent any double
- if __STDC_IEC_60559_BFP__ is defined, float and double are represented like _Float32 and _Float64
The minimum limits for each type are:
- float: FLT_DECIMAL_DIG ≥ 6, FLT_MIN ≤ 10^-37, FLT_MAX ≥ 10^37
- double: DBL_DECIMAL_DIG ≥ 10, DBL_MIN ≤ 10^-37, DBL_MAX ≥ 10^37
- long double: LDBL_DECIMAL_DIG ≥ 10, LDBL_MIN ≤ 10^-37, LDBL_MAX ≥ 10^37

Usually, float and double are binary32 and binary64 types respectively, and long double is binary128, an x87 80-bit extended floating-point number, or represented the same as double.

See C23 Standard - E [Implementation limits].
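To make these minimums concrete, here is a small sketch that checks them against <float.h>. The function names are hypothetical. The range comparisons are done at run time because a static_assert condition must be an integer constant expression.

```c
#include <assert.h>   /* static_assert (a macro before C23) */
#include <float.h>
#include <stdbool.h>

/* Precision minimums: these are integer constants, so they can be
   checked at compile time. A conforming implementation never trips them. */
static_assert(FLT_DECIMAL_DIG >= 6, "float: too few decimal digits");
static_assert(DBL_DECIMAL_DIG >= 10, "double: too few decimal digits");
static_assert(LDBL_DECIMAL_DIG >= 10, "long double: too few decimal digits");

/* Range minimums: the smallest normalized value is at most 1E-37 and
   the largest finite value is at least 1E+37 for every standard type. */
bool standard_ranges_ok(void) {
    return FLT_MIN <= 1e-37f && FLT_MAX >= 1e37f
        && DBL_MIN <= 1e-37 && DBL_MAX >= 1e37
        && LDBL_MIN <= 1e-37L && LDBL_MAX >= 1e37L;
}

/* double must be able to represent any float. */
bool double_covers_float(void) {
    return (double)FLT_MAX <= DBL_MAX && DBL_MANT_DIG >= FLT_MANT_DIG;
}

/* True on targets where float and double use the same format,
   which the standard permits. */
bool float_and_double_share_format(void) {
    return FLT_MANT_DIG == DBL_MANT_DIG && FLT_MAX_EXP == DBL_MAX_EXP;
}
```

On a typical desktop target float_and_double_share_format() returns false, but nothing in the standard requires that.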
Interchange floating types
_Float32, _Float64, etc. are so-called interchange floating types. Their representation must follow the IEC 60559 interchange format for binary floating-point numbers, such as binary32, binary64, etc. Any _FloatN type must be exactly N bits wide.

The types _Float32 and _Float64 might not exist, unless the implementation defines __STDC_IEC_60559_BFP__ and __STDC_IEC_60559_TYPES__. If so:
- _Float32 exists, and float has the same size and alignment as it (but is a distinct type)
- _Float64 exists, and double has the same size and alignment as it (but is a distinct type)
- _FloatN (typically _Float128) exists if long double is a binaryN type with N > 64

See C23 Standard - H.2.1 [Interchange floating types].
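The "same size, but distinct type" point can be sketched with _Generic, which rejects two compatible types in its association list. The names HAVE_FLOAT32_FLOAT64, KIND_OF and float32_is_distinct_or_absent are hypothetical, and __FLT32_MANT_DIG__ / __FLT64_MANT_DIG__ are GCC-specific predefined macros that I am assuming as an extra availability hint next to the standard feature-test macros.

```c
#include <assert.h>   /* static_assert (a macro before C23) */
#include <limits.h>   /* CHAR_BIT */
#include <stdbool.h>

enum kind { KIND_FLOAT, KIND_DOUBLE, KIND_FLOAT32, KIND_FLOAT64, KIND_OTHER };

/* Assumption: the __FLTnn_MANT_DIG__ macros are GCC-specific. */
#if (defined(__STDC_IEC_60559_BFP__) && defined(__STDC_IEC_60559_TYPES__)) \
    || (defined(__FLT32_MANT_DIG__) && defined(__FLT64_MANT_DIG__))
#define HAVE_FLOAT32_FLOAT64 1

/* Interchange types are exactly N bits wide. */
static_assert(sizeof(_Float32) * CHAR_BIT == 32, "_Float32 is not 32 bits");
static_assert(sizeof(_Float64) * CHAR_BIT == 64, "_Float64 is not 64 bits");

/* Listing float next to _Float32 only compiles because the two are
   distinct types, even where they share a representation. */
#define KIND_OF(x) _Generic((x), \
    float: KIND_FLOAT, double: KIND_DOUBLE, \
    _Float32: KIND_FLOAT32, _Float64: KIND_FLOAT64, \
    default: KIND_OTHER)
#else
#define HAVE_FLOAT32_FLOAT64 0
#define KIND_OF(x) _Generic((x), \
    float: KIND_FLOAT, double: KIND_DOUBLE, default: KIND_OTHER)
#endif

/* True when _Float32 is unavailable, or available and not float. */
bool float32_is_distinct_or_absent(void) {
#if HAVE_FLOAT32_FLOAT64
    _Float32 x = 1.0f;
    return KIND_OF(x) == KIND_FLOAT32;
#else
    return true;
#endif
}
```

If an implementation made _Float32 an alias for float, the first KIND_OF definition would fail to compile, which is one way to observe the distinctness requirement.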
Extended floating types
_Float32x, _Float64x, etc. are so-called extended floating types (named after IEC 60559 extended precision). Unlike their interchange counterparts, they only have minimum requirements for their representation, not exact requirements. A _FloatNx must have ≥ N bits of precision, making it able to represent N-bit integers with no loss.

These types might not exist, unless the implementation defines __STDC_IEC_60559_TYPES__. If so:
- _Float32x exists if __STDC_IEC_60559_BFP__ is defined, and may have the same format as double (but is a distinct type)
- _Float64x optionally exists, and may have the same format as long double (but is a distinct type)
- _Float128x optionally exists

(__STDC_IEC_60559_DFP__ governs the decimal counterparts such as _Decimal64x, not the binary _FloatNx types.)

See C23 Standard - H.2.3 [Extended floating types].
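The "N-bit integers with no loss" property can be sketched as a round trip through the type. The names wide32 and roundtrips_exactly are hypothetical, __FLT32X_MANT_DIG__ is a GCC-specific predefined macro assumed here as an availability hint, and the fallback to double assumes that double has at least 32 bits of precision on the target.

```c
#include <assert.h>   /* static_assert (a macro before C23) */
#include <limits.h>   /* CHAR_BIT */
#include <stdbool.h>
#include <stdint.h>

/* Assumption: __FLT32X_MANT_DIG__ is GCC-specific, not a standard macro. */
#if (defined(__STDC_IEC_60559_BFP__) && defined(__STDC_IEC_60559_TYPES__)) \
    || defined(__FLT32X_MANT_DIG__)
typedef _Float32x wide32;   /* at least 32 bits of precision */
#else
typedef double wide32;      /* fallback, assumed to be precise enough */
#endif

/* 32 bits of precision plus sign and exponent cannot fit in 32 bits. */
static_assert(sizeof(wide32) * CHAR_BIT > 32, "wide32 is too small");

/* Converts a 32-bit integer to wide32 and back. With at least 32 bits
   of precision the value is represented exactly, so nothing is lost. */
bool roundtrips_exactly(uint32_t n) {
    wide32 f = (wide32)n;
    return (uint32_t)f == n;
}
```

A value such as 16777217 (2^24 + 1) is a useful probe, because a binary32 float cannot hold it exactly while any type with 32 bits of precision can.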
Aliases
_Float32_t, _Float64_t, etc. are aliases for other floating types, so that:
- _FloatN_t has at least the range and precision of the corresponding real floating type (e.g. _Float32_t has at least the range and precision of _Float32 if it exists)
- a wider alias can represent a narrower one (e.g. _Float64_t can represent _Float32_t)

See C23 Standard - H.11 [Mathematics <math.h>].