I say "accurate" because IEEE-754 doesn't accurately represent decimal numbers, which seems to be the crux of the matter.
All decimal numbers can be represented in scientific notation, as illustrated below. I have included a column for the unscaled value, which is the decimal value with all significant digits shifted to the left of the decimal point; for example:
General | Scientific | Unscaled |
---|---|---|
1.01 | 1.01e+0 | 101 |
0.025 | 2.5e-2 | 25 |
123 | 1.23e+2 | 123 |
12345678 | 1.2345678e+7 | 12345678 |
123.456 | 1.23456e+2 | 123456 |
12345.678901234567 | 1.2345678901234567e+4 | 12345678901234567 |
System.Double.MaxValue | 1.7976931348623157e+308 | 17976931348623157 * (10 ^ 292) |
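As a side illustration of the "unscaled value" idea, System.Decimal stores exactly this kind of pair: an unscaled integer plus a base-10 scale. The sketch below is not part of the code in this question; `Decompose` is a hypothetical helper name, and it only assumes the documented layout returned by `decimal.GetBits`. Note that, unlike the table above, decimal keeps trailing zeros (1.50m is stored as 150 with scale 2).

```csharp
using System;
using System.Numerics;

// Hypothetical helper: returns the unscaled integer and base-10 scale of a
// decimal, so that |value| == unscaled / 10^scale.
static (BigInteger Unscaled, int Scale) Decompose(decimal value)
{
    int[] bits = decimal.GetBits(value);          // lo, mid, hi, flags
    BigInteger unscaled = (uint)bits[2];
    unscaled = (unscaled << 32) | (uint)bits[1];
    unscaled = (unscaled << 32) | (uint)bits[0];
    int scale = (bits[3] >> 16) & 0xFF;           // bits 16-23 of flags hold the scale
    return (unscaled, scale);
}

Console.WriteLine(Decompose(123.456m)); // (123456, 3)
Console.WriteLine(Decompose(0.025m));   // (25, 3)
```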
The following functions use .NET 7.0 generic math to obtain the exponent and the mantissa of an IEEE-754 value. Note that they return the decimal (base 10) exponent and mantissa, not the binary (base 2) ones:
GetExponent
private static T GetExponent<T>(T value) where T : IBinaryFloatingPointIeee754<T>
{
    // NaN, infinity and zero have no meaningful decimal exponent.
    if (T.IsNaN(value) || T.IsInfinity(value) || T.IsZero(value)) return T.Zero;
    T absValue = T.Abs(value);
    T log10 = T.Log10(absValue);
    // The decimal exponent is the floor of log10(|value|).
    return T.Floor(log10);
}
GetMantissa
private static T GetMantissa<T>(T value) where T : IBinaryFloatingPointIeee754<T>
{
    T ten = T.CreateChecked(10);
    T exponent = GetExponent(value);
    // Dividing by 10^exponent leaves a mantissa with one digit before the decimal point.
    T factor = T.Pow(ten, exponent);
    return value / factor;
}
Given the table above, these functions produce the following values:
Scientific | Mantissa | Exponent |
---|---|---|
1.01e+0 | 1.01 | 0 |
2.5e-2 | 2.5 | -2 |
1.23e+2 | 1.23 | 2 |
1.2345678e+7 | 1.2345678 | 7 |
1.23456e+2 | 1.23456 | 2 |
1.2345678901234567e+4 | 1.2345678901234567 | 4 |
1.7976931348623157e+308 | 1.7976931348623157 | 308 |
So far, so good! Now I want to repeatedly multiply the mantissa by 10 until all significant digits are to the left of the decimal point. The following function obtains the unscaled mantissa:
private static BigInteger GetUnscaledMantissa<T>(T value) where T : IBinaryFloatingPointIeee754<T>
{
    T ten = T.CreateChecked(10);
    T mantissa = GetMantissa(value);
    T factor = T.One;
    // While the scaled mantissa still has a fractional part...
    while (mantissa * factor % T.One != T.Zero)
    {
        // ...multiply the factor by 10.
        factor *= ten;
    }
    BigInteger result = BigInteger.CreateChecked(mantissa * factor);
    // Trim any trailing zeros, which sometimes occur.
    // The IsZero check stops this loop from spinning forever when value is 0.
    while (!result.IsZero && result % 10 == 0) result /= 10;
    return result;
}
Let's take a look at the results:
Mantissa | Unscaled |
---|---|
1.01 | 101 |
2.5 | 25 |
1.23 | 123 |
1.2345678 | 12345678 |
1.23456 | 123456 |
1.2345678901234567 | **12345678901234566** |
1.7976931348623157 | **17976931348623158** |
Generally speaking, the GetUnscaledMantissa function returns the correct value; however, notice the outliers in the last two rows, which are not quite correct. It seems that in some cases the value is rounded up or down.
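One likely contributor to these two outliers, offered as an observation rather than a full diagnosis: the loop ends with `mantissa * factor` held in a double, and both expected results lie between 2^53 and 2^54, where adjacent doubles are 2 apart. An odd 17-digit integer such as 12345678901234567 therefore has no exact double representation. The sketch below checks this with a hypothetical helper, `IsExactlyRepresentable`, which is not part of the original code.

```csharp
using System;

// Hypothetical helper: true when the integer survives a round trip through double.
static bool IsExactlyRepresentable(long n) => (long)(double)n == n;

Console.WriteLine(IsExactlyRepresentable(123456L));            // True
Console.WriteLine(IsExactlyRepresentable(12345678901234567L)); // False
Console.WriteLine(IsExactlyRepresentable(17976931348623157L)); // False
```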
Question
Whilst I understand that this is just the nature of IEEE-754 binary floating-point numbers, how could I modify the GetUnscaledMantissa function so that it accurately returns the unscaled mantissa in all, or at least most, cases?
(Note: I know that this is possible; it's just not trivial.)
Update
Given the extended conversation on this topic, it seems that there is some confusion as to what I am trying to achieve, so hopefully the following goes some way to setting the record straight.
Forget IEEE-754 for the time being! Let's just focus on some pure maths.
The following numbers are expressed in scientific notation, alongside their equivalent full form:
Table A
Scientific | Full |
---|---|
1.00001e+50 | 100001000000000000000000000000000000000000000000000 |
-3.345233391e+45 | -3345233391000000000000000000000000000000000000 |
1.7976931348623157E+308 | 179769313486231570000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 |
These values in scientific notation can be verified in WolframAlpha; the maths checks out!
Now, the confusion seems to be around what happens when you represent the same numbers with IEEE-754.
The same numbers are shown again in scientific notation and full form, except that this time the full form is the value that IEEE-754 (double precision) actually stores:
Table B
Scientific | Full |
---|---|
1.00001e+50 | 100000999999999993488106414884841393099338118332416 |
-3.345233391e+45 | -3345233391000000093465768949128568879123005440 |
1.7976931348623157E+308 | 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368 |
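Values like those in Table B can be reproduced by converting the double to a BigInteger, which exposes the integer the double actually holds. This is a minimal sketch, assuming System.Numerics is available; `ExactInteger` is a hypothetical helper name, not something from the code above.

```csharp
using System;
using System.Numerics;

// Hypothetical helper: the exact integer held by an integral-valued double.
static BigInteger ExactInteger(double value) => new BigInteger(value);

BigInteger stored = ExactInteger(1.00001e+50);
// The mathematically intended value: 100001 followed by 45 zeros (Table A).
BigInteger intended = BigInteger.Parse("100001" + new string('0', 45));

Console.WriteLine(stored);             // the IEEE-754 value, as in Table B
Console.WriteLine(stored == intended); // False
```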
What I want GetUnscaledMantissa to return is the mathematically correct unscaled number (as in Table A), not the IEEE-754 "correct" unscaled number (as in Table B).
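One possible direction, sketched here under stated assumptions rather than offered as a definitive answer, is to avoid floating-point arithmetic altogether and read the digits from the shortest round-trippable string of the double. The helper name `UnscaledFromRoundTrip` is hypothetical, it is written for double only rather than generic T, and it assumes .NET Core 3.0 or later, where the "R" format yields the shortest string that parses back to the same double.

```csharp
using System;
using System.Globalization;
using System.Numerics;

// Hypothetical helper: derives the unscaled mantissa from the digits of the
// shortest round-trippable decimal string instead of from arithmetic.
static BigInteger UnscaledFromRoundTrip(double value)
{
    if (!double.IsFinite(value) || value == 0) return BigInteger.Zero;

    // Examples of the text produced: "123.456", "0.025", "1.00001E+50".
    string text = Math.Abs(value).ToString("R", CultureInfo.InvariantCulture);

    // Drop the exponent part, if any, and keep only the significand.
    int e = text.IndexOf('E');
    string significand = e >= 0 ? text[..e] : text;

    // Remove the decimal point, then leading and trailing zeros.
    string digits = significand.Replace(".", "").Trim('0');
    return BigInteger.Parse(digits, CultureInfo.InvariantCulture);
}

Console.WriteLine(UnscaledFromRoundTrip(123.456));         // 123456
Console.WriteLine(UnscaledFromRoundTrip(0.025));           // 25
Console.WriteLine(UnscaledFromRoundTrip(1.00001e+50));     // 100001
Console.WriteLine(UnscaledFromRoundTrip(double.MaxValue)); // 17976931348623157
```

This can only recover the shortest decimal that maps to the given double. If the original literal carried more digits than a double can distinguish, those digits are already lost before the function is called.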