I implemented an arbitrary precision floating point class using Boosts cpp_int for the mantissa and it works fine (I am not using their cpp_bin_floats because I need to be able to dynamically change precision).
However, I think it might be possible to optimize multiplication. The way I am doing this now is:
- Add the exponents.
- Multiply the mantissas. We get a mantissa (about) twice the size.
- Truncate / shift right the result mantissa, so it has the desired length and adjust the exponent accordingly
But that means, we calculated half the digits unnecessarily. Of course we can't just truncate the two multiplicands by half. That would give wrong results. Is there an amount I can truncate the multiplicands that gives no loss of precision?
One other thought is: When multiplying two 64 bit integers, I would need a 128 bit integer to hold the result. But I can also split the two multiplicands into high and low parts, calculate them crosswise and distribute the results into two 64 bit integers like this:
low = low_a * low_b;
mid1 = low_a * high_b;
mid2 = high_a * low_b;
high = high_a * high_b;
highResult = high + (mid1 >> 32) + (mid2 >> 32);
lowResult = low + (mid1 << 32) + (mid2 << 32);
Hopefully I didn't make a mistake here. But we can see: If we only keep highResult anyway, we don't need to calculate low_a * low_b (that might give us a "carry", but I think, that is not necessary for precision). So only 3 of the 4 multiplications are necessary. This holds for any number of bits my integers has. Also: Since I only use the high parts of the mid results, I could technically use a recursive approach.
Shouldn't that open up the possibility to accelerate multiplication? If so: How could I as efficiently as possible split two cpp_int? I could bitshift right by half the msb to get the high part. But I can't think of a quick operation to get the low part (first shifting right, then left again and then subtract from the original value would probably be slow for very long cpp_int, wouldn't it?).