I am looking for an algorithm to convert a float to a rational number, such that the rational number is guaranteed to evaluate back to the original float, and the denominator is minimized.
A naive algorithm can just return the actual value of the float as X / 2^N, but that 2^N tends to be pretty large for anything that is not a finite binary fraction. For example, the number 0.1, when stored in a double-precision float, is actually approximated by ³⁶⁰²⁸⁷⁹⁷⁰¹⁸⁹⁶³⁹⁷⁄₃₆₀₂₈₇₉₇₀₁₈₉₆₃₉₆₈ (the denominator being 2^55). However, converting 0.1 to ¹⁄₁₀ is obviously better, and ¹⁄₁₀ will evaluate to ³⁶⁰²⁸⁷⁹⁷⁰¹⁸⁹⁶³⁹⁷⁄₃₆₀₂₈₇₉₇₀₁₈₉₆₃₉₆₈ under floating point arithmetic.
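These figures are easy to check with the standard fractions module. A quick sketch, assuming a Python build whose floats are IEEE 754 doubles:

```python
from fractions import Fraction

x = 0.1
exact = Fraction(x)  # the exact binary value stored in the double

print(exact)                        # 3602879701896397/36028797018963968
print(exact.denominator == 2**55)   # True
print(float(Fraction(1, 10)) == x)  # True: 1/10 rounds back to the same double
```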
A related problem is printing floats in decimal with the fewest digits (this paper describes some techniques), which can be considered a specialized version of this problem with the additional constraint that the denominator must be a power of 10.
There is at least one existing question, and likely more, but none of them has the constraint that the converted rational number must evaluate back to the original float.
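Let's start with a definition that pins down exactly which fraction we're looking for in any particular case:
Definition. Say that a fraction a/b is simpler than another fraction c/d (with both fractions written in lowest terms, with positive denominators) if b <= d, abs(a) <= abs(c), and at least one of those two inequalities is strict.
So for example 5/7 is simpler than 6/7, and 5/7 is simpler than 5/8, but neither 2/5 nor 3/4 is simpler than the other. (We don't have a total ordering here.)
Then with this definition, there's a not-immediately-obvious theorem that guarantees that the fraction we're looking for always exists:
Theorem. If J is a subinterval of the real line that contains at least one fraction, then J contains a unique simplest fraction: a fraction f in J such that f is simpler than every other fraction in J.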
In particular, the simplest fraction in an interval will always have the smallest possible denominator, as required in the question. The "contains at least one fraction" condition in the theorem is there to exclude degenerate cases like the closed interval [√2, √2], which doesn't contain any fractions at all.
Our job is to write a function that takes a finite floating-point input x and returns the simplest fraction n/d for which x is the closest float to n/d, in the target floating-point format. Assuming a reasonably sane floating-point format and rounding mode, the set of real numbers that round to x will form a nonempty subinterval of the real line, with rational endpoints. So our problem breaks naturally into two subproblems:
Problem 1. Given a float x in the target floating-point format, describe the interval of all values that round to x under the rules for that floating-point format. This involves both identifying the endpoints of that interval and establishing whether the interval is open, closed, or half-open.
Problem 2. Given a nonempty subinterval J of the real line with rational endpoints, compute the simplest fraction in that subinterval.
The second problem is more interesting and less dependent on platform and language details; let's tackle that one first.
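Before moving on, the "simpler than" relation can be made concrete. A minimal sketch, assuming it means: numerator no larger in absolute value, denominator no larger, and at least one of the two strictly smaller. The helper name is_simpler is made up for illustration.

```python
from fractions import Fraction


def is_simpler(f, g):
    """Return True if f is strictly simpler than g.

    Fraction always stores values in lowest terms with a positive
    denominator, so numerators and denominators can be compared directly.
    """
    f, g = Fraction(f), Fraction(g)
    fa, fb = abs(f.numerator), f.denominator
    ga, gb = abs(g.numerator), g.denominator
    # Both components no larger, and at least one strictly smaller.
    return fa <= ga and fb <= gb and (fa < ga or fb < gb)


print(is_simpler(Fraction(5, 7), Fraction(6, 7)))  # True
print(is_simpler(Fraction(5, 7), Fraction(5, 8)))  # True
print(is_simpler(Fraction(2, 5), Fraction(3, 4)))  # False
print(is_simpler(Fraction(3, 4), Fraction(2, 5)))  # False
```

The last two lines show why this is only a partial order: 2/5 has the smaller numerator but 3/4 has the smaller denominator.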
Finding the simplest fraction in an interval
Assuming an IEEE 754 floating-point format and the default round-ties-to-even rounding mode, the interval rounding to a given float will be either open or closed; with other rounding modes or formats, it could potentially be half open (open at one end, closed at the other). So for this section, we only look at open and closed intervals, but adapting to half-open intervals isn't hard.
Suppose that J is a nonempty subinterval of the real line with rational endpoints. For simplicity, let's assume that J is a subinterval of the positive real line. If it's not, then either it contains 0, in which case 0/1 is the simplest fraction in J, or it's a subinterval of the negative real line and we can negate, find the simplest fraction, and negate back.
Then the following gives a simple recursive algorithm for finding the simplest fraction in J:
- If J contains 1, then 1/1 is the simplest fraction in J.
- Otherwise, if J is a subinterval of (0, 1), then the simplest fraction in J is 1/f, where f is the simplest fraction in 1/J. (This is immediate from the definition of 'simplest'.)
- Otherwise, J must be a subinterval of (1, ∞), and the simplest fraction in J is q + f, where q is the largest integer such that J - q is still within the positive reals, and f is the simplest fraction in J - q.
For a sketch of proof of the last statement: if a / b is the simplest fraction in J and c / d is the simplest fraction in J - q, then a / b is simpler than or equal to (c + qd) / d, and c / d is simpler than or equal to (a - qb) / b. So b <= d, a <= c + qd, d <= b and c <= a - qb, and it follows that b = d and a = c + qd, so c / d = a / b - q.
In Python-like pseudocode:
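A rough sketch of that recursion, treating J as an abstract interval. The tests and arithmetic on J are shorthand for operations on its endpoints, not real Python, and the function name is only illustrative:

```text
def simplest_in_interval(J):
    # J is a nonempty subinterval of the positive reals
    if J contains 1:
        return 1
    elif J lies entirely inside (0, 1):
        return 1 / simplest_in_interval(1 / J)
    else:
        # J lies entirely inside (1, infinity)
        q = largest integer such that J - q is still positive
        return q + simplest_in_interval(J - q)
```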
To see that the algorithm must always terminate and can't enter an infinite loop, note that every inversion step must be followed by a J - q step, and every J - q step reduces the numerators of the left and right endpoints of the interval. Concretely, if the endpoints of the interval are a/b and c/d, the sum abs(a) + abs(c) + b + d is a positive integer that is unchanged by an inversion step and strictly decreases at every J - q step.
To translate the above to real Python code, we have to deal with some details. First, let's assume for now that J is a closed interval; we'll adapt to open intervals below.
We'll represent our interval by its endpoints left and right, both of which are positive fractions.Fraction instances. Then the following Python code implements the above algorithm, followed by an example run:
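A minimal sketch of such an implementation, assuming left and right are positive Fraction instances with left <= right. It follows the three cases of the recursive algorithm directly and makes no attempt at efficiency:

```python
from fractions import Fraction
from math import ceil


def simplest_in_closed_interval(left, right):
    """Simplest fraction in [left, right], assuming 0 < left <= right."""
    if left > 1:
        # The whole interval lies to the right of 1: shift it down by the
        # largest integer q that keeps the left endpoint positive.
        q = ceil(left) - 1
        return q + simplest_in_closed_interval(left - q, right - q)
    elif right < 1:
        # The whole interval lies inside (0, 1): invert it. Taking
        # reciprocals swaps the roles of the two endpoints.
        return 1 / simplest_in_closed_interval(1 / right, 1 / left)
    else:
        # left <= 1 <= right, so the interval contains 1.
        return Fraction(1)


print(simplest_in_closed_interval(Fraction("3.14"), Fraction("3.15")))  # 22/7
print(simplest_in_closed_interval(Fraction(2, 5), Fraction(3, 4)))      # 1/2
```

Note the use of ceil(left) - 1 rather than floor(left): when left is an exact integer, shifting by floor(left) would move the left endpoint to 0, outside the positive reals.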
In principle, the code for open intervals is equally simple, but in practice there's a complication: we may need to deal with infinite intervals. For example, if our original interval is J = (2, 5/2), then the first step shifts that interval by 2 to get (0, 1/2), and then that interval is inverted to give (2, ∞).
So for open intervals, we'll continue to represent our interval by a pair (left, right) of its endpoints, but now right is either a fractions.Fraction instance or a special constant INFINITY. And instead of simply being able to use 1 / left to take the reciprocal of the left endpoint, we'll need an auxiliary function that can compute a reciprocal of something that's either a fraction or INFINITY, and another auxiliary function for subtraction, ensuring that INFINITY - q gives INFINITY. Here are those auxiliary functions:
And here's the main function. Note the changes to the inequalities in the if and elif conditions, and the fact that we now want to use floor(left) instead of ceil(left) - 1 to find the largest integer q lying to the left of the interval:
The above code is optimised for clarity, not efficiency: it's reasonably efficient in terms of big-Oh complexity, but not in terms of the implementation details. I leave it to the reader to convert to something more efficient. The first step would be to work with integer numerators and denominators throughout instead of fractions.Fraction instances. If you're interested in what that looks like, take a look at the implementation in my simplefractions package on PyPI.
Finding the interval that rounds to a given float
Now that we can find the simplest fraction in a given interval, we need to solve the other half of the problem: finding the interval that rounds to a given floating-point number. The details of doing this will depend much more heavily on the language, the floating-point format in use, and even things like the rounding mode being used.
Here we outline one way to do this in Python, assuming IEEE 754 binary64 floating-point format with the default round-ties-to-even rounding mode.
For simplicity, assume that our input float x is positive (and finite).
Python >= 3.9 provides a function math.nextafter that allows us to retrieve the next floats up and down from x. Here's an example of doing that for the float nearest π:
(Note that to do this in general, we also need to deal with the special case where x is the largest representable float and math.nextafter(x, math.inf) gives infinity.)
The bounds for the interval that rounds to x are midway between x and the neighbouring floats. Python lets us convert floats to the corresponding exact value as a fraction:
We also need to know whether we have a closed or an open interval. We could look at the bit representation to figure that out (it depends on whether the least significant bit of the float is 0 or 1), or we could just test to see whether our interval endpoints round to x or not:
They do, so we have a closed interval. That's confirmed by looking at the hex representation of the float:
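The whole check, from the neighbouring floats through to the hex representation, can be reproduced with a short script. This is a minimal sketch, assuming Python 3.9+ (for math.nextafter) and IEEE 754 binary64 floats. It also relies on CPython's integer true division being correctly rounded, which is what float() of a Fraction uses.

```python
import math
from fractions import Fraction

x = math.pi  # the float nearest to π

# Neighbouring floats just below and just above x.
x_minus = math.nextafter(x, -math.inf)
x_plus = math.nextafter(x, math.inf)

# Exact rational midpoints between x and each neighbour: these are the
# endpoints of the interval of real numbers that round to x.
left = (Fraction(x_minus) + Fraction(x)) / 2
right = (Fraction(x) + Fraction(x_plus)) / 2

# Each endpoint is an exact tie, so it rounds to x only if x has an even
# significand under round-ties-to-even.
closed = float(left) == x and float(right) == x
print(closed)   # True
print(x.hex())  # 0x1.921fb54442d18p+1
```

The final hex digit of the significand is 8, so its least significant bit is 0, which agrees with the rounding test.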
So we can find the simplest fraction that rounds to x using simplest_in_closed_interval:
Putting it all together
While the core algorithm is simple, there are enough corner cases to deal with (negative values, open versus closed intervals, sys.float_info.max, etc.) that a complete solution ends up being a bit too messy to post in full in this answer. Some time ago I put together a Python package called simplefractions that deals with all those corner cases; it's available on PyPI. Here it is in action:
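For readers without the package installed, the pieces described in this answer can also be assembled by hand. The following is a rough, self-contained sketch and not the simplefractions implementation. The name simplest_from_float_sketch, the helpers _recip and _shift, and the use of None for INFINITY are all assumptions made for illustration. It assumes IEEE 754 binary64, round-ties-to-even, and Python 3.9+.

```python
import math
import struct
from fractions import Fraction

INFINITY = None  # stand-in for an unbounded right endpoint (open intervals only)


def _recip(v):
    """Reciprocal of v, where v is a non-negative Fraction or INFINITY."""
    if v is INFINITY:
        return Fraction(0)
    return INFINITY if v == 0 else 1 / v


def _shift(v, q):
    """v - q, with INFINITY - q == INFINITY."""
    return INFINITY if v is INFINITY else v - q


def simplest_in_closed_interval(left, right):
    """Simplest fraction in [left, right], assuming 0 < left <= right."""
    if left > 1:
        q = math.ceil(left) - 1
        return q + simplest_in_closed_interval(left - q, right - q)
    if right < 1:
        return 1 / simplest_in_closed_interval(1 / right, 1 / left)
    return Fraction(1)


def simplest_in_open_interval(left, right):
    """Simplest fraction in (left, right), assuming 0 <= left < right."""
    if left >= 1:
        q = math.floor(left)
        return q + simplest_in_open_interval(left - q, _shift(right, q))
    if right is not INFINITY and right <= 1:
        return 1 / simplest_in_open_interval(_recip(right), _recip(left))
    return Fraction(1)  # left < 1 < right


def simplest_from_float_sketch(x):
    """Simplest fraction that rounds to the finite binary64 float x."""
    if not math.isfinite(x):
        raise ValueError("x must be finite")
    if x == 0:
        return Fraction(0)
    if x < 0:
        return -simplest_from_float_sketch(-x)
    below = Fraction(math.nextafter(x, 0.0))
    exact = Fraction(x)
    up = math.nextafter(x, math.inf)
    # For the largest float the next float up is inf; mirror the gap below.
    above = Fraction(up) if math.isfinite(up) else exact + (exact - below)
    left, right = (below + exact) / 2, (exact + above) / 2
    # Ties round to the float with an even significand, so the endpoints
    # belong to the interval exactly when the lowest bit of x is 0.
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    if bits & 1 == 0:
        return simplest_in_closed_interval(left, right)
    return simplest_in_open_interval(left, right)


print(simplest_from_float_sketch(0.1))    # 1/10
print(simplest_from_float_sketch(-0.75))  # -3/4
print(simplest_from_float_sketch(math.pi))
```

The closed-versus-open decision here reads the lowest bit of the float's bit pattern rather than rounding the endpoints, because the upper endpoint for the largest float is too big to convert back with float().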