How to convert Vulgar Fractions to floats?


I am currently scraping Gmail data using the Gmail API. Some of the emails I am scraping contain vulgar fractions as seen below:

8⅜
6⅞
7¾
7⅞

In the HTML returned by the Gmail API, these vulgar fractions appear as follows:

8=E2=85=9C
6=E2=85=9E
7=C2=BE
7=E2=85=9E

How may I convert these back to strings such as '8 3/8', for processing in Python?


There is 1 solution below


The strings are encoded using the quoted-printable encoding, a method of representing non-ASCII bytes using only ASCII characters: each such byte is written as = followed by two hexadecimal digits. You can decode to str like this:

import quopri

s = '8=E2=85=9C'
f = quopri.decodestring(s).decode('utf-8')
print(f)

prints

8⅜

which is composed of the character '8' followed by the Unicode character VULGAR FRACTION THREE EIGHTHS.
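To make the two decoding steps explicit, here is a minimal sketch that separates the quoted-printable step (which yields bytes) from the UTF-8 step (which yields str). The variable names `raw` and `text` are only illustrative.

```python
import quopri

# Step 1: quoted-printable -> raw bytes. Each =XX escape becomes one byte.
raw = quopri.decodestring(b'8=E2=85=9C')
print(raw)  # b'8\xe2\x85\x9c'

# Step 2: raw bytes -> str. The three bytes E2 85 9C are the UTF-8
# encoding of a single character, U+215C.
text = raw.decode('utf-8')
print(text == '8\u215c')     # True
print(len(raw), len(text))   # 4 2
```

Four bytes decode to two characters, which is why the fraction looks so much longer in the encoded form.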

We can decompose the string further using Unicode normalisation:

import unicodedata as ud

decomposed = ud.normalize('NFKD', f)
print(decomposed)

outputs

83⁄8
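The decomposed output can be hard to read because the separator looks like an ordinary slash. As a quick check, this sketch lists each character of the decomposed string with its code point and Unicode name (the list name `names` is just for illustration):

```python
import unicodedata as ud

# '8' followed by VULGAR FRACTION THREE EIGHTHS (U+215C)
decomposed = ud.normalize('NFKD', '8\u215c')

# Pair each character with its code point and official Unicode name.
names = [(ch, f'U+{ord(ch):04X}', ud.name(ch)) for ch in decomposed]
for row in names:
    print(row)
```

The third character is reported as FRACTION SLASH (U+2044), not the ASCII SOLIDUS (U+002F), so splitting on '/' would not work here.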

We can combine the approaches to get all the parts of each string and cast them to ints or fractions:

import fractions
import quopri
import unicodedata as ud


values = """\
8=E2=85=9C
6=E2=85=9E
7=C2=BE
7=E2=85=9E
"""

for value in values.splitlines():
    string_ = quopri.decodestring(value).decode('utf-8')
    # Assume each string is composed solely of one or more digits,
    # with the fraction character at the end
    int_part = int(string_[:-1])

    normalised = ud.normalize('NFKD', string_[-1])
    # Note that the separator character here is chr(8260),
    # the 'FRACTION SLASH' character, not the ASCII 'SOLIDUS'
    numerator, _, denominator = normalised.partition('⁄')

    fractional_part = fractions.Fraction(int(numerator), int(denominator))

    print(f'Integer part {int_part}, fractional part {fractional_part!r}')
print()

Result:

Integer part 8, fractional part Fraction(3, 8)
Integer part 6, fractional part Fraction(7, 8)
Integer part 7, fractional part Fraction(3, 4)
Integer part 7, fractional part Fraction(7, 8)

Fraction instances may be converted to float or str in the usual way:

>>> ff = fractions.Fraction(15, 8)
>>> ff
Fraction(15, 8)
>>> str(ff)
'15/8'
>>> float(ff)
1.875
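To get the float asked for in the title, or a string such as '8 3/8' as asked for in the question, the integer and fractional parts can be added into a single Fraction first. The following is only a sketch: `parse_mixed_number` and `as_mixed_string` are hypothetical helper names, and the code assumes each value is optional digits followed by exactly one vulgar fraction character.

```python
import fractions
import quopri
import unicodedata as ud


def parse_mixed_number(encoded):
    """Decode a quoted-printable value such as '8=E2=85=9C' into a Fraction.

    Hypothetical helper: assumes optional digits followed by exactly
    one vulgar fraction character.
    """
    text = quopri.decodestring(encoded.encode('ascii')).decode('utf-8').strip()
    whole = int(text[:-1] or 0)
    # NFKD turns the fraction character into numerator, FRACTION SLASH, denominator.
    numerator, _, denominator = ud.normalize('NFKD', text[-1]).partition('\u2044')
    return whole + fractions.Fraction(int(numerator), int(denominator))


def as_mixed_string(value):
    """Format a non-negative Fraction as 'W N/D', e.g. '8 3/8'."""
    whole, remainder = divmod(value.numerator, value.denominator)
    if remainder == 0:
        return str(whole)
    return f'{whole} {remainder}/{value.denominator}'


value = parse_mixed_number('8=E2=85=9C')
print(as_mixed_string(value))  # 8 3/8
print(float(value))            # 8.375
```

Keeping the value as a Fraction until the last step avoids rounding; 3/8 and 7/8 happen to be exact in binary floating point, but fractions such as 1/3 are not.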