I am looking for an efficient Python implementation of a function that takes a decimal-formatted string, e.g.
2.05000
200
0.012
and returns a tuple of two integers representing the significand and exponent of the input in base-10 floating point format, e.g.
(205,-2)
(2,2)
(12,-3)
List comprehension would be a nice bonus.
I have a gut feeling that there exists an efficient (and possibly Pythonic) way of doing this but it eludes me...
Solution applied to pandas
import pandas as pd
import numpy as np

ser1 = pd.Series(['2.05000', '- 2.05000', '00 205', '-205', '-0', '-0.0', '0.00205', '0', np.nan])
# remove all white spaces
ser1 = ser1.str.replace(' ', '', regex=False)
# split into the digits before (column 0) and after (column 1) the decimal point
parts = ser1.str.split('.', expand=True)
# strip leading zeros (even those after a minus sign)
parts[0] = parts[0].str.replace(r'^(-?)0+', r'\1', regex=True)
parts[1] = parts[1].fillna('')  # fill non-existent decimal places
exponents = -parts[1].str.len()
digits = parts[0] + parts[1]  # append decimal places to the digits before the decimal point
stripped = digits.str.rstrip('0')  # strip trailing zeros
exponents = exponents + digits.str.len() - stripped.str.len()
# an empty string or a lone minus sign means the value was zero
stripped = stripped.mask(stripped.isin(['', '-']), '0')
significands = stripped.astype(float)
df2 = pd.DataFrame({'exponent': exponents, 'significand': significands})
df2
Input:
0      2.05000
1    - 2.05000
2       00 205
3         -205
4           -0
5         -0.0
6      0.00205
7            0
8          NaN
dtype: object
Output:
   exponent  significand
0        -2          205
1        -2         -205
2         0          205
3         0         -205
4         0            0
5         0            0
6        -5          205
7         0            0
8       NaN          NaN

[9 rows x 2 columns]
Here's a straightforward string-processing solution.
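One way such a function might look is sketched below. This is a minimal illustration, not necessarily the exact code intended; the name `to_sig_exp` is made up, and it assumes a non-negative input containing only digits and at most one decimal point.

```python
def to_sig_exp(s):
    """Return (significand, exponent) for a non-negative decimal string."""
    # Split around the decimal point; frac_part is '' when there is none.
    int_part, _, frac_part = s.strip().partition('.')
    # Every digit after the point shifts the exponent down by one.
    exponent = -len(frac_part)
    # Join the digits and drop leading zeros, which carry no value.
    digits = (int_part + frac_part).lstrip('0')
    # Trailing zeros move out of the significand and into the exponent.
    stripped = digits.rstrip('0')
    if not stripped:
        return (0, 0)  # the input was some spelling of zero
    exponent += len(digits) - len(stripped)
    return (int(stripped), exponent)


# The list comprehension asked for in the question:
results = [to_sig_exp(s) for s in ['2.05000', '200', '0.012']]
print(results)  # [(205, -2), (2, 2), (12, -3)]
```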
I'll leave the handling of negative numbers as an exercise for the reader.
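For anyone who would rather not do that exercise by hand, one possible route is the standard library's `decimal` module, which tracks the sign separately. The sketch below swaps manual string handling for `Decimal.normalize()` and `Decimal.as_tuple()`; the function name `to_sig_exp_signed` is hypothetical, and it assumes finite inputs without internal spaces and with no more significant digits than the context precision (28 by default), since `normalize()` rounds to that precision.

```python
from decimal import Decimal


def to_sig_exp_signed(s):
    """Return (significand, exponent) for a signed, finite decimal string."""
    # normalize() strips trailing zeros, so '200' becomes 2E+2.
    sign, digits, exponent = Decimal(s).normalize().as_tuple()
    significand = int(''.join(map(str, digits)))
    if significand == 0:
        return (0, 0)  # treat '0', '-0' and '-0.0' alike
    # sign is 1 for negative values and 0 otherwise.
    return (-significand if sign else significand, exponent)


print([to_sig_exp_signed(s) for s in ['2.05000', '-2.05000', '-205', '-0.0']])
# [(205, -2), (-205, -2), (-205, 0), (0, 0)]
```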