I want to create a column 'ratio' that holds each value of the column 'amount' divided by the last value of the column 'amount'. The data type of the 'amount' column is int64. After changing the data type to float, I still got the same 'NaN' values.
Whenever I do division operations in pandas, I always get 'NaN' results. How can I solve the problem?
96 Views Asked by Paul At
There are 4 best solutions below
You could use shift to divide each value by the value in the previous row, like this:
import pandas as pd
data = {'amount': range(4,8), 'user_input': ['a', 'b', 'c', 'd']}
dt = pd.DataFrame.from_dict(data)
dt
# Out:
# amount user_input
# 0 4 a
# 1 5 b
# 2 6 c
# 3 7 d
dt['ratio'] = dt['amount']/dt['amount'].shift(1)
dt
# Out:
# amount user_input ratio
# 0 4 a NaN
# 1 5 b 1.250000
# 2 6 c 1.200000
# 3 7 d 1.166667
Note that a division by zero gives inf (or NaN for 0/0), and the first value in the 'ratio' column is NaN because the first row has no previous row to divide by.
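To make those edge cases concrete, here is a minimal sketch using made-up data with zeros in it. The frame name demo and the column ratio_clean are illustrative assumptions, not part of the original answer; replacing inf with NaN is just one possible way to handle it.

```python
import numpy as np
import pandas as pd

# Hypothetical data containing zeros, to trigger the edge cases
demo = pd.DataFrame({'amount': [0, 4, 0, 0]})

# Divide each value by the value in the previous row
demo['ratio'] = demo['amount'] / demo['amount'].shift(1)
# row 0: no previous row -> NaN
# row 1: 4 / 0           -> inf
# row 2: 0 / 4           -> 0.0
# row 3: 0 / 0           -> NaN

# Optional: treat infinite ratios as missing values
demo['ratio_clean'] = demo['ratio'].replace([np.inf, -np.inf], np.nan)
print(demo)
```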
A different take on the same problem, dividing by the last value of the column:
import pandas as pd
data = {
'col1': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
'col2': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
}
# Convert data into DataFrame
df = pd.DataFrame(data)
df = df.assign(new_col = df['col2']/df['col2'].values[-1])
print(df)

When you do any math on several DataFrames or Series, pandas aligns on indexes and columns by default.
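A minimal sketch of that alignment, assuming a small made-up Series named s (not the asker's data): dividing by a one-row Series matches on index labels, while dividing by a plain scalar applies the same divisor to every row.

```python
import pandas as pd

s = pd.Series([4, 5, 6, 7], name='amount')  # hypothetical data

# tail(1) is a one-element Series that keeps index label 3,
# so only the row labelled 3 finds a matching divisor
aligned = s / s.tail(1)

# iloc[-1] is a scalar with no index, so it divides every row
by_scalar = s / s.iloc[-1]

print(aligned)
print(by_scalar)
```

Here aligned is NaN for every row except the last, whereas by_scalar has a valid ratio in every row.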
tail(1) returns not a single value (scalar) but a Series that carries the last index label of the original data. When you divide the column by that Series, the data are aligned on index labels and then divided by the corresponding values. Since the tail contains only the value with the last index label, the alignment ends up with NaN values as the divisors for all dividends except the last one. That's why you got NaN everywhere except at the last position.
To avoid this behavior, pass the divisor either as a number or a numpy.array. In this case, it can be