pandas: subtracting datetime64 columns with duplicate indices causes alignment problems


I'm really scratching my head on this one.

Apparently, if you have a DataFrame with duplicate index labels, subtracting datetime columns breaks alignment (the result has more rows than the frame itself), whereas subtracting regular int columns does not. Does "subtraction" mean something different in this context?

In [371]: import numpy as np

In [372]: import pandas as pd

In [373]: sec = np.datetime64(1, 's')

In [374]: df = pd.DataFrame({'a' : [sec, sec], 'b' : [0,0]}, index = [0,0])

In [375]: df['b'] - df['b']
Out[375]:
0    0
0    0
Name: b, dtype: int64

All good so far, that's what I'd expect.

In [376]: df['a'] - df['a']
Out[376]:
0   0 days
0   0 days
0   0 days
0   0 days
Name: a, dtype: timedelta64[ns]

What???

The same thing happens even if the column values themselves are not duplicates.

In [377]: df['a'].iloc[0] = np.datetime64(50, 's')

In [378]: df['a'] - df['a']
Out[378]:
0            00:00:00
0   -00:00:00.1000000
0    00:00:00.1000000
0            00:00:00
Name: a, dtype: timedelta64[ns]
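
For anyone who hits the same thing and just needs a row-by-row difference: a minimal sketch of a workaround, assuming the goal is purely positional subtraction. It subtracts the underlying NumPy arrays, so pandas never tries to align on the duplicate labels, and then reattaches the original index. The name `diff` is only for illustration, and this does not explain why the datetime case behaves differently from the int case.

```python
import numpy as np
import pandas as pd

sec = np.datetime64(1, 's')
df = pd.DataFrame({'a': [sec, sec], 'b': [0, 0]}, index=[0, 0])

# Work on the raw arrays: the subtraction is purely positional,
# so the duplicate index labels play no part in it.
raw = df['a'].to_numpy() - df['a'].to_numpy()

# Reattach the original (duplicate) index afterwards.
diff = pd.Series(raw, index=df.index, name='a')
print(diff)
```

The result has one row per row of `df`, which is what the int column already gives with plain `df['b'] - df['b']`.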