How to remove microseconds from timedelta

I have microseconds that I want to truncate from a pandas column. I tried something like analyze_me['how_long_it_took_to_order'] = analyze_me['how_long_it_took_to_order'].apply(lambda x: x.replace(microsecond=0)), but this error came up: replace() takes no keyword arguments.

For example: I want 00:19:58.582052 to become 00:19:58 or 00:19:58.58

There are 3 best solutions below

7
On BEST ANSWER

Your how_long_it_took_to_order column seems to be of string (object) dtype.

So try this:

analyze_me['how_long_it_took_to_order'] = \
    analyze_me['how_long_it_took_to_order'].str.split('.').str[0]

or:

analyze_me['how_long_it_took_to_order'] = \
    analyze_me['how_long_it_took_to_order'].str.replace(r'(\.\d{2})\d+', r'\1', regex=True)

for "centiseconds", like: 00:19:58.58
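Both string operations can be tried on a small frame first. This is a minimal sketch assuming the column really holds strings; the sample values are made up for illustration, and regex=True is passed explicitly because pandas 2.0 and later treat the pattern as a literal string by default.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's column.
analyze_me = pd.DataFrame(
    {"how_long_it_took_to_order": ["00:19:58.582052", "00:25:09.104500"]}
)
col = analyze_me["how_long_it_took_to_order"]

# Drop everything from the decimal point onward.
whole_seconds = col.str.split(".").str[0]

# Keep only the first two fractional digits ("centiseconds").
centiseconds = col.str.replace(r"(\.\d{2})\d+", r"\1", regex=True)

print(whole_seconds.tolist())
print(centiseconds.tolist())
```

Note that the result is still a column of strings, so arithmetic on the durations would need a conversion with pd.to_timedelta first.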

7
On

I think you need to convert your strings into timedeltas with pd.to_timedelta and then use the dt accessor's floor method, which truncates to the frequency you pass as a string (for example 's' for whole seconds). Here are the first two rows of your data.

df['how_long_it_took_to_order'] = pd.to_timedelta(df['how_long_it_took_to_order'])
df['how_long_it_took_to_order'].dt.floor('s')

0   00:19:58
1   00:25:09

You can also truncate to the hundredth of a second.

df['how_long_it_took_to_order'].dt.floor('10ms')

0   00:19:58.580000
1   00:25:09.100000

Here I create a Series of timedeltas and then use the dt accessor with the floor method to truncate down to the whole second.

d = pd.timedelta_range(0, periods=6, freq='644257us')
s = pd.Series(d)
s

0          00:00:00
1   00:00:00.644257
2   00:00:01.288514
3   00:00:01.932771
4   00:00:02.577028
5   00:00:03.221285
dtype: timedelta64[ns]

Now truncate

s.dt.floor('s')

0   00:00:00
1   00:00:00
2   00:00:01
3   00:00:01
4   00:00:02
5   00:00:03
dtype: timedelta64[ns]

If you want to truncate to the hundredth of a second, do this:

s.dt.floor('10ms')

0          00:00:00
1   00:00:00.640000
2   00:00:01.280000
3   00:00:01.930000
4   00:00:02.570000
5   00:00:03.220000
dtype: timedelta64[ns]
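
Keep in mind that floor always moves down to the lower multiple, while dt.round moves to the closest one. A small sketch of the difference, rebuilding the same Series as above (the variable names floored and rounded are only for illustration):

```python
import pandas as pd

# Six timedeltas spaced 644257 microseconds apart, as in the example above.
s = pd.Series(pd.timedelta_range("0s", periods=6, freq="644257us"))

floored = s.dt.floor("s")  # always down to the lower whole second
rounded = s.dt.round("s")  # to the closest whole second

# Compare the two as plain numbers of seconds.
print(floored.dt.total_seconds().tolist())
print(rounded.dt.total_seconds().tolist())
```

If the goal is to truncate, as in the question, floor is the one to use.
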
0
On

I needed this for a simple script where I wasn't using Pandas, and came up with a simple hack which should work everywhere.

age = age - timedelta(microseconds=age.microseconds)

where age is my timedelta object.

You can't directly modify the microseconds attribute of a timedelta object because it's immutable, but of course, you can replace the whole object with a new timedelta.
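
The same subtraction can be wrapped in small helpers using only the standard library. This is a sketch; the function names drop_microseconds and truncate_to_centiseconds are my own, not part of datetime.

```python
from datetime import timedelta


def drop_microseconds(delta: timedelta) -> timedelta:
    """Return a new timedelta with the microseconds component removed."""
    return delta - timedelta(microseconds=delta.microseconds)


def truncate_to_centiseconds(delta: timedelta) -> timedelta:
    """Return a new timedelta keeping only two fractional digits."""
    return delta - timedelta(microseconds=delta.microseconds % 10_000)


age = timedelta(minutes=19, seconds=58, microseconds=582052)
print(drop_microseconds(age))
print(truncate_to_centiseconds(age))
```

One caveat: timedelta stores microseconds as a non-negative value, so for negative durations this moves toward negative infinity rather than toward zero (for example, -1.5 seconds becomes -2 seconds).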