Some of the data frames I am analysing have columns with date information. I am interested in the time of day (HH:MM), which I could initially get with df['columnname'].dt.time. We have now switched from the string format "YYYY-MM-DD HH:MM" to ISO 8601, so an example date string would be "2022-03-04T23:00:00+01:00".
Analysing the new data resulted in the following error:
Can only use .dt accessor with datetimelike values
I could reproduce this when I used a mix of both input string formats. Here is an example:
import pandas as pd

# Homogeneous ISO 8601 strings: works
# d = {'Timestamp': ['2022-03-05T23:00:00+01:00', '2022-03-04T20:00:00+01:00', '2022-03-04T23:00:00+01:00']}
# Homogeneous "YYYY-MM-DD HH:MM" strings: works, too
# d = {'Timestamp': ['2024-09-29 10:00', '2024-04-19 21:00', '2024-04-13 13:00']}
# Mixed formats: does not work
d = {'Timestamp': ['2022-03-05T23:00:00+01:00', '2022-03-04T20:00:00+01:00', '2024-04-13 13:00']}
df = pd.DataFrame(data=d)
df_times = pd.to_datetime(df['Timestamp'], errors='coerce')
print(df_times.dt.time)
Each d with a homogeneous format works, but the mixed format results in the error above.
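One difference between the two formats that may be relevant: the ISO 8601 strings carry a UTC offset, while the old strings do not. The following is a minimal standard-library sketch (no pandas involved, so it says nothing definitive about what to_datetime does internally) showing that the two kinds of string parse to timezone-aware and timezone-naive values respectively. The variable names are only for illustration, and it assumes Python 3.7 or later for datetime.fromisoformat.

```python
from datetime import datetime

# One string in each of the two formats from the example above.
iso_string = '2022-03-05T23:00:00+01:00'  # new format, with UTC offset
old_string = '2024-04-13 13:00'           # old format, no offset

aware = datetime.fromisoformat(iso_string)
naive = datetime.fromisoformat(old_string)

# The new format yields a timezone-aware value, the old one a naive value.
print(aware.tzinfo is not None)  # True
print(naive.tzinfo is None)      # True

# The time of day itself is available from both.
print(aware.time())  # 23:00:00
print(naive.time())  # 13:00:00
```

So a column mixing both formats mixes aware and naive values, which might be why it does not end up as a single datetime column. I have not confirmed that this is the actual cause.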
I then tried:
df_times = pd.to_datetime(df['Timestamp'], format='mixed', errors='coerce')
In this case there is no error, but every output value is NaT, even when all the date strings share a single format.
I am confused by this behaviour because parsing such strings seems to me to be well within the regular scope of to_datetime.
Can anyone please help me understand this? I would prefer a solution based on to_datetime (rather than string manipulation) because it would also let me access other date attributes later on.
Thanks in advance!