Pandas pd.read_sql_query does not return the datetime/timestamp correctly (missing seconds)

1.6k Views Asked by At

Running the following query via pandas:

conn = sqlanydb.connect( userid='xx', password='xx', eng='xx', commlinks='tcpip{host=xx port=xx}' )
query = '''select top 20 log_ts  from my_table'''
sql_output = pd.read_sql_query(query, conn, parse_dates=['log_ts'])

results in log_ts: 2020-08-28 18:08:00

However the correct format (when checking the database UI) is 2020-08-28 18:08:57.752000

I also tried pd.read_sql_query(query, conn, parse_dates={'log_ts': {'format': '%Y-%m-%d %H:%M:%S']) and couldn't get it worked.

Any hint would be appreciated.

3

There are 3 best solutions below

2
Paul Brennan On

You need MySQL v5.6.4+ for fractional seconds support.

Also, you need to not use the default sqlalchemy datetime type, you need to override the argument with the MySQL specific DATETIME type with defined milliseconds.

dtype={'date_time': DATETIME(fsp=6) } 
1
BennyLava On

Might be your formatting missing the milliseconds. How about something like below.

import datetime
datetime.datetime.strptime('2020-08-28 18:08:57.752000', '%Y-%m-%d %H:%M:%S.%f' )
0
mask On

Interestingly, when I cast it to TIME only it gives me the seconds (no dates though). But when I cast it to DATETIME it ignores the seconds.

So, I can have the original column (without second) and have another column casted as TIME and merge them in python (post processing). It's not an efficient solution though.