Pandas pd.read_sql_query does not return the datetime/timestamp correctly (missing seconds)

1.6k Views Asked by At

Running the following query via pandas:

conn = sqlanydb.connect( userid='xx', password='xx', eng='xx', commlinks='tcpip{host=xx port=xx}' )
query = '''select top 20 log_ts  from my_table'''
sql_output = pd.read_sql_query(query, conn, parse_dates=['log_ts'])

results in log_ts: 2020-08-28 18:08:00

However the correct format (when checking the database UI) is 2020-08-28 18:08:57.752000

I also tried pd.read_sql_query(query, conn, parse_dates={'log_ts': {'format': '%Y-%m-%d %H:%M:%S']) and couldn't get it worked.

Any hint would be appreciated.

3

There are 3 best solutions below

2
On

You need MySQL v5.6.4+ for fractional seconds support.

Also, you need to not use the default sqlalchemy datetime type, you need to override the argument with the MySQL specific DATETIME type with defined milliseconds.

dtype={'date_time': DATETIME(fsp=6) } 
1
On

Might be your formatting missing the milliseconds. How about something like below.

import datetime
datetime.datetime.strptime('2020-08-28 18:08:57.752000', '%Y-%m-%d %H:%M:%S.%f' )
0
On

Interestingly, when I cast it to TIME only it gives me the seconds (no dates though). But when I cast it to DATETIME it ignores the seconds.

So, I can have the original column (without second) and have another column casted as TIME and merge them in python (post processing). It's not an efficient solution though.