Querying a HDF-store

469 Views Asked by At

I created a hd5 file by

hdf=pandas.HDFStore(pfad)
hdf.append('df', df, data_columns=True)

I have a list that contains numpy.datetime64 values called expirations and try to read the portion of the hd5 table into a dataframe, that has values between expirations[1] and expirations[0] in column "expiration". Column expiration entries have the format Timestamp('2002-05-18 00:00:00').

I use the following command:

df=hdf.select('df', where=('expiration<expiration[1] & expiration>=expirations[0]'))

However, I get ValueError: Unable to parse x How should this be correctly done?

df.dtypes
Out[37]: 
adjusted stock close price           float64
expiration                    datetime64[ns]
strike                                 int64
call put                              object
ask                                  float64
bid                                  float64
volume                                 int64
open interest                          int64
unadjusted stock price               float64

df.info
Out[36]: 
<bound method DataFrame.info of             adjusted stock close price expiration  strike call put      ask  date                                                                          
2002-05-16                     5047.00 2002-05-18    4300        C  802.000   

There is more columns but they aren't of interest for the query.

1

There are 1 best solutions below

0
On

Problem solved!

I obtained expirations by

 df_expirations=df.drop_duplicates(subset='expiration')
 expirations=df['expiration'].values

This obviously changed the number format from datetime into tz datetime. I reingeneered this by using

 expirations=df['expirations']

Now this query is working: del df df=hdf.select('df', where=('expiration=expirations[1]'))

Thanks for pointing me on the datetime format problem.