Question:
1. How do I select rows that match a condition like this (pseudocode): columns['Name'] == 'Name_A' (Name_A is just an example) & columns['time'] in the range (2021-11-21 00:00:00, 2021-11-22 00:00:00)?
I have stored about 4 billion rows of data in an HDF5 file.
Now I want to select a subset of that data.
My code looks like this:
import pandas as pd
ss = pd.HDFStore("xh_data_L9.hdf5")
print(type(ss))  # <class 'pandas.io.pytables.HDFStore'>
print(ss.keys())
s_1 = ss.select('alldata',start=0,stop=500)  # read the first 500 rows as a sample
print(s_1)
ss.close()
I found that HDFStore.select has this signature:
HDFStore.select(key, where=None, start=None, stop=None, columns=None, iterator=False, chunksize=None, auto_close=False)
# These attempts do not run successfully:
s_3 = ss.select('alldata',where="Time>2021-11-21 00:00:00 & Time<2021-11-22 00:00:00)")
s_3 = ss.select('alldata',['Name'] == 'Name_A')
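I searched Google for a solution, but I still don't know how to use "where".
I found that whether "where" works depends on whether data_columns was specified when the file was created.
The columns can be queried only if the store's description (for example, the output of print(ss.info())) includes "dc->[Time,Name,Value]".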
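To make the role of data_columns concrete, here is a minimal sketch, not the original 4-billion-row file. It assumes PyTables is installed and uses a small made-up DataFrame, a temporary file named demo.h5, and only Time and Name as data columns; all of these are illustrative choices. It writes a table with data_columns and then filters it with a where string, quoting both the timestamps and the name.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data that mimics the Time/Name/Value layout.
df = pd.DataFrame(
    {
        "Time": pd.to_datetime(
            [
                "2021-11-20 12:00:00",
                "2021-11-21 06:00:00",
                "2021-11-21 18:00:00",
                "2021-11-22 09:00:00",
            ]
        ),
        "Name": ["Name_A", "Name_A", "Name_B", "Name_A"],
        "Value": [1.0, 2.0, 3.0, 4.0],
    }
)

# Temporary file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.h5")

with pd.HDFStore(path, mode="w") as store:
    # append() writes in table format; data_columns makes Time and Name
    # usable inside a where expression.
    store.append("alldata", df, data_columns=["Time", "Name"])

    # The description should contain a dc->[...] entry for the data columns.
    info_text = store.info()
    print(info_text)

    # Timestamps and strings are quoted inside the where string, and each
    # comparison is parenthesized to keep the grouping explicit.
    selected = store.select(
        "alldata",
        where=(
            "(Name == 'Name_A') & "
            "(Time > '2021-11-21 00:00:00') & "
            "(Time < '2021-11-22 00:00:00')"
        ),
    )

print(selected)
```

As far as I know, a table that was written without data_columns cannot gain them in place; the usual workaround is to read it back in chunks (select with chunksize) and append each chunk to a new store with data_columns set.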
The following is the explanation of data_columns from the official documentation: