I am comparing two HDF5 files to make sure that they match. I want to build a list of all the datasets in a group of the HDF5 file so that I can loop over them instead of entering each one manually, but I can't seem to find a way to do this. Currently I get each dataset with code like this:
tdata21 = ft['/PACKET_0/0xeda9_data_0004']
The names of the sets are located in the "PACKET_0" group. Once I arrange all of the datasets, I compare the data in the datasets in this loop:
for i in range(len(data1)):
    print("%d\t%g\t%g" % (i, data1[i], tdata1[i]))
    # Record every index where the original and received values differ.
    if data1[i] != tdata1[i]:
        x = "data file: data1 \nline:" + str(i) + "\noriginal data:" + str(data1[i]) + "\nreceived data:" + str(tdata1[i]) + "\n\n"
        correct.append(x)
If there is a smarter way to compare HDF5 files I would like to see it as well, but mainly I am just looking for a way to get the names of all of the datasets in the group into a list. Thank you.
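To get the datasets or groups that exist in an HDF5 group or file, just call list() on that group or file. Using your example, you'd have list(ft['/PACKET_0']), which gives the names of everything stored in that group. You can also just iterate over the group directly, for example with a loop of the form for name in ft['/PACKET_0'], without building the list first.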
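A minimal sketch of the listing step, assuming h5py and NumPy under Python 3. The helper name dataset_names, the second dataset name, and the "nested" subgroup are made up for illustration, and the file is built in memory only so that the snippet runs without a real data file:

```python
import h5py
import numpy as np


def dataset_names(group):
    """Names of the datasets directly inside an h5py group (subgroups are skipped)."""
    return [name for name, item in group.items() if isinstance(item, h5py.Dataset)]


# Build a small in-memory file (nothing is written to disk) to stand in for the real one.
with h5py.File("demo_listing.h5", "w", driver="core", backing_store=False) as ft:
    packet = ft.create_group("PACKET_0")
    packet.create_dataset("0xeda9_data_0004", data=np.arange(4))
    packet.create_dataset("0xeda9_data_0005", data=np.arange(4))  # hypothetical name
    packet.create_group("nested")  # a subgroup, to show that it is filtered out

    all_names = list(ft["/PACKET_0"])       # names of datasets and subgroups
    names = dataset_names(ft["/PACKET_0"])  # names of datasets only

    # Iterating over the group directly yields the same names.
    for name in ft["/PACKET_0"]:
        print(name)
```

Calling list() alone is enough when the group holds only datasets; the isinstance check matters only if subgroups may be mixed in.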
If you want to compare two datasets for equality (i.e., they have the same data), the easiest way would be to read both into NumPy arrays and compare those, for example with (dataset1[...] == dataset2[...]).all(), where dataset1 and dataset2 stand for the two h5py datasets. This returns NumPy arrays from each dataset, compares those arrays element-by-element, and returns True if they match everywhere and False otherwise.
You can combine these two concepts to compare every dataset in two different files.
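One way the two ideas might be combined is sketched below, assuming h5py and NumPy under Python 3. The function compare_groups, the helper make_file, and the dataset names d1 and d2 are all hypothetical, and the two files are created in memory purely so the example is runnable. It uses np.array_equal rather than a bare == comparison so that a shape mismatch is reported instead of raising:

```python
import h5py
import numpy as np


def compare_groups(group_a, group_b):
    """Return a list of messages describing how the datasets in two groups differ."""
    problems = []
    for name, item_a in group_a.items():
        if not isinstance(item_a, h5py.Dataset):
            continue  # only compare datasets directly inside the group
        if name not in group_b:
            problems.append(f"{name}: missing from second file")
            continue
        a = item_a[...]         # read the whole dataset into a NumPy array
        b = group_b[name][...]
        if a.shape != b.shape:
            problems.append(f"{name}: shape {a.shape} vs {b.shape}")
        elif not np.array_equal(a, b):
            bad = np.argwhere(a != b)
            problems.append(
                f"{name}: {len(bad)} differing element(s), first at index {bad[0].tolist()}"
            )
    # Also report datasets that exist only in the second group.
    for name in group_b:
        if name not in group_a:
            problems.append(f"{name}: missing from first file")
    return problems


def make_file(filename, datasets):
    """Hypothetical helper: build an in-memory HDF5 file with a PACKET_0 group."""
    f = h5py.File(filename, "w", driver="core", backing_store=False)
    group = f.create_group("PACKET_0")
    for key, data in datasets.items():
        group.create_dataset(key, data=data)
    return f


f1 = make_file("original.h5", {"d1": np.array([1.0, 2.0, 3.0]), "d2": np.array([4.0, 5.0])})
ft = make_file("received.h5", {"d1": np.array([1.0, 2.0, 3.0]), "d2": np.array([4.0, 9.0])})

problems = compare_groups(f1["/PACKET_0"], ft["/PACKET_0"])
for message in problems:
    print(message)

f1.close()
ft.close()
```

With real files you would open both with h5py.File(path, "r") and pass the two PACKET_0 groups to the function. Note that this reads each dataset fully into memory, which is fine for small datasets but may need chunked reads for very large ones.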