Getting multiple datasets from group in HDF5

3.4k Views Asked by At

I am comparing two different hdf5 files to make sure that they match. I want to create a list with all of the datasets in the group in the hdf5 file so that I can have a loop run through all of the datasets, instead of entering them manually. I cant seem to find away to do this. Currently I am getting the data set by using this code:

tdata21 = ft['/PACKET_0/0xeda9_data_0004']

The names of the sets are located in the "PACKET_0" group. Once I arrange all of the datasets, I compare the data in the datasets in this loop:

for i in range(len(data1)):
   print "%d\t%g\t%g" % (i, data1[i],tdata1[i])
   if(data1[i]!=tdata1[i]):
     x="data file: data1 \nline:"+ str(i) + "\norgianl data:"  + str(data1[i]) + "\nrecieved data:" + str(tdata1[i]) + "\n\n"
     correct.append(x)

If there is an smartier way to compare hdf5 files I would like to see it as will, but mainly I am just looking for a way to get the names of all of the datasets in the group into a list. Thank you

1

There are 1 best solutions below

0
On BEST ANSWER

To get the datasets or groups that exist in an HDF5 group or file, just call list() on that group or file. Using your example, you'd have

datasets = list(ft['/PACKET_0'])

You can also just iterate over them directly, by doing:

for name, data in ft['/PACKET_0'].items():
    # do stuff for each dataset

If you want to compare two datasets for equality (i.e., they have the same data), the easiest way would be to do this:

(dataset1.value == dataset2.value).all()

This returns NumPy arrays from each dataset, compares those arrays element-by-element, and returns True if they match everywhere and False otherwise.

You can combine these two concepts to compare every dataset in two different files.