I've got an HDF5 file with the following structure, as viewed with h5dump:
❯ h5dump -n GMTCO_npp_d20181005_t2022358_e2024003_b35959_c20181008035331888329_cspp_dev.h5
HDF5 "GMTCO_npp_d20181005_t2022358_e2024003_b35959_c20181008035331888329_cspp_dev.h5" {
FILE_CONTENTS {
group /
group /All_Data
group /All_Data/VIIRS-MOD-GEO-TC_All
dataset /All_Data/VIIRS-MOD-GEO-TC_All/Height
dataset /All_Data/VIIRS-MOD-GEO-TC_All/Latitude
dataset /All_Data/VIIRS-MOD-GEO-TC_All/Longitude
...
group /Data_Products
group /Data_Products/VIIRS-MOD-GEO-TC
dataset /Data_Products/VIIRS-MOD-GEO-TC/VIIRS-MOD-GEO-TC_Aggr
dataset /Data_Products/VIIRS-MOD-GEO-TC/VIIRS-MOD-GEO-TC_Gran_0
}
}
I am interested in using Rust (via the hdf5-rust crate) to read a string attribute of the dataset /Data_Products/VIIRS-MOD-GEO-TC/VIIRS-MOD-GEO-TC_Gran_0, which has the signature
ATTRIBUTE "N_Granule_ID" {
DATATYPE H5T_STRING {
STRSIZE 16;
STRPAD H5T_STR_NULLTERM;
CSET H5T_CSET_ASCII;
CTYPE H5T_C_S1;
}
DATASPACE SIMPLE { ( 1, 1 ) / ( 1, 1 ) }
DATA {
(0,0): "NPP002194429582"
}
}
I tried the following...
use anyhow::{Ok, Result};
use hdf5::File;
use ndarray::Array2;

fn main() -> Result<()> {
    let filename = "GMTCO_npp_d20181005_t2022358_e2024003_b35959_c20181008035331888329_cspp_dev.h5".to_string();
    let file = File::open(filename)?;
    let dataset = file.dataset("Data_Products/VIIRS-MOD-GEO-TC/VIIRS-MOD-GEO-TC_Gran_0")?;
    let attribute = dataset.attr("N_Granule_ID")?;
    // Don't know what to use here...
    let v: Array2<String> = attribute.read_2d::<String>()?;
    Ok(())
}
which seems to work up until I need to read the contents of the attribute object (attribute.read_2d() etc.) into a Rust datatype. From the DATASPACE SIMPLE { ( 1, 1 ) / ( 1, 1 ) } entry in the attribute metadata, I think the attribute is supposed to be read into a 2D array with a single entry (i.e., 1x1), but I'm not sure which read method and datatype to use.
The only example provided with the hdf5-rust package reads a compound enum-based attribute using
attribute = attr.read_1d::<Color>()?
where Color is a user-defined enum datatype that is registered as an HDF5 datatype by deriving H5Type
#[derive(H5Type, Clone, PartialEq, Debug)] // register with HDF5
#[repr(u8)]
pub enum Color {
    R = 1,
    G = 2,
    B = 3,
}
How would one do this for a non-compound datatype (f32, i32, String)?
I got a tip from one of the hdf5-rust contributors that I should be using FixedAscii<size>. For an attribute attached to the root group I did
or alternatively
and they both gave the result
and I got to the attribute payload with
which is what I was after. For the dataset attribute referenced in the original question I used
giving
Luckily the attributes I am interested in have fixed sizes which I know ahead of time.
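To picture why a size of 16 fits the N_Granule_ID value, here is a minimal sketch that uses only the standard library, not the hdf5 crate. It decodes a 16-byte, null-terminated ASCII buffer like the one h5dump reports (STRSIZE 16, H5T_STR_NULLTERM). The helper name `decode_nullterm` and the hand-built buffer are assumptions for illustration; this is not how hdf5-rust implements FixedAscii.

```rust
/// Decode a fixed-size, null-terminated ASCII buffer into a String.
/// Hypothetical helper for illustration; not part of the hdf5 crate.
fn decode_nullterm(buf: &[u8]) -> Option<String> {
    // Everything up to the first NUL byte (or the whole buffer) is payload.
    let end = buf.iter().position(|&b| b == 0).unwrap_or(buf.len());
    let payload = &buf[..end];
    // H5T_CSET_ASCII implies 7-bit characters, so reject anything else.
    if payload.is_ascii() {
        Some(String::from_utf8_lossy(payload).into_owned())
    } else {
        None
    }
}

fn main() {
    // Hand-built stand-in for the stored bytes:
    // 15 characters plus one NUL terminator fill the 16-byte slot.
    let mut raw = [0u8; 16];
    raw[..15].copy_from_slice(b"NPP002194429582");

    let id = decode_nullterm(&raw).expect("buffer should be ASCII");
    println!("{id} ({} chars)", id.len());
}
```

The granule ID has 15 characters, so the 16th byte of the slot is the terminator, which is consistent with the STRSIZE reported by h5dump.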
I was also able to read in a "vector" string attribute (something like a list of filenames), with the signature
where STRSIZE=104 is the length of the longest string (number of chars plus terminator?). The filenames are of differing sizes, but as long as the argument to FixedAscii<> is equal to or greater than the length of the longest filename, it works...
giving
This basically covers the most complicated use case for the files I am reading.
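The padding behaviour of the "vector" string attribute can be pictured with another standard-library-only sketch. It assumes, as the N_Granule_ID example suggests, that every entry occupies a slot of the same width and that shorter strings are padded with NUL bytes, leaving at least one terminator per slot. The helpers `pack_fixed_width` and `split_fixed_width`, the slot width of 16 and the file names are all hypothetical; none of them come from hdf5-rust or from the file in question.

```rust
/// Pack names into NUL-padded slots of `width` bytes each.
/// Assumption: every slot keeps at least one NUL terminator,
/// so a name must be strictly shorter than `width`.
fn pack_fixed_width(names: &[&str], width: usize) -> Option<Vec<u8>> {
    let mut buf = Vec::with_capacity(names.len() * width);
    for name in names {
        if name.len() >= width {
            return None; // name does not fit next to its terminator
        }
        buf.extend_from_slice(name.as_bytes());
        // Fill the rest of the slot with NUL bytes.
        buf.resize(buf.len() + (width - name.len()), 0);
    }
    Some(buf)
}

/// Split a flat buffer into slots of `width` bytes and trim the padding.
fn split_fixed_width(buf: &[u8], width: usize) -> Vec<String> {
    if width == 0 {
        return Vec::new(); // chunks_exact would panic on a zero width
    }
    buf.chunks_exact(width)
        .map(|slot| {
            let end = slot.iter().position(|&b| b == 0).unwrap_or(width);
            String::from_utf8_lossy(&slot[..end]).into_owned()
        })
        .collect()
}

fn main() {
    // Hypothetical file names of differing lengths sharing one slot width.
    let names = ["a.h5", "longer_name.h5"];
    let buf = pack_fixed_width(&names, 16).expect("names fit in 16 bytes");
    for name in split_fixed_width(&buf, 16) {
        println!("{name}");
    }
}
```

Under these assumptions, any slot width larger than the longest name round-trips every entry, which matches the observation that the FixedAscii<> argument only needs to be at least as large as the longest filename.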