Opening HDF5 file without modifying file timestamp


I am currently writing a function in R to convert the output of an external program (in HDF5) on a Linux machine to a different file format. I need to preserve the file timestamps because of the way my pipeline is structured (mainly for reproducibility purposes).

My function currently just wraps rhdf5::H5Fopen() (with some extra data transformation):

function(path_to_file){
  data <- rhdf5::H5Fopen(path_to_file,
    # preserve original file structure
    native = TRUE
  )

  data <- as.data.frame(data[["slot1"]])

  return(data)
}

However, the file's last-modified timestamp is updated every time I read the file through this function. Is there any way to keep the original timestamp when opening the file? Thanks


There are 2 best solutions below

Grimbough On BEST ANSWER

If you open the file in read-only mode, the timestamp won't be modified. H5Fopen() opens files read/write by default, and that is what updates the modification time. For example:

library(rhdf5)

h5file <- '/tmp/h5ex_t_array.h5'
file.mtime( h5file )
#> [1] "2022-06-27 15:23:43 CEST"

# read-only: the modification time is left untouched
fid <- H5Fopen( h5file, flags = 'H5F_ACC_RDONLY' )
H5Fclose(fid)
file.mtime( h5file )
#> [1] "2022-06-27 15:23:43 CEST"

# default flags (read/write): the modification time changes
fid <- H5Fopen( h5file )
H5Fclose(fid)
file.mtime( h5file )
#> [1] "2024-02-08 12:24:53 CET"

Remember that you should always pair an open operation with a close in HDF5, otherwise you'll end up with potential file lock issues and memory leaks. In this case that would be H5Fclose().
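
To make the pairing concrete, here is a minimal sketch of the question's function with a read-only open and a guaranteed close. The name read_slot_readonly is made up for illustration, and the sketch assumes "slot1" is a dataset (not a group) at the root of the file. H5Dopen(), H5Dread() and H5Dclose() are the low-level rhdf5 calls for datasets; on.exit() makes sure both handles are released even if the read fails.

```r
# Hypothetical wrapper; the function name and the "slot1" default are illustrative.
read_slot_readonly <- function(path_to_file, slot = "slot1") {
  # Read-only open, so the modification time is left alone
  fid <- rhdf5::H5Fopen(path_to_file, flags = "H5F_ACC_RDONLY", native = TRUE)
  # Registered first; with the default after = TRUE this handler runs last
  on.exit(rhdf5::H5Fclose(fid), add = TRUE)

  # Assumption: `slot` names a dataset inside the file
  did <- rhdf5::H5Dopen(fid, slot)
  # after = FALSE puts this handler first, so the dataset closes before the file
  on.exit(rhdf5::H5Dclose(did), add = TRUE, after = FALSE)

  as.data.frame(rhdf5::H5Dread(did))
}
```

Comparing file.mtime() before and after a call is a quick way to confirm the timestamp is unchanged on your system.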

It might be easier to use h5read(), which opens the file read-only by default and closes it automatically.
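
As a rough sketch, the function from the question could then shrink to something like the following. The name read_slot and the slot argument are made up here; native = TRUE is carried over from the question, and whether as.data.frame() gives a sensible result depends on what "slot1" actually contains.

```r
# Hypothetical wrapper around h5read(); the names are illustrative.
read_slot <- function(path_to_file, slot = "slot1") {
  # h5read() opens the file read-only and closes it again when it is done
  data <- rhdf5::h5read(path_to_file, name = slot, native = TRUE)
  as.data.frame(data)
}
```

There is no handle to keep track of in this version, so there is nothing to forget to close.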

Billy34 On

This is something of a hack: record the modification time before opening the file and restore it when the function exits.

function(path_to_file){
  mtime <- file.info(path_to_file)$mtime
  on.exit({
    Sys.setFileTime(path_to_file, mtime)
  })

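  fid <- rhdf5::H5Fopen(path_to_file,
    # preserve original file structure
    native = TRUE
  )
  # close the handle before the timestamp is restored (after = FALSE runs this first)
  on.exit(rhdf5::H5Fclose(fid), add = TRUE, after = FALSE)

  data <- as.data.frame(fid[["slot1"]])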

  return(data)
}
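
The save-and-restore idea does not depend on HDF5, so it can be pulled out into a small helper and tried on any file. The sketch below uses base R only; with_preserved_mtime and its reader argument are hypothetical names, and the demo deliberately modifies a throwaway text file to show the timestamp being put back.

```r
# Hypothetical helper: run reader(path), then restore the original mtime.
with_preserved_mtime <- function(path, reader) {
  mtime <- file.mtime(path)
  # on.exit() also runs if reader() fails, so the timestamp is always restored
  on.exit(Sys.setFileTime(path, mtime), add = TRUE)
  reader(path)
}

# Demo on a temporary text file (no HDF5 needed)
tmp <- tempfile()
writeLines("a", tmp)
Sys.setFileTime(tmp, as.POSIXct("2020-01-01 00:00:00", tz = "UTC"))
before <- file.mtime(tmp)

result <- with_preserved_mtime(tmp, function(p) {
  cat("b\n", file = p, append = TRUE)  # deliberately touches the file
  readLines(p)
})
after <- file.mtime(tmp)
```

Keep in mind that this only hides the symptom: the file is still opened for writing, and on Linux the inode change time (ctime) is updated regardless. Opening read-only avoids the modification in the first place.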