How do I prevent xarray.Dataset.to_netcdf() forcibly reordering the dimensions to put 'time' first?


I have a netCDF file, data.nc. Running ncdump -h data.nc shows that the dimensions are:

dimensions:
    cell = 20480 ;
    nv = 3 ;
    time = UNLIMITED ; // (12 currently)

In a Jupyter notebook, when I read this file in and examine it:
import xarray as xr
data = xr.open_dataset('data.nc')
data

I get the expected output of 'Dimensions: (cell: 20480, nv: 3, time: 12)'.

All fine so far. However, if I then save a copy of data as a new netCDF file:
data.to_netcdf(path='data_copy.nc')
ncdump -h data_copy.nc shows:

dimensions:
    time = UNLIMITED ; // (12 currently)
    cell = 20480 ;
    nv = 3 ;

Oddly enough, though, if I read in this copy with:
data_copy = xr.open_dataset('data_copy.nc')
data_copy

I correctly get the same 'Dimensions: (cell: 20480, nv: 3, time: 12)' as the original.
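
To make the on-disk difference easy to check from a script, here is a minimal sketch that pulls the dimension order out of `ncdump -h` text. It only parses the header output shown above; the function name `dimension_order` is my own, not part of xarray or the netCDF tools, and the parsing assumes the usual `name = size ;` layout of the dimensions block.

```python
import re


def dimension_order(header_text):
    """Return dimension names in the order they appear in `ncdump -h` output.

    Hypothetical helper: assumes a "dimensions:" section with one
    "name = size ;" entry per line, as in the listings in this question.
    """
    names = []
    in_dims = False
    for line in header_text.splitlines():
        stripped = line.strip()
        if stripped == "dimensions:":
            in_dims = True
            continue
        if in_dims:
            # The next section header (e.g. "variables:") or the closing
            # brace ends the dimensions block.
            if stripped.endswith(":") or stripped == "}":
                break
            match = re.match(r"(\w+)\s*=", stripped)
            if match:
                names.append(match.group(1))
    return names


# Header excerpts copied from the ncdump output for data.nc and data_copy.nc.
original = """dimensions:
    cell = 20480 ;
    nv = 3 ;
    time = UNLIMITED ; // (12 currently)
"""
copy = """dimensions:
    time = UNLIMITED ; // (12 currently)
    cell = 20480 ;
    nv = 3 ;
"""

print(dimension_order(original))  # ['cell', 'nv', 'time']
print(dimension_order(copy))      # ['time', 'cell', 'nv']
```

In practice the text would come from running ncdump -h on each file (for example through subprocess), which I have left out so the sketch runs without the netCDF tools installed.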

I thought that this might have something to do with netCDF versions, as in this answer.
ncdump -k data.nc reports classic, which is particularly odd because that answer says "there's no way to make time unlimited and have it be the last dimension in a netCDF3 file". Yet that is precisely the situation with data.nc: a netCDF3 file with time as the unlimited, last-listed dimension.

I have tried several of the format options listed in the xarray.Dataset.to_netcdf documentation, e.g. data.to_netcdf(path='data_copy.nc', format='NETCDF4'), but all of them still show

dimensions:
    time = UNLIMITED ; // (12 currently)
    cell = 20480 ;
    nv = 3 ;

with ncdump, and yet the correct order when read back in with xr.open_dataset and examined as Datasets.

I've also tried passing engine='netcdf4' (when saving with format='NETCDF4') and unlimited_dims='time', but ncdump shows 'time' as the first dimension of the saved copy no matter what.

I've read every related question I can think of, but the two most frequent suggestions don't seem applicable. I don't want to reorder dimensions with ncpdq, because that changes those dimensions internally for each variable, not for the file as a whole, and I would rather prevent the problem than correct it. It also doesn't seem like a case for xarray.Dataset.transpose(), because the dimensions are already correct when the data is in Dataset form.

I've also tried reordering the dimensions of data_copy.nc with ncks as outlined here, but ncks -A -v cell data_copy.nc outfile.nc gave me:

ncks: ERROR nco_xtr_mk() reports user-supplied variable name or regular expression 'cell' is not in and/or does not match contents of input file

which I don't understand (and which brings me back to prevention being preferable to correction).

Why does this dimension reordering happen when I save the Dataset as a netCDF file, and how can I prevent it?
