How can I rename a Zarr array without writing a new store?


I have a Zarr datastore, but I need to rename one of the dimensions. Let's say I have this (from xarray docs):

import numpy as np
import pandas as pd
import xarray as xr

data = np.random.rand(4, 3)
locs = ["IA", "IL", "IN"]
times = pd.date_range("2000-01-01", periods=4)

da = xr.DataArray(data, coords=[times, locs], dims=["time", "space"])
ds = xr.Dataset({'my_var': da})
ds.to_zarr("my_zarr.zarr")


But I want my space dimension to be called state instead.

And I don't want to write a new Zarr store, I just want to change that one name.

How can I do that? (I've found one hacky way - see below)


There are 2 best solutions below


My hacky solution is to rename them manually:

  1. Rename the directory:
cd my_zarr.zarr
mv space state
  2. Manually rename the dimension in the relevant .zattrs files:

OLD my_zarr.zarr/my_var/.zattrs

{
    "_ARRAY_DIMENSIONS": [
        "time",
        "space"
    ]
}

OLD my_zarr.zarr/state/.zattrs

{
    "_ARRAY_DIMENSIONS": [
        "space"
    ]
}

NEW my_zarr.zarr/my_var/.zattrs

{
    "_ARRAY_DIMENSIONS": [
        "time",
        "state"
    ]
}

NEW my_zarr.zarr/state/.zattrs

{
    "_ARRAY_DIMENSIONS": [
        "state"
    ]
}

And voila: the dataset now opens with the dimension named state.

But that's pretty hacky and I don't like how manual it is. It beats writing a whole new zarr store, but is there a better way?


While your answer will work @j sad, it's worth noting that there's a programmatic way to rename variables via zarr's API:

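import zarr
# zarr-python 2.x: rename() expects a store object (e.g. a DirectoryStore), not the group returned by zarr.open
store = zarr.DirectoryStore("my_zarr.zarr")
zarr.storage.rename(store, 'my_var', 'my_new_var')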

This renames a variable by moving its data from the old name's folder into a folder with the new name, which replaces the manual approach if the only goal is to rename a variable.

Your question is actually more complex, because you are trying to rename a dimension of the dataset that is shared across variables. Unfortunately, while you can use zarr.storage.rename to rename the coordinate array associated with that dimension, it will not update the references to the dimension that xarray expects to find in each variable's attributes. Thus, the "Manually rename the dimension in the relevant .zattrs files" step is still needed, as far as I can tell.
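
That remaining step can at least be scripted. Below is a minimal sketch using only the Python standard library. It assumes a Zarr v2 store on the local filesystem whose arrays all sit directly under the root (no nested groups). The function name rename_dimension is hypothetical, not part of zarr or xarray.

```python
import json
import os


def rename_dimension(store_path, old, new):
    """Rename dimension `old` to `new` in a Zarr v2 directory store, in place.

    Hypothetical helper: assumes every array is a direct child of the root.
    """
    # Move the coordinate array's directory, if the dimension has one.
    old_dir = os.path.join(store_path, old)
    if os.path.isdir(old_dir):
        os.rename(old_dir, os.path.join(store_path, new))

    # Patch each array's .zattrs that still refers to the old dimension name.
    for entry in sorted(os.listdir(store_path)):
        attrs_path = os.path.join(store_path, entry, ".zattrs")
        if not os.path.isfile(attrs_path):
            continue  # skips root-level files such as .zgroup
        with open(attrs_path) as f:
            attrs = json.load(f)
        dims = attrs.get("_ARRAY_DIMENSIONS", [])
        if old in dims:
            attrs["_ARRAY_DIMENSIONS"] = [new if d == old else d for d in dims]
            with open(attrs_path, "w") as f:
                json.dump(attrs, f, indent=4)
```

For the store in the question, the call would be rename_dimension("my_zarr.zarr", "space", "state"). Only small JSON files are rewritten and no chunk data is copied. If the store also has consolidated metadata (a .zmetadata file at the root), that file would go stale and would need to be regenerated, for example with zarr.consolidate_metadata.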