The following code writes and loads a NumPy array plus metadata in a compressed .npz file (the compression is useless here because the data is random, but never mind):
import numpy as np
# save
D = {"x": np.random.random((10000, 1000)), "metadata": {"date": "20221123", "user": "bob", "name": "abc"}}
with open("test.npz", "wb") as f:
    np.savez_compressed(f, **D)
# load
D2 = np.load("test.npz", allow_pickle=True)
print(D2["x"])
print(D2["metadata"].item()["date"])
Let's say we want to change only one metadata field:
D["metadata"]["name"] = "xyz"
Is there a way to rewrite only D["metadata"] in test.npz on disk, rather than the whole file, since D["x"] has not changed?
In my case, the .npz file can be anywhere from 100 MB to 4 GB, which is why rewriting only the metadata would be worthwhile.
Ultimately, the solution that I could get to work (thus far) is the one I originally thought of, with zipfile.
Let's say we want to change the metadata:
np.load returns an NpzFile, which is a lazy loader. However, NpzFile objects aren't directly writable. We also cannot do something like D["metadata"] = new_metadata until D has been converted to a dict, and that loses the lazy functionality.
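As a rough illustration of the zipfile idea (a sketch, not necessarily the exact code this answer had in mind), one option is to open the .npz in append mode and write a new metadata.npy entry, leaving the large array entry untouched. The helper name append_member is hypothetical. The sketch relies on the assumption that, when an archive holds two entries with the same name, Python's zipfile resolves the name to the most recently written one, which is what np.load then reads.

```python
import io
import os
import tempfile
import warnings
import zipfile

import numpy as np


def append_member(npz_path, key, value):
    """Hypothetical helper: append a fresh `<key>.npy` entry to an existing .npz.

    The old entry with the same name is NOT removed; it stays in the archive
    as dead bytes, so the file grows slightly on every call.
    """
    # Serialize the value in .npy format in memory (dicts become 0-d object arrays).
    buf = io.BytesIO()
    np.save(buf, np.asanyarray(value), allow_pickle=True)

    with warnings.catch_warnings():
        # zipfile emits a "Duplicate name" UserWarning when appending an existing name.
        warnings.simplefilter("ignore", UserWarning)
        with zipfile.ZipFile(npz_path, mode="a", compression=zipfile.ZIP_DEFLATED) as zf:
            zf.writestr(key + ".npy", buf.getvalue())


# Small stand-in for the large array in the question.
path = os.path.join(tempfile.mkdtemp(), "test.npz")
x = np.random.random((100, 10))
np.savez_compressed(path, x=x, metadata={"date": "20221123", "user": "bob", "name": "abc"})

# Only the small metadata entry is written; the x entry is not touched.
append_member(path, "metadata", {"date": "20221123", "user": "bob", "name": "xyz"})

with np.load(path, allow_pickle=True) as D2:
    new_name = D2["metadata"].item()["name"]
    x_loaded = D2["x"]

print(new_name)
```

The trade-off is that stale entries accumulate, and the NpzFile's files listing may show the key more than once, so an occasional full rewrite would be needed to compact the archive.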