I'm getting a TypeError when converting .h5 (HDF5) file into .zarr format


I'm trying to convert an .h5 file to .zarr format, but I'm getting the following error:

TypeError: Object of type bytes_ is not JSON serializable

I'm posting my code below:

import h5py
import zarr
from sys import stdout

source = h5py.File('file.h5', 'r')
dest = zarr.open('file.zarr', 'w')
zarr.convenience.copy_all(source, dest, log=stdout, dry_run=False, if_exists='replace')

I checked the zarr documentation and some GitHub issues but couldn't figure out how to solve this error.

Here are the links I've already looked at, where I couldn't find (or, to be honest, understand) anything useful:

https://zarr.readthedocs.io/en/stable/api/convenience.html#zarr.convenience.copy_all

https://github.com/zarr-developers/zarr-python/issues/87

Here's the full traceback of the error:

Traceback (most recent call last):
  File "/home/prk/Documents/IISER-stuff/scripts/h5_to_zarr.py", line 8, in <module>
    zarr.convenience.copy_all(source, dest, log=stdout, dry_run=False, if_exists='replace')
  File "/home/prk/Documents/IISER-stuff/venv/lib/python3.8/site-packages/zarr/convenience.py", line 1063, in copy_all
    c, s, b = _copy(
  File "/home/prk/Documents/IISER-stuff/venv/lib/python3.8/site-packages/zarr/convenience.py", line 903, in _copy
    ds.attrs.update(source.attrs)
  File "/home/prk/Documents/IISER-stuff/venv/lib/python3.8/site-packages/zarr/attrs.py", line 120, in update
    self._write_op(self._update_nosync, *args, **kwargs)
  File "/home/prk/Documents/IISER-stuff/venv/lib/python3.8/site-packages/zarr/attrs.py", line 74, in _write_op
    return f(*args, **kwargs)
  File "/home/prk/Documents/IISER-stuff/venv/lib/python3.8/site-packages/zarr/attrs.py", line 131, in _update_nosync
    self._put_nosync(d)
  File "/home/prk/Documents/IISER-stuff/venv/lib/python3.8/site-packages/zarr/attrs.py", line 113, in _put_nosync
    self.store[self.key] = json_dumps(d)
  File "/home/prk/Documents/IISER-stuff/venv/lib/python3.8/site-packages/zarr/util.py", line 24, in json_dumps
    return json.dumps(o, indent=4, sort_keys=True, ensure_ascii=True,
  File "/usr/lib/python3.8/json/__init__.py", line 234, in dumps
    return cls(
  File "/usr/lib/python3.8/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/usr/lib/python3.8/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/usr/lib/python3.8/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.8/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/usr/lib/python3.8/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bytes_ is not JSON serializable

There is 1 solution below


Do you have the resources to try this command-line tool? https://github.com/saalfeldlab/n5-utils

After installation, call:

n5-copy -i file.h5 -o file.zarr

It may just be a glitch in how byte metadata is handled by the JSON exporter, and this may at least get you the copy you need.
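
If installing a separate tool isn't an option, a possible workaround in Python is to copy the data without attributes and then write the attributes yourself after converting any byte strings to text. This is only a sketch, assuming zarr 2.x (as in the traceback) and that the failure really comes from HDF5 attributes stored as bytes. The helper names to_json_safe, clean_attrs and copy_attrs are made up for illustration and are not part of zarr or h5py, and decoding as UTF-8 is an assumption about how the file was written.

```python
import json


def to_json_safe(value, encoding="utf-8"):
    """Return a JSON-serializable version of an attribute value (hypothetical helper)."""
    # numpy.bytes_ is a subclass of the built-in bytes, so this catches both.
    if isinstance(value, bytes):
        return value.decode(encoding, errors="replace")
    # NumPy scalars and arrays expose tolist(), which yields plain Python objects.
    if hasattr(value, "tolist"):
        return to_json_safe(value.tolist(), encoding)
    if isinstance(value, (list, tuple)):
        return [to_json_safe(v, encoding) for v in value]
    if isinstance(value, dict):
        return {str(k): to_json_safe(v, encoding) for k, v in value.items()}
    return value


def clean_attrs(attrs, encoding="utf-8"):
    """Build a plain dict of JSON-safe attributes from a mapping of attributes."""
    return {key: to_json_safe(val, encoding) for key, val in attrs.items()}


def copy_attrs(source, dest):
    """Copy cleaned attributes from an h5py file/group onto an already copied zarr group."""
    dest.attrs.update(clean_attrs(source.attrs))

    def visit(name, obj):
        # name is the path relative to source; the same path should exist in dest.
        dest[name].attrs.update(clean_attrs(obj.attrs))

    source.visititems(visit)


# Possible usage with the files from the question (not run here):
#   source = h5py.File('file.h5', 'r')
#   dest = zarr.open('file.zarr', 'w')
#   zarr.convenience.copy_all(source, dest, without_attrs=True, if_exists='replace')
#   copy_attrs(source, dest)
```

The without_attrs=True argument tells copy_all to skip the attribute copy that raises the error, so the arrays and groups are copied first and the cleaned attributes are added in a second pass. If some attributes hold binary data that is not text, decoding them will not be meaningful and they would need separate handling.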