Read json from file.json.bz2 quickly


I'm trying to open a bz2 file and read the json file contained inside. My current implementation looks like

import bz2
import pandas as pd

with bz2.open(bz2_file_path, 'rb') as f:
    json_content = f.read()
json_df = pd.read_json(json_content.decode('utf-8'), lines=True)

I need to repeat this process many times, and the with block takes up the bulk of the time. Is there a way to speed this up?

There is 1 answer below.

Answered by orip

The following variation of your code won't necessarily read all the data into memory at once. Passing encoding to bz2.open() in text mode ('rt') lets the decoding happen on the fly, and pandas.read_json() can accept a file-like object to read incrementally.

import bz2
import pandas as pd

with bz2.open(bz2_file_path, 'rt', encoding='utf-8') as f:
    json_df = pd.read_json(f, lines=True)
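
To see what text mode is doing independently of pandas, here is a minimal standard-library sketch. It assumes the file is in JSON Lines format (one object per line, as lines=True implies). The helper name iter_json_lines, the sample file, and its contents are made up for illustration and are not part of the original code.

```python
import bz2
import json
import os
import tempfile


def iter_json_lines(path):
    """Yield one parsed JSON object per line of a bz2-compressed JSON Lines file."""
    # 'rt' mode adds a text layer over the decompressor, so data is
    # decompressed and decoded as the loop consumes lines, rather than
    # being loaded with a single read() call.
    with bz2.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            if line.strip():  # skip blank lines
                yield json.loads(line)


# Build a tiny sample file so the sketch runs on its own (hypothetical data).
sample_path = os.path.join(tempfile.mkdtemp(), "sample.json.bz2")
with bz2.open(sample_path, "wt", encoding="utf-8") as f:
    for i in range(3):
        f.write(json.dumps({"id": i, "name": f"row{i}"}) + "\n")

records = list(iter_json_lines(sample_path))
print(records)
```

If a DataFrame is still the goal, pd.DataFrame(records) accepts a list of dicts like this one. pd.read_json() also takes a chunksize argument together with lines=True, which returns an iterator of smaller DataFrames. Whether either variant is actually faster for your files is something to measure, since the decompression work has to happen either way.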