Reading and parsing JSON with ijson


I have some big json files with the following structure:

[
  {
    "url": "",
    "publishedDate": "",
    "modifiedDate": "",
    "title": "",
    "summary": "",
    "content": "",
    "language": "",
    "section": "",
    "tags": [],
    "authors": []
  },
  {
    "url": "",
    "publishedDate": "",
    "modifiedDate": "",
    "title": "",
    "summary": "",
    "content": "",
    "language": "",
    "section": "",
    "tags": [],
    "authors": []
  },
  ...
]

But loading these big JSON files with Python's default json library ends up consuming too much memory, so I've searched for alternatives. One of them is ijson, which is supposed to use far less memory because it parses the file incrementally instead of building the whole document in memory at once.

The problem is that I don't know how to use it (I'm new to Python, coming from Java), and most tutorials I've found don't parse JSON shaped like the one above. How can I make ijson yield a dictionary for each item in the top-level list?

Thanks in advance.
