I have a JSON document which can contain over a million records (each record is a simple object with some fields, but the hierarchy to get to it is about 5 levels deep). I need to find the records whose fields contain certain values, preferably in a generic way, in node.js.
I tried jsonpath-plus, which does exactly what I want. The problem is that processing that much data takes about 25 seconds (if I return only the data, without the path, it takes 10s).
I also tried json_query (which is an adaptation of DOJO JSonQuery to node.js). It is really fast (1s), but it only returns the data and not the path to the data.
I was wondering if you can think of alternatives I can use, or how I can make jsonpath-plus work faster.
Clarification: I don't generate the data. I receive it with no way of controlling that. I receive the full JSON blob and then I have to perform a few (about 5) queries on it before I get a new one.
Sincerely, Elad
But, you could load it into a database and have it generate an index for you, pre-optimized for the queries you need. (Alternatively, you could build such an index in your own application.)
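To make the "build such an index in your own application" idea concrete, here is a minimal sketch. It assumes the blob has already been parsed with JSON.parse and that the queries are exact matches on a single field. The function name buildIndex and the sample data are made up for illustration and do not come from jsonpath-plus or json_query. It walks the tree once and records, for every object that has the field, both the record and the path to it, so each later query is a single Map lookup.

```javascript
// Hypothetical helper: walk the parsed JSON once and index every object that
// has `field`, keyed by that field's value. Each entry keeps the path as an
// array of keys (array positions appear as strings such as '0').
function buildIndex(root, field) {
  const index = new Map(); // field value -> [{ path, record }, ...]
  const stack = [{ node: root, path: [] }];
  while (stack.length > 0) {
    const { node, path } = stack.pop();
    if (node === null || typeof node !== 'object') continue; // skip primitives
    if (!Array.isArray(node) && Object.prototype.hasOwnProperty.call(node, field)) {
      const key = node[field];
      if (!index.has(key)) index.set(key, []);
      index.get(key).push({ path, record: node });
    }
    // Descend into children; an explicit stack avoids deep recursion.
    for (const k of Object.keys(node)) {
      stack.push({ node: node[k], path: path.concat(k) });
    }
  }
  return index;
}

// Made-up sample data, about three levels deep.
const sample = { a: { b: [{ id: 1, color: 'red' }, { id: 2, color: 'blue' }] } };
const byColor = buildIndex(sample, 'color');
console.log(byColor.get('red'));
```

Building the index costs one full pass over the data (plus the memory for the stored paths), so it only pays off if that pass is cheaper than the queries it replaces.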
You'll have to determine whether or not building an index to run these queries is worth it. If you're only doing 5 queries, and they require a full search of the data, then it might be faster just to slog through it the way you are now.
One other thought... is this line-delimited JSON where each record is its own object? If so, you could parallelize this searching.
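If the data does turn out to be line-delimited, the parallel search could look roughly like the sketch below, which uses Node's built-in worker_threads module. The names parallelSearch and searchChunk, the default of 4 workers, and the exact-match test on one field are all assumptions for illustration. Here the "path" to a match is simply its line number.

```javascript
const { Worker } = require('worker_threads');

// Code run inside each worker: parse its share of the lines, keep the matches.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  const { lines, offset, field, value } = workerData;
  const matches = [];
  lines.forEach((line, i) => {
    if (line.trim() === '') return; // ignore blank lines
    const record = JSON.parse(line);
    if (record[field] === value) matches.push({ line: offset + i, record });
  });
  parentPort.postMessage(matches);
`;

// Hypothetical helper: search one chunk of lines in its own worker thread.
function searchChunk(lines, offset, field, value) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { lines, offset, field, value },
    });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Hypothetical entry point: split the text into chunks, one per worker.
async function parallelSearch(text, field, value, workerCount = 4) {
  const lines = text.split('\n');
  const chunkSize = Math.ceil(lines.length / workerCount);
  const jobs = [];
  for (let start = 0; start < lines.length; start += chunkSize) {
    jobs.push(searchChunk(lines.slice(start, start + chunkSize), start, field, value));
  }
  const results = await Promise.all(jobs);
  return results.flat(); // matches in original line order
}
```

Each chunk is copied to its worker, so whether this beats a single-threaded scan depends on how expensive the parsing and matching are compared with that copy. It is worth measuring on the real data before committing to it.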