I'm doing some data munging which would be quite a bit simpler if I could stick a bunch of dictionaries in an in-memory database, then run simple queries against it.
For example, something like:
    people = db([
        {"name": "Joe", "age": 16},
        {"name": "Jane", "favorite_color": "red"},
    ])
    over_16 = people.filter(age__gt=16)
    with_favorite_colors = people.filter(favorite_color__exists=True)
There are three complicating factors, though:
- Some of the values will be Python objects, and serializing them is out of the question (too slow, and it breaks identity). Of course, I could work around this (e.g., by storing all the items in a big list and serializing their indexes into that list), but that could take a fair bit of fiddling.
- There will be thousands of items, and I will be running lookup-heavy operations (like graph traversals) against them, so it must be possible to perform efficient (i.e., indexed) queries.
- As in the example, the data is unstructured, so systems that require me to predefine a schema would be tricky.
So, does such a thing exist? Or will I need to kludge something together?
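
If I do end up kludging, this is roughly what I have in mind: records stay plain dicts held by reference (no serialization, identity preserved), a per-key hash index serves exact-match queries without any declared schema, and Django-style double-underscore lookups are parsed by hand. This is only a minimal sketch; `DictDB` and its lookup names are invented for illustration:

    from collections import defaultdict

    class DictDB(object):
        """In-memory store for dicts. Records are kept by reference, so
        nothing is serialized and object identity is preserved. A hash
        index is built per key, over only the keys each record actually
        has, so no schema needs to be declared up front."""

        def __init__(self, records):
            self.records = list(records)
            self.index = defaultdict(lambda: defaultdict(list))
            for rec in self.records:
                for key, value in rec.items():
                    try:
                        self.index[key][value].append(rec)
                    except TypeError:
                        pass  # unhashable value; queries on it fall back to scans

        def filter(self, **lookups):
            """Django-style lookups: name="Joe", age__gt=16, x__exists=True."""
            results = self.records
            for lookup, arg in lookups.items():
                key, _, op = lookup.partition("__")
                if not op:
                    # Exact match: served straight from the hash index when
                    # we're still looking at the full record set.
                    if results is self.records:
                        results = list(self.index[key].get(arg, []))
                    else:
                        results = [r for r in results if r.get(key) == arg]
                elif op == "exists":
                    results = [r for r in results if (key in r) == bool(arg)]
                elif op == "gt":
                    results = [r for r in results if key in r and r[key] > arg]
                else:
                    raise ValueError("unsupported lookup: %r" % op)
            return results

Used like the example above:

    people = DictDB([
        {"name": "Joe", "age": 16},
        {"name": "Jane", "favorite_color": "red"},
    ])
    print(people.filter(age__gt=15))                   # Joe's record
    print(people.filter(favorite_color__exists=True))  # Jane's record

Range lookups like `__gt` still scan in this sketch; keeping a sorted list of values per key and querying it with `bisect` would make those indexed as well.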
I started developing one yesterday, and it isn't published yet. It indexes your objects and lets you run fast queries against them. All data is kept in RAM, and I'm thinking about smart load and save methods; for testing purposes, it loads and saves through cPickle.
Let me know if you are still interested.
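
For a rough idea of what I mean by that last part, here is a sketch of the pickle-based load/save, assuming a store along the lines of the `DictDB` sketch above (`save` and `load` are made-up names):

    try:
        import cPickle as pickle  # Python 2
    except ImportError:
        import pickle             # Python 3 has no cPickle

    def save(db, path):
        # Dump only the raw records; the indexes are cheap to rebuild.
        with open(path, "wb") as f:
            pickle.dump(db.records, f, pickle.HIGHEST_PROTOCOL)

    def load(path):
        with open(path, "rb") as f:
            return DictDB(pickle.load(f))  # indexes rebuilt in __init__

Pickling obviously reintroduces the serialization cost and breaks identity across processes, which is why it's only for test fixtures; the live data never leaves RAM.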