I want to manage many files in such a way that each file stays on disk and my app works with only part of the data at a time.
I have to manage two types of files: book-like text files and CSV files containing time series. For every file I may generate multiple dimensionally reduced copies, which I want to keep and cache so I don't have to regenerate them.
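To make the caching requirement concrete, here is a minimal sketch of what I have in mind, using only the standard library. The names `cached_copy_path`, `get_reduced` and `reduce_fn` are hypothetical, and I am assuming the reduction parameters are JSON-serializable and that the reducer writes its result to the path it is given:

```python
import hashlib
import json
from pathlib import Path


def cached_copy_path(source, params, cache_dir):
    """Build a cache path unique to the source file's state and the parameters."""
    source = Path(source)
    stat = source.stat()
    # Key on path, size and mtime so that editing the source invalidates old copies.
    key = json.dumps(
        [str(source.resolve()), stat.st_size, stat.st_mtime_ns, params],
        sort_keys=True,
    )
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"{source.stem}-{digest}.cache"


def get_reduced(source, params, reduce_fn, cache_dir):
    """Return the path of the reduced copy, generating it only if it is missing."""
    target = cached_copy_path(source, params, cache_dir)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        tmp = target.with_suffix(".tmp")
        # Hypothetical reducer: reads `source`, writes the reduced data to `tmp`.
        reduce_fn(Path(source), tmp, **params)
        # Rename only after a successful write, so a crash leaves no partial cache file.
        tmp.replace(target)
    return target
```

Calling `get_reduced` twice with the same file and parameters should run the reducer only once; different parameters give a different cached copy.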
I can see two ways of doing this:
1. Create my own library that uses memory mapping.
2. Use a tool such as Dask.
Dask seems like a good choice, but I cannot find a way to loop over a range of a Bag object and/or access a range of its elements, e.g.:
for i in bag_obj[2:10]: ...
bag_obj[5:10]
The only thing I can do is call .take().
Second, is there a way to map a list to a file and perform ordinary list operations on it as if it were in memory?
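To make this second question concrete, here is a rough sketch of the kind of object I mean, built on the standard library `mmap` module (option 1 above). The class name `FileBackedLines` is hypothetical, the view is read-only, and I am assuming a non-empty, newline-delimited text file; appending or editing in place would need more work:

```python
import mmap
from collections.abc import Sequence


class FileBackedLines(Sequence):
    """Read-only, list-like view of a text file's lines, backed by mmap."""

    def __init__(self, path, encoding="utf-8"):
        self._file = open(path, "rb")
        # Assumption: the file is not empty (mmap cannot map an empty file).
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        self._encoding = encoding
        # One pass to record where each line starts; only the offsets stay in RAM.
        self._starts = [0]
        pos = self._map.find(b"\n")
        while pos != -1:
            self._starts.append(pos + 1)
            pos = self._map.find(b"\n", pos + 1)
        if self._starts[-1] == len(self._map):
            self._starts.pop()  # file ends with a newline: no extra empty line

    def __len__(self):
        return len(self._starts)

    def _line(self, i):
        start = self._starts[i]
        end = self._starts[i + 1] if i + 1 < len(self._starts) else len(self._map)
        return self._map[start:end].rstrip(b"\r\n").decode(self._encoding)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Only the requested range is read and decoded.
            return [self._line(i) for i in range(*index.indices(len(self)))]
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("line index out of range")
        return self._line(index)

    def close(self):
        self._map.close()
        self._file.close()
```

With this, `for line in lines[2:10]: ...` and `lines[5:10]` behave like they do on a normal list, while the file contents stay on disk. I do not know whether something like this already exists in Dask or elsewhere.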
For range access on the Dask Bag, I came up with the following. Is this the best approach?
def slice(self, pfrom, pto):
    assert self.bag is not None
    # take(pto) computes the first pto items as a tuple; then drop the first pfrom
    return self.bag.take(pto)[pfrom:]
However, this does not do what I want, because take() returns an already computed value rather than a lazy Bag.
Could this be a solution?