Thanks to David Beazley's slides on Generators I'm quite taken with using generators for data processing in order to keep memory consumption minimal. Now I'm working on my first kedro project, and my question is how I can use generators in kedro. When I have a node that yields a generator, and then run it with kedro run --node=example_node, I get the following error:
DataSetError: Failed while saving data to data set MemoryDataSet().
can't pickle generator objects
Am I supposed to always load all my data into memory when working with kedro?
Hi @ilja, to do this you may need to change the type of assignment operation that MemoryDataSet applies. In your catalog, declare your datasets explicitly and change the copy_mode to one of "copy" or "assign". I think "assign" may be your best bet here... https://kedro.readthedocs.io/en/stable/kedro.io.MemoryDataSet.html
I hope this works, but am not 100% sure.
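To illustrate why the copy mode matters, here is a minimal stdlib-only sketch. It does not use kedro, and the store function is a hypothetical stand-in for what an in-memory dataset does on save, not kedro's actual implementation. The read_records generator and the mode names in the loop are likewise just for illustration. The idea: deep-copying a generator fails with the same kind of pickle-related TypeError as in the question, while plain assignment passes the generator through untouched, so it stays lazy.

```python
import copy


def read_records():
    """Hypothetical node output: yields records lazily, one at a time."""
    for i in range(3):
        yield {"id": i}


def store(data, copy_mode):
    """Stand-in for an in-memory dataset's save step (an assumption,
    not kedro's real code): apply the requested copy strategy."""
    if copy_mode == "deepcopy":
        return copy.deepcopy(data)
    if copy_mode == "copy":
        return copy.copy(data)
    if copy_mode == "assign":
        return data  # keep a reference, no copying at all
    raise ValueError(f"unknown copy_mode: {copy_mode!r}")


results = {}
for mode in ("deepcopy", "copy", "assign"):
    try:
        stored = store(read_records(), mode)
        # Consume the generator only here, to show it still works.
        results[mode] = list(stored)
    except TypeError as exc:
        # Generators cannot be pickled, and the copy module relies on
        # the same machinery, so copying them raises TypeError.
        results[mode] = f"TypeError: {exc}"

for mode, outcome in results.items():
    print(mode, "->", outcome)
```

If this sketch reflects what happens on save, then "assign" is the mode to try, since copy.copy relies on the same pickling machinery as copy.deepcopy and is likely to reject a generator too. Keep in mind that a generator can only be consumed once, so a dataset holding one should feed a single downstream node.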