When using the singer.write_records() method, we pass a stream id as a string and a record as a dict, for example:
singer.write_records(
    stream_name="example_stream",
    records={"product1": "val1", "product2": "val2", "shop": "example_shop"}
)
But I have seen code where we pass a generator object to the records parameter instead of a dictionary, for example:
singer.write_records(
    stream_name="products",
    records=({**item, "shop": shop}
             for item in retrieve_products(shop))
)
Why is this possible? Where does the singer spec explicitly define which arguments write_records() can take? How does the method process the data passed to the records parameter? I've looked through the singer specification but couldn't find any definition of write_records(). I also tried running help(singer.write_records()) in the Python console, but the information printed wasn't helpful.
The singer spec is really just a set of guidelines for the structure of the messages (RECORD, SCHEMA, STATE) that a tap and a target exchange. It doesn't place any requirements on the code that sends those messages, and technically Python isn't even required. Check out Meltano's Singer spec docs for a better description of the spec. I believe the code you're referring to comes from singer-python, a common utility library (although I'd check out the Meltano Singer SDK for something more up to date). The write_records method looks like it just expects an iterable for the records parameter: a generator hands over one record per iteration, while your dictionary is iterated key by key, so each key is treated as a separate record rather than the whole dict being written as one record.
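To make the iteration point concrete, here is a minimal sketch that does not import singer at all. The functions write_record_sketch and write_records_sketch are hypothetical stand-ins that mimic what I believe singer-python does (loop over records and emit one RECORD message per item); retrieve_products is a made-up data source standing in for the one in the question. Treat it as an illustration under those assumptions, not the library's actual code.

```python
import json


def write_record_sketch(stream_name, record):
    # Hypothetical stand-in for singer.write_record:
    # emit one RECORD message as a single JSON line on stdout.
    line = json.dumps({"type": "RECORD", "stream": stream_name, "record": record})
    print(line)
    return line


def write_records_sketch(stream_name, records):
    # Assumed behaviour of singer.write_records: a plain loop over
    # whatever iterable is passed in (list, generator, dict, ...).
    return [write_record_sketch(stream_name, record) for record in records]


def retrieve_products(shop):
    # Made-up data source, only here so the example runs.
    yield {"id": 1, "name": "widget"}
    yield {"id": 2, "name": "gadget"}


shop = "example_shop"

# A generator is consumed lazily, producing one record per iteration.
from_generator = write_records_sketch(
    "products",
    ({**item, "shop": shop} for item in retrieve_products(shop)),
)

# Iterating a dict yields its keys, so each key becomes its own "record".
from_dict = write_records_sketch(
    "example_stream",
    {"product1": "val1", "shop": "example_shop"},
)
```

If the intent is to write a single dict as one record, wrapping it in a list (records=[my_dict]) gives the loop exactly one item to emit. Also note that help() wants the function object itself, so help(singer.write_records) without the trailing parentheses should show the docstring.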