I didn't realize how perilous such a simple task could be.
We're trying to stream-read a JSON file stored in S3--I think we have that part working.
Our .on('data') callback is getting called, but Node picks and chooses what bits it wants to run--seemingly at random.
We set up a stream reader.
stream.on('data', async x => {
  await saveToDb(x); // This doesn't block the stream. The handler runs synchronously up to the await, then more 'data' events fire before saveToDb resolves.
});
Sometimes the db call makes it to the db, but most of the time it doesn't. I've come to the conclusion that EventEmitter has problems with async/await event handlers. It seems to play along with your async handler as long as the code is synchronous, but at the point you await, it randomly decides whether to actually follow through or not.
It streams the various chunks and we can console.log them out and see the data. But as soon as we try to fire off an await/async call, we stop seeing reliable messages.
I'm running this in AWS Lambda, and I've been told there are special considerations because Lambda can freeze the execution environment once the handler returns?
I tried surrounding the await call in an IIFE, but that didn't work, either.
What am I missing? Is there no way of telling JavaScript--"Okay, I need you to run this async task synchronously. I mean it--don't go and fire off any more event notifications, either. Just sit here and wait."?
TL;DR:

Don't do async work inside an .on('data') handler. Consume the stream with an async iterator (for await...of) so each chunk is fully awaited before the next one is read.
Details:
The secret to life's mystery regarding async/await and streams appears to be wrapped up in Async Iterators! In short, I piped some streams together, and at the very end I created an async iterator to pull items out so that I could asynchronously call the db. The only thing ChunkStream does for me is queue up to 1,000 items and call the db once per batch instead of once per item. I'm new to queues, so there may already be a better way of doing that.
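For reference, here's a minimal sketch of that idea, not my exact code. processStream and saveBatch are names I made up (saveBatch stands in for the real db write, playing roughly the role ChunkStream's flush did for me):

```javascript
// Consume any readable stream with an async iterator, flushing batches
// of up to `batchSize` items. for await...of only pulls the next chunk
// after the loop body finishes, so every await is honored before more
// data arrives -- exactly what .on('data') would not do.
async function processStream(stream, saveBatch, batchSize = 1000) {
  let batch = [];
  for await (const item of stream) {
    batch.push(item);
    if (batch.length >= batchSize) {
      await saveBatch(batch);
      batch = [];
    }
  }
  if (batch.length > 0) await saveBatch(batch); // flush the final partial batch
}

// Example usage with an in-memory stream:
// const { Readable } = require('stream');
// await processStream(Readable.from(items), items => db.batchWrite(items), 1000);
```

Node readable streams have been async-iterable since Node 10, so the same loop works directly on the S3 object stream without a ChunkStream in the middle.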