How to achieve parallelism in Azure Stream Analytics


If my understanding is right, only one instance of a Stream Analytics job runs at a time, and the next set of events is pulled from the event hub only after it finishes with the current set. So processing is sequential.

If processing takes 20 milliseconds, other events have to wait that long. Will this sequential operation be sufficient under production load?

I am aware of the PARTITION BY clause, but since we are using IoT Hub, we cannot use partitionId/PartitionKey.

Thanks in advance.


There is 1 best solution below


All messages with the same deviceId are sent to the same partitionId. If your query only ever looks at one deviceId at a time, you can still use partitionId and process each partition independently. Examples are queries that only select and filter, and aggregates that include deviceId in the grouping key.
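To see why per-device queries can safely run partition by partition, here is a minimal Python sketch. It is only a simulation, not Stream Analytics code: `NUM_PARTITIONS`, `partition_for` and the CRC32 hash are assumptions made up for illustration, and IoT Hub's real partitioning function may differ. The point is just that any deterministic mapping from deviceId to partition keeps each device's events together.

```python
import zlib
from collections import Counter, defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count, for illustration only


def partition_for(device_id: str) -> int:
    # Stand-in hash (assumption): any deterministic function of deviceId
    # sends all events of one device to the same partition.
    return zlib.crc32(device_id.encode("utf-8")) % NUM_PARTITIONS


def split_by_partition(events):
    # Group events by the partition their deviceId maps to.
    partitions = defaultdict(list)
    for event in events:
        partitions[partition_for(event["deviceId"])].append(event)
    return partitions


def count_per_device(events):
    # An aggregate whose key includes deviceId.
    return Counter(event["deviceId"] for event in events)


events = [{"deviceId": d} for d in ["a", "b", "a", "c", "b", "a"]]
partitions = split_by_partition(events)

# Process every partition independently, then merge the results.
merged = Counter()
for partition_events in partitions.values():
    merged.update(count_per_device(partition_events))

# Same answer as running the aggregate over the unpartitioned stream.
assert merged == count_per_device(events)
```

Because no device spans two partitions, no cross-partition coordination is needed for this kind of query.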

If your queries look at multiple deviceIds at a time (for example, counting the total number of messages in a window), you have two options. You can compute partial aggregates in parallel first and then combine them to get the global aggregate. Or just use a query without PARTITION BY.
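The first option, partial aggregates followed by a combine step, can be sketched in Python as below. Again this is a simulation under assumptions, not Stream Analytics code: `partial_count`, `global_count` and the sample `window_partitions` are hypothetical names and data, and a thread pool stands in for the parallel partition processing.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_count(partition_events):
    # Step 1: each partition counts its own messages for the window.
    return len(partition_events)


def global_count(partitions):
    # Run the partial counts in parallel, one task per partition.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_count, partitions))
    # Step 2: a single cheap step combines the partial results.
    return sum(partials)


# Hypothetical messages of one window, already split across 4 partitions.
window_partitions = [["m1", "m2"], ["m3"], [], ["m4", "m5", "m6"]]
total = global_count(window_partitions)
```

This works for counts and sums because they can be combined from partial results. The combine step only sees one small value per partition, so it is rarely the bottleneck.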

Also, Azure Stream Analytics does not fetch messages one by one, so it does not incur the kind of per-event delay you describe in the question.