Azure Stream Analytics - CosmosDB Output contains multiple rows and just one row per partition key

I have an Azure Stream Analytics job with IoT Hub as the input and DocumentDB (Cosmos DB) as the output, and I frequently get the following warning:

Warning: CosmosDB Output contains multiple rows and just one row per partition key. If the output latency is higher than expected, consider choosing a partition key that contains at least several hundred records per partition key. For best performance, consider choosing the same partition key column for input and output.

I am using a partition key, and IoT Hub receives a large number of messages per second for the same partition key.

Best answer

First, this is only a warning; it will not cause your Stream Analytics job to fail.

It suggests that you reconsider the partition key design of your Cosmos DB collection.

A well-chosen partition key improves the performance of Cosmos DB.

Quoting from this article:

It is important to choose a partition key property that has a number of distinct values, and lets you distribute your workload evenly across these values. As a natural artifact of partitioning, requests involving the same partition key are limited by the maximum throughput of a single partition. Additionally, the storage size for documents belonging to the same partition key is limited to 10GB. An ideal partition key is one that appears frequently as a filter in your queries and has sufficient cardinality to ensure your solution is scalable.

A partition key is also the boundary for transactions in DocumentDB's stored procedures and triggers. You should choose the partition key so that documents that occur together in transactions share the same partition key value.
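
To make the cardinality and distribution criteria in the quoted passage more concrete, here is a minimal Python sketch that compares two candidate partition keys on a sample of documents. The field names (`deviceId`, `messageId`), the sample data, and the helper `partition_key_stats` are all hypothetical and only for illustration; this is not part of Stream Analytics or the Cosmos DB SDK.

```python
from collections import Counter


def partition_key_stats(documents, key):
    """Summarize how documents spread across values of a candidate partition key."""
    counts = Counter(doc[key] for doc in documents)
    total = sum(counts.values())
    if total == 0:
        # No documents: nothing to measure.
        return {"distinct_keys": 0, "avg_records_per_key": 0.0, "largest_key_share": 0.0}
    return {
        "distinct_keys": len(counts),                       # cardinality of the key
        "avg_records_per_key": total / len(counts),         # records behind each key value
        "largest_key_share": max(counts.values()) / total,  # skew: share held by the hottest key
    }


# Hypothetical sample of IoT Hub messages; field names are assumptions.
sample = [
    {"deviceId": f"device-{i % 4}", "messageId": f"msg-{i}", "temperature": 20 + i % 5}
    for i in range(1000)
]

by_device = partition_key_stats(sample, "deviceId")
by_message = partition_key_stats(sample, "messageId")
print("deviceId :", by_device)
print("messageId:", by_message)
```

In this sample, `messageId` gives exactly one record per key value, which is the pattern the warning describes, while `deviceId` gives a few hundred records per key value. Only four distinct values would be too few for a real workload, though, so in practice you would look for a key that has both many distinct values and an even spread.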

For guidance on designing the partition key, refer to this article.