I noticed an unusually large amount of data written to Cosmos DB from a Stream Analytics job on a particular day. The job was not supposed to write that many documents in a single day, so I need to check whether documents were duplicated on that day.
Is there a query, or any other way, to find duplicate records in Cosmos DB?
It is possible if you know which properties to check for duplicates. We had a nasty production issue that caused many duplicate records as well. When we contacted MS Support to help us identify the duplicate documents, they gave us the following query:
Bear in mind: properties A and B together define uniqueness in our case, so if two documents have the same values for A and B, they are duplicates. You can then use the output of this query to, for example, delete the oldest ones and keep the most recent (based on _ts).
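To illustrate the cleanup step, here is a minimal Python sketch of the "keep the most recent, delete the rest" logic. It is not the query from MS Support. It assumes you have already loaded the candidate documents into memory as dictionaries, for example with whatever query or SDK call you use. The function name `find_stale_duplicates` and the sample documents are hypothetical. `A` and `B` are the placeholder property names from above, and `_ts` is the last-modified timestamp that Cosmos DB stores on every document.

```python
from collections import defaultdict


def find_stale_duplicates(docs, key_props=("A", "B")):
    """Return the documents that are older duplicates of another document.

    Documents count as duplicates when all properties in key_props match.
    Within each duplicate group, the document with the highest _ts is kept
    and every other document is returned as a deletion candidate.
    """
    groups = defaultdict(list)
    for doc in docs:
        # Build the uniqueness key from the chosen properties.
        key = tuple(doc.get(prop) for prop in key_props)
        groups[key].append(doc)

    stale = []
    for same_key in groups.values():
        if len(same_key) > 1:
            # Newest first, so everything after index 0 is an older copy.
            newest_first = sorted(same_key, key=lambda d: d["_ts"], reverse=True)
            stale.extend(newest_first[1:])
    return stale


# Hypothetical sample data: ids 1, 2 and 4 share the same A and B.
docs = [
    {"id": "1", "A": "x", "B": 1, "_ts": 100},
    {"id": "2", "A": "x", "B": 1, "_ts": 200},
    {"id": "3", "A": "y", "B": 1, "_ts": 150},
    {"id": "4", "A": "x", "B": 1, "_ts": 50},
]

stale_ids = sorted(d["id"] for d in find_stale_duplicates(docs))
print(stale_ids)
```

The returned list contains only the older copies, so you can review it before deleting anything. For a large container, loading everything into memory may not be practical; in that case it is probably better to restrict the input to the affected day first.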