I have seen a huge amount of data written to Cosmos DB from a Stream Analytics job on a particular day. The job was not supposed to write that many documents in a single day, so I need to check whether documents were duplicated on that day.
Is there a query, or any other way, to find duplicate records in Cosmos DB?
Quick answer is YES. Please use the `DISTINCT` keyword in the Cosmos DB SQL query, and filter on `_ts` (the system-generated Unix timestamp, in seconds, of the item's last update: https://learn.microsoft.com/en-us/azure/cosmos-db/databases-containers-items#properties-of-an-item) to restrict the query to that particular day. Something like:
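A minimal sketch of that idea in Python, not a definitive implementation. It assumes the `azure-cosmos` SDK's `container.query_items` call, and that a duplicate means two documents sharing the same `deviceId` and `eventTime`; those two field names are hypothetical, so substitute whatever identifies a logical record in your data. Besides the `DISTINCT` query, it also groups the day's documents client-side so you get the ids of the extra copies, which `DISTINCT` alone does not return.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical key fields: replace with the properties that define "the same record".
KEY_FIELDS = ("deviceId", "eventTime")

# DISTINCT over the key fields: if this returns fewer rows than the total
# document count for the same window, the window contains duplicates.
DISTINCT_QUERY = (
    "SELECT DISTINCT c.deviceId, c.eventTime FROM c "
    "WHERE c._ts >= @start AND c._ts < @end"
)

# Same window, but returning ids so the extra copies can be located.
WINDOW_QUERY = (
    "SELECT c.id, c.deviceId, c.eventTime, c._ts FROM c "
    "WHERE c._ts >= @start AND c._ts < @end"
)


def day_bounds(year, month, day):
    """Return (start, end) Unix timestamps in seconds for one UTC day."""
    start = int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())
    return start, start + 86400


def fetch_window(container, query, start, end):
    """Run a parameterized query against an azure-cosmos container client."""
    return list(
        container.query_items(
            query=query,
            parameters=[
                {"name": "@start", "value": start},
                {"name": "@end", "value": end},
            ],
            enable_cross_partition_query=True,
        )
    )


def find_duplicates(docs, key_fields=KEY_FIELDS):
    """Group docs by key; keep the earliest of each group, return the rest."""
    groups = defaultdict(list)
    for doc in docs:
        groups[tuple(doc.get(f) for f in key_fields)].append(doc)
    extras = []
    for items in groups.values():
        items.sort(key=lambda d: d["_ts"])
        extras.extend(items[1:])  # everything after the first copy
    return extras
```

With a real container client you would call `fetch_window(container, WINDOW_QUERY, *day_bounds(y, m, d))` and pass the result to `find_duplicates`; the returned documents are the candidates for deletion. Keeping the earliest copy is a design choice made here for illustration, not something the source prescribes.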
Then you can delete the duplicate data using this bulk delete library sample: https://github.com/Azure/azure-cosmosdb-bulkexecutor-dotnet-getting-started/tree/master/BulkDeleteSample.