Pymongo Data Aggregation group by date and count of distinct column values for each date group

I have a collection in a MongoDB database that I am trying to aggregate using Python. The documents are in the following format:

[{'date': '01-01-2022', 'orderid': 1001},
 {'date': '01-01-2022', 'orderid': 1001},
 {'date': '01-01-2022', 'orderid': 1002},
 {'date': '02-01-2022', 'orderid': 1003},
 {'date': '02-01-2022', 'orderid': 1003},
 {'date': '02-01-2022', 'orderid': 1003},
 {'date': '02-01-2022', 'orderid': 1004},
 {'date': '02-01-2022', 'orderid': 1005},
 {'date': '03-01-2022', 'orderid': 1006},
 {'date': '03-01-2022', 'orderid': 1007}]

I want to group the data by the date field and get the count of unique orderid values for each date group. Expected output:

[{'date': '01-01-2022', 'count_orderid': 2},
 {'date': '02-01-2022', 'count_orderid': 3},
 {'date': '03-01-2022', 'count_orderid': 2}]

I have tried the following code and various combinations of it, but it raises an error.

agg_result= collection.aggregate(
    [{
    "$group" : 
        {"_id" : "$date"
         }}.distinct('orderid').length
    ])

rickhg12hs On BEST ANSWER

There currently isn't a "$group" accumulator operator that counts distinct values directly. You could write a custom "$accumulator" in JavaScript, but there's an easier way: use a "$group" stage followed by a "$project" stage. "$addToSet" collects the unique orderid values for each date, and "$size" then counts them.

db.collection.aggregate([
  {
    "$group": {
      "_id": "$date",
      // make array of all unique orderid in group
      "orderIds": {"$addToSet": "$orderid"}
    }
  },
  {
    "$project": {
      "_id": 0,
      "date": "$_id",
      // number of unique orderid in group is the size of the array
      "count_orderid": {"$size": "$orderIds"}
    }
  }
])

Try it on mongoplayground.net.
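
Since the question asks about PyMongo, here is a rough Python sketch of the same pipeline. The stages are just the shell pipeline written as Python dicts. The "$sort" stage, the function names, and the in-memory helper are my own additions for illustration and are not part of the pipeline above; the helper only mirrors what "$addToSet" and "$size" compute so the result can be checked without a server.

```python
from collections import defaultdict

# Same $group and $project stages as the shell pipeline, as Python dicts.
PIPELINE = [
    {"$group": {
        "_id": "$date",
        # set of all unique orderid values in the group
        "orderIds": {"$addToSet": "$orderid"},
    }},
    {"$project": {
        "_id": 0,
        "date": "$_id",
        # number of unique orderid values is the size of that array
        "count_orderid": {"$size": "$orderIds"},
    }},
    # Assumption (not in the original answer): $group does not guarantee
    # output order, so sort by date. The dates here are strings, so this
    # is a lexicographic sort, not a chronological one.
    {"$sort": {"date": 1}},
]


def count_unique_orders(collection):
    """Run the pipeline on a pymongo Collection (anything with .aggregate)."""
    return list(collection.aggregate(PIPELINE))


def count_unique_orders_in_memory(docs):
    """Plain-Python mirror of the pipeline, for checking without a server."""
    groups = defaultdict(set)
    for doc in docs:
        groups[doc["date"]].add(doc["orderid"])  # like $addToSet
    return [
        {"date": date, "count_orderid": len(groups[date])}  # like $size
        for date in sorted(groups)
    ]
```

With PyMongo installed and a server running, usage would look something like count_unique_orders(MongoClient()["mydb"]["orders"]), where "mydb" and "orders" are placeholder names for your own database and collection.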