I have a mongodb database with many users and one of the subdocuments I track is file uploads and their statuses through a review process. Every file upload will have an attachment status eventually. I want to be able to pull some metrics to get the total of the current statuses for each uploaded file. I started building an aggregate query that pulls the latest attachment subdocument status from each file uploaded and count them.
The data structure is as follows:
"userName": "johnDoe",
"email": "[email protected]",
"uploads" : [
{
"_id" : ObjectId("adh12451e0012ce9da0"),
"fileName" : "TestDoc.txt",
"fileType" : "text/plain",
"created" : ISODate("2021-01-06T15:26:14.166Z"),
"attachmentStatus" : [ ]
},
{
"_id" : ObjectId("5ff5d6c066cacc0012ed655a"),
"fileName" : "testerABC.txt",
"fileType" : "text/plain",
"created" : ISODate("2021-01-06T15:26:56.027Z"),
"attachmentStatus" : [
{
"_id" : ObjectId("60884f733f88bd00129b9ad4"),
"status" : "Uploaded",
"date" : ISODate("2021-04-22T02:23:00Z")
},
{
"_id" : ObjectId("60884f733f88bd00129b9ad5"),
"status" : "Processing",
"date" : ISODate("2021-04-26T04:54:00Z")
}
]
},
{
"_id" : ObjectId("6075c82a19fdcc0012f81907"),
"fileName" : "Test file.docx",
"fileType" : "application/word",
"created" : ISODate("2021-04-13T16:34:50.955Z"),
"attachmentStatus" : [
{
"_id" : ObjectId("72844f733f88bd11479b9ad7"),
"status" : "Uploaded",
"date" : ISODate("2021-04-23T03:42:00Z")
},
{
"_id" : ObjectId("724986d73f88bd00147c9wt8"),
"status" : "Completed",
"date" : ISODate("2021-04-24T01:37:00Z")
}
]
}
]
"userName": "janeDoe",
"email": "[email protected]",
"uploads" : [
{
"_id" : ObjectId("ej9784652h0012ce9da0"),
"fileName" : "myResume.txt",
"fileType" : "text/plain",
"created" : ISODate("2021-02-13T12:36:14.166Z"),
"attachmentStatus" : [
{
"_id" : ObjectId("15dhdf6f88bd00147c9wt8"),
"status" : "Completed",
"date" : ISODate("2021-04-24T01:37:00Z")
}
]
},
How can I pull the latest attachment status out for each file uploaded and then summarize the statuses?
I want something like this:
{ "status" : "Completed", "Count" : 2 }
{ "status" : "Processing", "Count" : 1 }
...
I get very close with this Aggregate query, but it will grab each and every status and not just the the single most current Status for each file. (one current status per file).
db.myDB.aggregate([
{
"$match" : {
"uploads.attachmentStatus": {
"$elemMatch": { "status": { "$exists": true } }
}
}
},
{ $unwind: "$uploads"},
{ $unwind: "$uploads.attachmentStatus"},
{
$sortByCount: "$uploads.attachmentStatus.status"
},
{
$project: {
_id:0,
status: "$_id",
Count: "$count"
}
}
]).pretty();
Any suggestions?
Demo - https://mongoplayground.net/p/zzOR9qhqny0
{ $sort: { "uploads.attachmentStatus.date": -1 } },
to get the latest 1st{ $group: { _id: "$uploads._id", status: { $first: "$uploads.attachmentStatus.status" } } }
Group the records by uploads._id and take the top status (which is the latest status after the sort by date).Query
Complete query