Note : I have included only a few documents in the output to keep the post short but illustrative.
The source collection :
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"Channel_Id" : 2,
"TweetId" : 15
},
"PostDate" : ISODate("2013-10-31T18:30:00Z")
}
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"Channel_Id" : 2,
"TweetId" : 16
},
"PostDate" : ISODate("2013-10-31T18:30:00Z")
}
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"Channel_Id" : 2,
"TweetId" : 17
},
"PostDate" : ISODate("2013-10-30T18:30:00Z")
}
Step-1 : Grouping by PostDate
Query :
db.Twitter_Processed.aggregate(
    { $match : { "_id.SpId" : 840, "_id.Scheduler_Id" : 1 } },
    { $project : {
        SpId : "$_id.SpId",
        Scheduler_Id : "$_id.Scheduler_Id",
        day : { $dayOfMonth : '$PostDate' },
        month : { $month : '$PostDate' },
        year : { $year : '$PostDate' },
        senti : "$Sentiment"
    } },
    { $group : {
        _id : { SpId : "$SpId", Scheduler_Id : "$Scheduler_Id", day : '$day', month : '$month', year : '$year' },
        sentiment : { $sum : "$senti" }
    } },
    { $group : { _id : "$_id", avgSentiment : { $avg : "$sentiment" } } }
)
Output :
{
"result" : [
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"day" : 31,
"month" : 10,
"year" : 2013
},
"avgSentiment" : 2.2700000000000005
},
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"day" : 30,
"month" : 10,
"year" : 2013
},
"avgSentiment" : 4.96
}
]
}
Step-2 : Attempting to achieve this :
{
"result" : [
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"Date" : ISODate("2013-10-31T18:30:00Z")
},
"avgSentiment" : 2.2700000000000005
},
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"Date" : ISODate("2013-10-31T18:30:00Z")
},
"avgSentiment" : 4.96
}
]
}
The query I attempted :
db.Twitter_Processed.aggregate(
    { $match : { "_id.SpId" : 840, "_id.Scheduler_Id" : 1 } },
    { $project : {
        SpId : "$_id.SpId",
        Scheduler_Id : "$_id.Scheduler_Id",
        day : { $dayOfMonth : '$PostDate' },
        month : { $month : '$PostDate' },
        year : { $year : '$PostDate' },
        senti : "$Sentiment"
    } },
    { $group : {
        _id : { SpId : "$SpId", Scheduler_Id : "$Scheduler_Id", day : '$day', month : '$month', year : '$year' },
        sentiment : { $sum : "$senti" }
    } },
    { $group : { _id : "$_id", avgSentiment : { $avg : "$sentiment" } } },
    { $project : {
        _id : { SpId : "$_id.SpId", Scheduler_Id : "$_id.Scheduler_Id", date : new Date("$_id.year","$_id.month","$_id.day") },
        avgSentiment : "$avgSentiment"
    } }
)
Output(error) :
Error: Printing Stack Trace
at printStackTrace (src/mongo/shell/utils.js:37:15)
at DBCollection.aggregate (src/mongo/shell/collection.js:897:9)
at (shell):1:22
Tue Dec 31 09:41:42.916 JavaScript execution failed: aggregate failed: {
"errmsg" : "exception: disallowed field type Date in object expression (at 'date')",
"code" : 15992,
"ok" : 0
} at src/mongo/shell/collection.js:L898
How do I achieve Step-2 ?
As you've noticed, the Aggregation Framework (as at MongoDB 2.4) has operators to extract parts of dates but not to easily create date fields.
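It may help to see why the attempted $project fails. The following is a minimal sketch in plain JavaScript (the variable names are mine, purely for illustration): the shell evaluates new Date(...) on the client before the pipeline is ever sent, so "$_id.year" and friends are ordinary strings at that point, not field references. They coerce to NaN, the result is an invalid Date, and the server is handed a literal Date value inside the object expression, which appears to be what error 15992 is complaining about.

```javascript
// Evaluated locally by the JavaScript engine, exactly as written in the pipeline.
// The "$_id.*" strings are NOT resolved against any document here.
var clientSideDate = new Date("$_id.year", "$_id.month", "$_id.day");

// The value is a genuine Date object, so the server receives a Date literal
// rather than an expression it could evaluate per document.
var isDateObject = clientSideDate instanceof Date;   // true

// Each string argument is coerced to NaN, so the Date itself is invalid.
var isInvalid = isNaN(clientSideDate.getTime());     // true
```

So the problem is not the date arithmetic itself: any date value has to be produced by aggregation operators running on the server, not by a JavaScript constructor in the shell.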
There's a great blog post on Stupid date tricks with Aggregation Framework that provides a creative workaround: truncate the date granularity using $project before you $group: