I have scheduled tasks (parents) that start in the evening and each run a number of jobs (children), which can be successful or not. I want to know the success rate for each executed task.
Each document in the collection jobs_per_task_per_day should look like this:
{
"_id" : {
"task" : ObjectId("52b31753d925d5b057000001"),
"year" : 2013,
"month" : 12,
"day" : 22
},
"successful" : 12,
"unsuccessful" : 0,
"task" : ObjectId("52b31753d925d5b057000001")
}
However, my current aggregation script groups by calendar date. This is inaccurate, since one execution of a task can last the whole night: a job started at 2014-01-02 02:00 is part of the task run that started on 2014-01-01, not 2014-01-02.
If a job ran after midnight but before 12:00 noon the next day, it should be counted in the document of the day before. The task itself does not have a date, since it is executed repeatedly with cron.
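To make the rule concrete, here is a minimal sketch in plain JavaScript (not part of my script). The helper name taskDay is hypothetical, and it assumes all timestamps are handled in UTC. It shifts the start time back by 12 hours, so anything before 12:00 falls on the previous day, including across month and year boundaries.

```javascript
// Hypothetical helper, only to illustrate the rule described above.
var TWELVE_HOURS_MS = 12 * 60 * 60 * 1000;

function taskDay(startedAt) {
  // Shift back 12 hours: a start time before 12:00 UTC lands on the previous day.
  var shifted = new Date(startedAt.getTime() - TWELVE_HOURS_MS);
  return {
    year: shifted.getUTCFullYear(),
    month: shifted.getUTCMonth() + 1, // getUTCMonth() is zero-based
    day: shifted.getUTCDate()
  };
}

// A job started at 02:00 on January 2nd belongs to the run of January 1st.
var example = taskDay(new Date("2014-01-02T02:00:00Z"));
// example is { year: 2014, month: 1, day: 1 }
```

Shifting the whole timestamp rather than subtracting 1 from the day number avoids a "day 0" result on the first day of a month.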
A Job:
{
"_id" : ObjectId("52b3751dd925d507db013f0e"),
"started_at" : ISODate("2013-12-19T22:37:17.900Z"),
"successful" : true,
"task_id" : ObjectId("52b31f8bd925d5b057000005")
}
A scheduled task:
{
"_id" : ObjectId("52b31f8bd925d5b057000005"),
"schedule" : {
"_id" : ObjectId("52b31784d925d5b057000002"),
"cron" : "30 21 * * *",
"name" : "Foo Import",
"worker_class" : "TaskWorker"
}
}
My current script:
var scheduled_tasks = db.tasks.find({
schedule: {
$exists: true
}
}).toArray()
scheduled_tasks.forEach(function(task) {
var results_total = db.jobs.aggregate({
$match: {
task_id: task._id
}
},
{
$group: {
_id: {
year: {
$year: "$started_at"
},
month: {
$month: "$started_at"
},
day: {
$dayOfMonth: "$started_at"
},
task: "$task_id"
},
successful: {
$sum: {
// count 1 for every successful job
$cond: ["$successful", 1, 0]
}
},
unsuccessful: {
$sum: {
// count 1 for every failed job
$cond: ["$successful", 0, 1]
}
},
task: {
$first: "$task_id"
},
}
}, {
$project: {
successful: 1,
unsuccessful: 1,
task: 1
}
}, {
$sort: {
"_id.year": -1,
"_id.month": -1,
"_id.day": -1
}
}
);
db.jobs_per_task_per_day.insert(results_total.result)
});
To group on a timespan instead of the calendar date, I tried changing the $group stage to:
$group: {
_id: {
task: "$task_id",
year: { $year: "$started_at" },
month: { $month: "$started_at" },
day: {
$cond: [
{$lt: [{$hour: "$started_at"}, 12]},
{$dayOfMonth: "$started_at"} -1,
{$dayOfMonth: "$started_at"}
]
}
}
However, this did not return the desired result, and I doubt whether this logic belongs in the $group operator at all.
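One direction I am considering is to shift the timestamp in a separate $project stage and group on the shifted date. This is an untested sketch: it assumes that $subtract accepts a date and a number of milliseconds and returns a date in my MongoDB version, and the field name task_date is made up for illustration.

```javascript
var TWELVE_HOURS_MS = 12 * 60 * 60 * 1000;

// Stage 1: compute a shifted date (assumption: $subtract on a date and a
// millisecond count yields a date). task_date is a hypothetical field name.
var shiftStage = {
  $project: {
    task_id: 1,
    successful: 1,
    task_date: { $subtract: ["$started_at", TWELVE_HOURS_MS] }
  }
};

// Stage 2: group on the shifted date, so jobs started before 12:00
// are counted with the previous day.
var groupStage = {
  $group: {
    _id: {
      task: "$task_id",
      year: { $year: "$task_date" },
      month: { $month: "$task_date" },
      day: { $dayOfMonth: "$task_date" }
    },
    successful: { $sum: { $cond: ["$successful", 1, 0] } },   // 1 per successful job
    unsuccessful: { $sum: { $cond: ["$successful", 0, 1] } }, // 1 per failed job
    task: { $first: "$task_id" }
  }
};

var pipeline = [shiftStage, groupStage];
```

The stages in pipeline would go after the $match stage in the db.jobs.aggregate call, in place of the current $group.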
Any help on this would really be appreciated!