Yesterday I wrote about aggregating events over time in MongoDB, and it was pointed out to me that my example code only finds sequences of events whose order is strictly linear. In other words, the actor cannot skip from action1 to action3, so I didn't have to worry about detecting sequences like 'a1', 'a3', 'a2', 'a3', where the first 'a3' comes before 'a2'. There were other sequences that my aggregation framework code wouldn't catch, mostly because it was initially tailored to certain types of events that could only progress in a specific order.
But what if we want to see if there is an arbitrary sequence anywhere in a chain of actions that goes 'a1', 'a2', 'a3' regardless of what is between them? Here's an updated pipeline that finds exactly that.
We'll assume the same collection we had yesterday and the same action array ["a1", "a2", "a3"], and we will again generate our pipeline programmatically. This time we'll simplify it slightly and not worry for the moment about counting the number of occurrences of each action or the time taken between actions, since we saw how to do that yesterday. Instead we'll make sure our 1then2 and 1then2then3 counts are correct no matter what order the events actually arrive in.
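To make the goal concrete before building the pipeline, here is a minimal sketch in plain JavaScript (outside MongoDB) of what "a1, then a2, then a3, with anything in between" means for one person's list of actions. The helper name containsInOrder and the sample arrays are hypothetical and only illustrate the condition; this is not how the aggregation pipeline computes it.

```javascript
// Hypothetical helper, not part of the pipeline: returns true when every
// element of `wanted` appears in `events` in the same order, with any
// number of other events allowed in between.
function containsInOrder(events, wanted) {
    var next = 0; // index of the next action we are still waiting for
    for (var i = 0; i < events.length && next < wanted.length; i++) {
        if (events[i] === wanted[next]) {
            next++;
        }
    }
    return next === wanted.length;
}

var actions = ["a1", "a2", "a3"];
// The actor skipped ahead to 'a3' early, but a1 -> a2 -> a3 still occurs in order.
var skippedAhead = containsInOrder(["a1", "a3", "a2", "a3"], actions);
// All three actions are present, but never in the wanted order.
var neverInOrder = containsInOrder(["a3", "a2", "a1"], actions);
```

Here skippedAhead is true and neverInOrder is false, which is the distinction the 1then2then3 count needs to capture.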
// Sentinel dates, assumed to fall outside the range of real event timestamps.
var never = ISODate("1970-01-01");
var ever = ISODate("2035-01-01");
// Keep only events for the actions we care about.
var match = { "$match" : { "action_id" : { "$in" : actions } } };
var projectActions = { "$project" : { "p" : "$p_id" } };
var groupByPerson = { "$group" : { "_id" : "$p" } };
actions.forEach( function(act) {
    // <act>.ts is the event's timestamp when the event is this action, otherwise the sentinel "ever".
    projectActions["$project"][act] = { };
    projectActions["$project"][act]["ts"] = { "$cond" : [ { "$eq" : [ "$action_id", act ] }, "$ts", ever ] };
    var first = act + "first";
    var last = act + "last";
    var all = act + "all";
    // Per person: the smallest, the largest, and the full list of <act>.ts values.
    groupByPerson["$group"][first] = { "$min" : "$" + act + ".ts" };
    groupByPerson["$group"][last] = { "$max" : "$" + act + ".ts" };
    groupByPerson["$group"][all] = { "$push" : "$" + act + ".ts" };
});
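To see what the loop generates, the following sketch rebuilds the two stage documents in plain JavaScript so they can be printed and inspected without a database. It assumes a Node.js environment, so the shell helper ISODate is replaced by a plain Date as a stand-in; everything else mirrors the loop above.

```javascript
// Stand-in for the shell's ISODate("2035-01-01"); an assumption for plain Node.js.
var ever = new Date("2035-01-01T00:00:00Z");
var actions = ["a1", "a2", "a3"];

var projectActions = { "$project": { "p": "$p_id" } };
var groupByPerson = { "$group": { "_id": "$p" } };
actions.forEach(function (act) {
    // One sub-document per action, holding a conditional timestamp.
    projectActions["$project"][act] = {
        "ts": { "$cond": [ { "$eq": [ "$action_id", act ] }, "$ts", ever ] }
    };
    // Three accumulators per action, named <act>first, <act>last and <act>all.
    groupByPerson["$group"][act + "first"] = { "$min": "$" + act + ".ts" };
    groupByPerson["$group"][act + "last"] = { "$max": "$" + act + ".ts" };
    groupByPerson["$group"][act + "all"] = { "$push": "$" + act + ".ts" };
});

// Field names of the generated $group stage: "_id" plus three per action.
var groupFields = Object.keys(groupByPerson["$group"]);
console.log(JSON.stringify(groupByPerson, null, 2));
```

For the three actions this yields a $group stage with ten fields: _id, followed by a1first, a1last, a1all and the same triple for a2 and a3, each pointing at the matching "$a1.ts"-style path from the $project stage.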