<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" >

<channel><title><![CDATA[Asya's Collection of Random Stuff - Stupid Tricks with MongoDB]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb]]></link><description><![CDATA[Stupid Tricks with MongoDB]]></description><pubDate>Sat, 13 Sep 2025 01:36:10 -0700</pubDate><generator>Weebly</generator><item><title><![CDATA[Comparing Subdocuments In MongoDB Expressions]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/comparing-subdocuments-in-mongodb-expressions]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/comparing-subdocuments-in-mongodb-expressions#comments]]></comments><pubDate>Thu, 16 Jan 2020 19:01:35 GMT</pubDate><category><![CDATA[Aggregation]]></category><guid isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/comparing-subdocuments-in-mongodb-expressions</guid><description><![CDATA[A coworker recently asked how to efficiently determine that two subdocuments are "equal".&nbsp; The issue of course is that in normal MongoDB query language semantics, if you just say:&nbsp;{$eq: [ {"a":1, "b":2},  {"b":2, "a":1} ] }the result is false because the two subdocuments are not "equal".&nbsp; So how do you determine if two subdocuments are logically equal (without regard to field order) or whether all subdocuments in the collection or a group are logically equal?&nbsp; &nbsp;The chall [...] 
]]></description><content:encoded><![CDATA[<div class="paragraph">A coworker recently asked how to efficiently determine that two subdocuments are "equal".&nbsp; The issue of course is that in normal MongoDB query language semantics, if you just say:&nbsp;</div><div><div id="529021758919182184" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>{$eq: [ {"a":1, "b":2},  {"b":2, "a":1} ] }</pre></div></div><div class="paragraph">the result is false because the two subdocuments are not "equal".&nbsp; So how do you determine if two subdocuments are logically equal (without regard to field order) or whether all subdocuments in the collection or a group are logically equal?&nbsp; &nbsp;<br><br>The challenge the coworker was working on was comparing index definitions across multiple shards.&nbsp; There are several parts to the index definition.&nbsp; The key part very much depends on the order of the fields - an index on {"a":1, "b":1} is not the same thing as the index on {"b":1, "a":1}.&nbsp; &nbsp;<br><br>However the options on the index are not order dependent, if the specification part of the index is {"unique":true, "sparse":true} it has exactly the same effect as if it's {"sparse":true, "unique":true}.<br><br>Here are a couple of functions to the rescue.&nbsp; The first one does a comparison of two objects and considers them equal if they have the same top level fields with the same values.&nbsp; &nbsp;The second one will "normalize" an object so that no matter what order the fields are in, they will be in alphabetical order in the result document.</div><div><div id="243608038350220532" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>var unorderedEq = function(o1, o2) {    return {$eq: [       {$arrayToObject:{$setUnion:[{$objectToArray:o1}]}},       {$arrayToObject:{$setUnion:[{$objectToArray:o2}]}}    ] };};var normalize = function(o) {    return {"$arrayToObject" : {"$setUnion" : [ {"$objectToArray" : 
o}]}}}</pre></div></div><div class="paragraph">Check out lots of other useful functions in my github repo here:&nbsp;<a href="https://github.com/asya999/bits-n-pieces/tree/master/scripts">https://github.com/asya999/bits-n-pieces/tree/master/scripts</a>&nbsp;and try them out!</div>]]></content:encoded></item><item><title><![CDATA[How to Convert Epoch to ISODate]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/how-to-convert-epoch-to-isodate]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/how-to-convert-epoch-to-isodate#comments]]></comments><pubDate>Thu, 31 Oct 2019 21:16:22 GMT</pubDate><category><![CDATA[Aggregation]]></category><category><![CDATA[Dates]]></category><guid isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/how-to-convert-epoch-to-isodate</guid><description><![CDATA[The great thing about recent versions of MongoDB is that they added a lot of new expressions for handling different types in aggregation.&nbsp; The not so great thing is sometimes you still have to get things done in an older version that doesn't have the same capabilities.A colleague asked me today how you can convert an epoch (number of milliseconds since 1970/1/1) to a proper ISODate format.&nbsp; The answer comes from date math:db.coll.aggregate({$addFields:{&nbsp; &nbsp; &nbsp;&nbsp;date:{$ [...] 
]]></description><content:encoded><![CDATA[<div class="paragraph">The great thing about recent versions of MongoDB is that they added a lot of new expressions for handling different types in aggregation.&nbsp; The not so great thing is sometimes you still have to get things done in an older version that doesn't have the same capabilities.<br /><br />A colleague asked me today how you can convert an epoch (number of milliseconds since 1970/1/1) to a proper ISODate format.&nbsp; The answer comes from date math:<br /><br />db.coll.aggregate({$addFields:{<br />&nbsp; &nbsp; &nbsp;&nbsp;date:{$add:[<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;ISODate("1970-01-01T00:00:00Z"), <br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;"$epoch"<br />&nbsp; &nbsp; &nbsp;&nbsp;]}<br />}})<br /><br />Because $add supports adding numbers to date (treating the number as milliseconds) we get back a proper ISODate that the number in "epoch" field represents!<br /></div>]]></content:encoded></item><item><title><![CDATA[How to get k-combinations from array in aggregation]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/how-to-get-k-combinations-from-array-in-aggregation]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/how-to-get-k-combinations-from-array-in-aggregation#comments]]></comments><pubDate>Wed, 11 Sep 2019 17:09:34 GMT</pubDate><category><![CDATA[Aggregation]]></category><guid isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/how-to-get-k-combinations-from-array-in-aggregation</guid><description><![CDATA[A colleague asked me if it's possible to generate all combinations of 2 items for a given array using aggregation pipeline expression.&nbsp; In other words, something that gives this:&gt; db.combos.find(){_id:1, a:[ 1,2,3]}&gt; db.combos.aggregate({$project:{_id:0, pairs: &lt;generate-all-pairs&gt;}}){pairs: [  [1,2], [1,3], [2,3] ]}I'm always game to challenge aggregation so here is what I came up 
with for the &lt;generate&gt; expression for k=2:{$reduce:{   input:{$range:[0,{$size:"$a"}]},     [...] ]]></description><content:encoded><![CDATA[<div class="paragraph">A colleague asked me if it's possible to generate <a href="https://en.wikipedia.org/wiki/Combination" target="_blank">all combinations</a> of 2 items for a given array using aggregation pipeline expression.&nbsp; In other words, something that gives this:<br></div><div><div id="746190419309493222" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>&gt; db.combos.find(){_id:1, a:[ 1,2,3]}&gt; db.combos.aggregate({$project:{_id:0, pairs: &lt;generate-all-pairs&gt;}}){pairs: [  [1,2], [1,3], [2,3] ]}</pre></div></div><div class="paragraph"><span style="color:rgb(0, 51, 0)">I'm always game to challenge aggregation so here is what I came up with for the &lt;generate&gt; expression for k=2:</span></div><div><div id="503937947309753537" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>{$reduce:{   input:{$range:[0,{$size:"$a"}]},    initialValue:[],    in:{$concatArrays:[       "$$value",       {$let:{         vars:{i:"$$this"},         in:{$map:{            input:{$range:[{$add:[1,"$$i"]},{$size:"$a"}]},            in:[ {$arrayElemAt:["$a","$$i"]}, {$arrayElemAt:["$a","$$this"]}] }}      }}   ]}}}</pre></div></div><div class="paragraph">This is the aggregation equivalent of looping over the array elements and for each one looping over the remaining array elements to create the pairs.&nbsp; If the array is `a`, it's:</div><div><div id="678425512788329831" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>pairs = [];for (i=0; i&lt;a.length; i++) {      for (j=i+1; j&lt;a.length; j++) {            pairs.push([ a[i], a[j] ]);      }}</pre></div></div><div class="paragraph">Aggregation expressions are pretty powerful.&nbsp; I gave a talk about the power of expressions over arrays at MongoDB World and local events in 
2017/2018:&nbsp; if you missed it watch it <a href="https://www.mongodb.com/presentations/pipeline-power-doing-more-with-mongodb-aggregation-asya-kamsky" target="_blank">HERE</a>.</div>]]></content:encoded></item><item><title><![CDATA[More Truncation of Dates in Aggregation]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/more-truncation-of-dates-in-aggregation]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/more-truncation-of-dates-in-aggregation#comments]]></comments><pubDate>Tue, 23 Jan 2018 22:13:57 GMT</pubDate><category><![CDATA[Aggregation]]></category><category><![CDATA[Dates]]></category><guid isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/more-truncation-of-dates-in-aggregation</guid><description><![CDATA[&nbsp;A long time ago, I wrote about how to convert an ISODate() that's got time into just a date (meaning zeroing out the hours/minutes/seconds/milliseconds).&nbsp; &nbsp;We did it by subtracting from the date the number of milliseconds since midnight which we calculated using some date math and $hour, $minute, $second and $millisecond expressions.As of 3.6 there's a slightly simpler way to achieve the same thing using the "$dateFromParts" expression.​Here's an example:&gt; db.dates.find({},{ [...] 
]]></description><content:encoded><![CDATA[<div class="paragraph">&nbsp;A long time ago, <a href="http://www.kamsky.org/stupid-tricks-with-mongodb/stupid-date-tricks-with-aggregation-framework" target="_blank">I wrote about how to convert an ISODate()</a> that's got time into just a date (meaning zeroing out the hours/minutes/seconds/milliseconds).&nbsp; &nbsp;We did it by subtracting from the date the number of milliseconds since midnight which we calculated using some date math and $hour, $minute, $second and $millisecond expressions.<br><br>As of 3.6 there's a slightly simpler way to achieve the same thing using the <a href="https://docs.mongodb.com/manual/reference/operator/aggregation/dateFromParts/" target="_blank">"$dateFromParts" expression</a>.<br><br>&#8203;Here's an example:<br></div><div><div id="757887501134247471" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>&gt; db.dates.find({},{_id:0}){ "sys_created_on" : ISODate("2012-02-18T03:04:49Z") }{ "sys_created_on" : ISODate("2012-02-18T03:04:49Z") }{ "sys_created_on" : ISODate("2012-02-18T03:04:49Z") }{ "sys_created_on" : ISODate("2012-02-10T03:04:49Z") }{ "sys_created_on" : ISODate("2012-02-28T03:04:49Z") }{ "sys_created_on" : ISODate("2012-03-18T03:04:49Z") }&gt; db.dates.aggregate({$project:{_id:0, roundDate:{$dateFromParts:{     year:{$year:"$sys_created_on"},     month:{$month:"$sys_created_on"},     day:{$dayOfMonth:"$sys_created_on"}}}}}){ "roundDate" : ISODate("2012-02-18T00:00:00Z") }{ "roundDate" : ISODate("2012-02-18T00:00:00Z") }{ "roundDate" : ISODate("2012-02-18T00:00:00Z") }{ "roundDate" : ISODate("2012-02-10T00:00:00Z") }{ "roundDate" : ISODate("2012-02-28T00:00:00Z") }{ "roundDate" : ISODate("2012-03-18T00:00:00Z") }</pre></div></div>]]></content:encoded></item><item><title><![CDATA[Aggregation Helper Functions: 
lpad]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/aggregation-helper-functions-lpad]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/aggregation-helper-functions-lpad#comments]]></comments><pubDate>Thu, 07 Dec 2017 17:54:48 GMT</pubDate><category><![CDATA[Aggregation]]></category><category><![CDATA[Dates]]></category><guid isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/aggregation-helper-functions-lpad</guid><description><![CDATA[MongoDB aggregation provides quite a few string manipulation functions, but there are many that it doesn't provide (yet), but we can express them ourselves using existing string expressions.Today's example is 'lpad' - given a string, desired length and a character to pad with, return a string that is at least that length and if it was shorter then pad it on the front (i.e. left side) with provided pad character (by default we will use space to pad with).lpad = function (str, len, padstr=" ") {   [...] ]]></description><content:encoded><![CDATA[<div class="paragraph"><span style="color:rgb(0, 51, 0)">MongoDB aggregation provides quite a few string manipulation functions, but there are many that it doesn't provide (yet), but we can express them ourselves using existing string expressions.</span><br><br><span style="color:rgb(0, 51, 0)">Today's example is 'lpad' - given a string, desired length and a character to pad with, return a string that is at least that length and if it was shorter then pad it on the front (i.e. 
left side) with provided pad character (by default we will use space to pad with).</span></div><div><div id="588548896445686013" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>lpad = function (str, len, padstr=" ") {      var redExpr={$reduce:{        input:{$range:[0,{$subtract:[len, {$strLenCP:str}]}]},        initialValue:"",        in:{$concat:["$$value",padstr]}}};      return {$cond:{        if:{$gte:[{$strLenCP:str},len]},        then:str,        else:{$concat:[ redExpr, str]}      }};}</pre></div></div><div class="paragraph"><span style="color:rgb(0, 51, 0)">To test the function, let's look at converting one date format to another:</span></div><div><div id="115670978399038735" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>db.d2s.aggregate({$project:{_id:0}}){ "d" : "1/4/2017" }{ "d" : "1/14/2017" }{ "d" : "11/8/2017" }{ "d" : "09/6/2017" }</pre></div></div><div class="paragraph"><span style="color:rgb(0, 51, 0)">To convert to "YYYY-MM-DD" we can use&nbsp;</span><a href="https://docs.mongodb.com/manual/reference/operator/aggregation/split/index.html" target="_blank">$split</a><span style="color:rgb(0, 51, 0)">&nbsp;and&nbsp;</span><a href="https://docs.mongodb.com/manual/reference/operator/aggregation/arrayElemAt/index.html" target="_blank">$arrayElemAt</a><span style="color:rgb(0, 51, 0)">&nbsp;expressions:</span></div><div><div id="746419768225999355" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>db.d2s.aggregate({$project:{_id:0, dt:{$let:{    vars:{parts:{$split:["$d","/"]}},    in:{$concat:[        {$arrayElemAt:["$$parts",2]},'-',        {$arrayElemAt:["$$parts",0]} ,'-',        {$arrayElemAt:["$$parts",1]}    ]}}}}}){ "dt" : "2017-1-4" }{ "dt" : "2017-1-14" }{ "dt" : "2017-11-8" }{ "dt" : "2017-09-6" }</pre></div></div><div class="paragraph"><span style="color:rgb(0, 51, 0)">But to get it to look right, we want to pad single digit days and months with '0' 
and we can use our function for this:</span></div><div><div id="509937735171250603" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>db.d2s.aggregate({$project:{_id:0, dt:{$let:{    vars:{parts:{$split:["$d","/"]}},    in:{$concat:[        {$arrayElemAt:["$$parts",2]},'-',        lpad({$arrayElemAt:["$$parts",0]},2,"0") ,'-',        lpad({$arrayElemAt:["$$parts",1]},2,"0")    ]}}}}}){ "dt" : "2017-01-04" }{ "dt" : "2017-01-14" }{ "dt" : "2017-11-08" }{ "dt" : "2017-09-06" }</pre></div></div><div class="paragraph"><span style="color:rgb(0, 51, 0)">Great news is that in 3.6 (</span><a href="https://www.mongodb.com/blog/post/announcing-mongodb-36" target="_blank">out earlier this week</a><span style="color:rgb(0, 51, 0)">!) you can take advantage of some great new&nbsp;</span><a href="https://docs.mongodb.com/master/release-notes/3.6/#new-aggregation-operators" target="_blank">date expressions</a><span style="color:rgb(0, 51, 0)">&nbsp;to avoid all this extra work like this:</span><br></div><div><div id="897844182231607500" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>db.d2s.aggregate({$project:{_id:0, dt:{    $dateToString:{        format:'%Y-%m-%d',        date:{$dateFromString:{dateString:"$d"}}    }}}}){ "dt" : "2017-01-04" }{ "dt" : "2017-01-14" }{ "dt" : "2017-11-08" }{ "dt" : "2017-09-06" }</pre></div></div><div class="paragraph"><a href="https://docs.mongodb.com/manual/reference/operator/aggregation/dateFromString/" target="_blank">$dateFromString</a> is new and while it does not take a format specifier, it can handle just about every format of date string I tried to throw at it!<br><br>Luckily we still can use <font color="#2A2A2A">lpad</font> helper when we want to line up string columns.</div>]]></content:encoded></item><item><title><![CDATA[Rank and Dense Rank in 
Aggregation]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/rank-and-dense-rank-in-aggregation]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/rank-and-dense-rank-in-aggregation#comments]]></comments><pubDate>Wed, 19 Jul 2017 19:28:41 GMT</pubDate><category><![CDATA[Aggregation]]></category><guid isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/rank-and-dense-rank-in-aggregation</guid><description><![CDATA[Earlier today someone asked me if it was possible to do dense ranking using aggregation framework. &nbsp;If you need a reminder of rank vs dense rank (I did) rank is the one that ranks sequentially making ties have the same rank but then skipping the rank that would have been used if there was no tie. &nbsp;So if the values we have are [ 100, 96, 96, 25, 25, 1 ] then ranks would be [ 1, 2, 2, 4, 4, 6]. &nbsp; Dense rank will also &nbsp;give ties the same rank, but it doesn't skip any "position"  [...] ]]></description><content:encoded><![CDATA[<div class="paragraph">Earlier today someone asked me if it was possible to do dense ranking using aggregation framework. &nbsp;If you need a reminder of rank vs dense rank (I did) rank is the one that ranks sequentially making ties have the same rank but then skipping the rank that would have been used if there was no tie. &nbsp;So if the values we have are [ 100, 96, 96, 25, 25, 1 ] then ranks would be [ 1, 2, 2, 4, 4, 6]. &nbsp; Dense rank will also &nbsp;give ties the same rank, but it doesn't skip any "position" so the dense ranks for the same set would be: [ 1, 2, 2, 3, 3, 4 ].<br><br>Since ranking is done within specific "grouping", you're probably not surprised that $group stage is going to be involved. &nbsp; To get the scores (whatever you want to rank by) in order you can use $sort on that field first (assuming you have an index to support it) or you can $group with $push first and then sort the array in each document. 
&nbsp;Then you need to do ranking. &nbsp; Because it's a pretty complex expression, I created a helper function that generates it based on appropriate inputs:<br></div><div><div id="370925262893806958" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"></div></div><div class="paragraph">That's it! &nbsp;Now that I have that function, I can pass in my array of objects, specifying which field is being used for ranking, and whether or not I want dense ranking or regular ranking.</div><div><div id="398919666427672915" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"></div></div><div class="paragraph">The most sharp-eyed of you may have noticed that I must be running this in the latest 3.5 development release, because I used the "$mergeObjects" expression in my rankArray function. &nbsp;You can simulate the functionality of "$mergeObjects" by using $objectToArray, $concatArrays and $arrayToObject to do exactly the same thing in 3.4.4 or later, or if you are on an earlier version, instead of $mergeObjects of "$$this" and {"rank": "$$rank"} you can write the new object yourself explicitly:<br>&nbsp;&nbsp; &nbsp; {<br>&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;"emp": "$$this.emp",<br>&nbsp; &nbsp; &nbsp; &nbsp; "sal" &nbsp; : "$$this.sal",<br>&nbsp; &nbsp; &nbsp; &nbsp; "rank": "$$rank"<br>&nbsp; &nbsp; &nbsp;}<br>&#8203;<br>&#8203;$mergeObjects is only one of many great enhancements coming to MongoDB 3.6.&nbsp;<br></div>]]></content:encoded></item><item><title><![CDATA[How to Match a Strict Subset of an Array in Order]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/how-to-match-a-strict-subset-of-an-array-in-order]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/how-to-match-a-strict-subset-of-an-array-in-order#comments]]></comments><pubDate>Thu, 29 Jun 2017 16:44:27 GMT</pubDate><category><![CDATA[Aggregation]]></category><guid 
isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/how-to-match-a-strict-subset-of-an-array-in-order</guid><description><![CDATA[While reviewing an old jira case for MongoDB that asked for a way to query for a strict subset of an array, I realized this can very easily be done in aggregation. &nbsp; Since I've been talking a lot recently about the power of aggregation (and MongoDB schema) lying in being able to query things stored in arrays, I thought I'd write up this example here.The simple example will use a simple array of scalars representing "actions" like the example in the ticket.db.test.find({},{}){ "_id" : 1, "actio [...] ]]></description><content:encoded><![CDATA[<div class="paragraph">While reviewing <a href="https://jira.mongodb.org/browse/SERVER-737" target="_blank" title="">an old jira case for MongoDB</a> that asked for a way to query for a strict subset of an array, I realized this can very easily be done in aggregation. &nbsp; Since <a href="https://www.mongodb.com/world17/sessions/94721#search=&amp;page=1&amp;track=&amp;level=" target="_blank">I've been talking</a> a lot recently about the power of aggregation (and MongoDB schema) lying in being able to query things stored in arrays, I thought I'd write up this example here.</div><div class="paragraph">The simple example will use a simple array of scalars representing "actions" like the example in the ticket.</div><div><div id="779303844775502044" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>db.test.find({},{}){ "_id" : 1, "actions" : [ 2, 6, 3, 8, 5, 3 ] }{ "_id" : 2, "actions" : [ 6, 4, 2, 8, 4, 3 ] }{ "_id" : 3, "actions" : [ 6, 4, 6, 4, 3 ] }{ "_id" : 4, "actions" : [ 6, 8, 3 ] }{ "_id" : 5, "actions" : [ 6, 8 ] }{ "_id" : 6, "actions" : [ 6, 3, 11, 8, 3 ] }{ "_id" : 7, "actions" : [ 6, 3, 8 ] }</pre></div></div><div class="paragraph">We want to find only the documents which contain actions [6, 3, 8] and in exactly this order with no intervening 
actions.</div><div><div id="249930450940677502" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>let wantedActions = [6, 3, 8];db.test.aggregate([  {$match:{actions:{$all:wantedActions}}},])</pre></div></div><div class="paragraph">Note that first we match to reduce the documents we will be processing only to the ones that contain all of the actions we are interested in (but in any order). &nbsp;<br><br>Next we create an array of indexes which will let us step through the actions array creating a new array of all three element sub-arrays. &nbsp;At the end of the first two stages, our results are:<br></div><div><div id="758428500371267878" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>db.test.aggregate([    {$match:{actions:{$all:[6,3,8]}}},    {$project:{actions638:{$map:{       input:{$range:[0,{$subtract:[{$size:"$actions"},2]}]},       in:{$slice:["$actions","$$this",3]}    }}}}]){ "_id" : 1, "actions638" : [ [ 2, 6, 3 ], [ 6, 3, 8 ], [ 3, 8, 5 ], [ 8, 5, 3 ] ] }{ "_id" : 2, "actions638" : [ [ 6, 4, 2 ], [ 4, 2, 8 ], [ 2, 8, 4 ], [ 8, 4, 3 ] ] }{ "_id" : 4, "actions638" : [ [ 6, 8, 3 ] ] }{ "_id" : 6, "actions638" : [ [ 6, 3, 11 ], [ 3, 11, 8 ], [ 11, 8, 3 ] ] }{ "_id" : 7, "actions638" : [ [ 6, 3, 8 ] ] }</pre></div></div><div class="paragraph">Now it's easy to add another $match stage to get just the documents we want:</div><div><div id="279419469802371515" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>db.test.aggregate([  {$match:{actions:{$all:wantedActions}}},  {$project:{actions638:{$map:{        input:{$range:[0,{$subtract:[{$size:"$actions"},2]}]},        in:{$slice:["$actions","$$this",3]}  }}}},  {$match:{actions638:wantedActions}}]){ "_id" : 1, "actions638" : [ [ 2, 6, 3 ], [ 6, 3, 8 ], [ 3, 8, 5 ], [ 8, 5, 3 ] ] }{ "_id" : 7, "actions638" : [ [ 6, 3, 8 ] ] }</pre></div></div><div class="paragraph">If the action is an object inside an array, note that we can 
perform necessary transformations on it during the $map stage - rather than outputting subarray of original elements, we can extract only a single element from the subobjects.<br><br>What if we care about finding all actions "in order" but they don't have to be in strict sequence - that is, other actions are allowed in between, as long as the order of the actions we are looking for is correct?<br><br>&#8203;The simplest way to achieve that <span style="color:rgb(0, 51, 0)">(out of many)&nbsp;</span>&#8203;would be to add a $filter expression to remove all actions which are not in our wantedActions list and then proceed with exact same processing we've already seen:</div><div><div id="692308484785813439" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>db.test.aggregate([   {$match:{actions:{$all:wantedActions}}},   {$project:{actions638:{       $let: {          vars: {ouractions:{$filter:{input:"$actions",cond:{$in:["$$this", wantedActions]}}}},          in: {$map:{               input:{$range:[0,{$subtract:[{$size:"$$ouractions"},2]}]},               in:{$slice:["$$ouractions","$$this",3]}          }}      }   }}},   {$match:{actions638:wantedActions}}]){ "_id" : 1, "actions638" : [ [ 6, 3, 8 ], [ 3, 8, 3 ] ] }{ "_id" : 6, "actions638" : [ [ 6, 3, 8 ], [ 3, 8, 3 ] ] }{ "_id" : 7, "actions638" : [ [ 6, 3, 8 ] ] }</pre></div></div>]]></content:encoded></item><item><title><![CDATA[Converting ObjectId Values to year-month labels v2]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/converting-objectid-values-to-year-month-labels-v2]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/converting-objectid-values-to-year-month-labels-v2#comments]]></comments><pubDate>Mon, 05 Jun 2017 17:38:27 GMT</pubDate><category><![CDATA[Aggregation]]></category><category><![CDATA[Dates]]></category><guid 
isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/converting-objectid-values-to-year-month-labels-v2</guid><description><![CDATA[A long time ago I wrote a blog post showing how to convert ObjectId value field to corresponding "YYYY-MM" string for reporting type applications. &nbsp;Since I wrote that, aggregation pipeline gained the "$switch" expression which makes the syntax a lot shorter and easier to express (and read).For variety, this version converts *string* type that represents ObjectId value to corresponding year-month:d=[];o=[];for (yr=2011;  yr &lt; 2018; yr++ ) {       for (m=1; m&lt;13; m++) {           if (m& [...] ]]></description><content:encoded><![CDATA[<div class="paragraph">A long time ago I wrote a <a href="http://www.kamsky.org/stupid-tricks-with-mongodb/converting-objectid-to-dates-in-aggregation" target="_blank">blog post</a> showing how to convert ObjectId value field to corresponding "YYYY-MM" string for reporting type applications. &nbsp;Since I wrote that, aggregation pipeline gained the <a href="https://docs.mongodb.com/manual/reference/operator/aggregation/switch/" target="_blank">"$switch" expression</a> which makes the syntax a lot shorter and easier to express (and read).<br><br>For variety, this version converts *string* type that represents ObjectId value to corresponding year-month:</div><div><div id="119599827366707701" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>d=[];o=[];/* pad is assumed here to be 16 hex zeros: an ObjectId string is 24 hex chars, the first 8 being the timestamp */var pad="0000000000000000";for (yr=2011;  yr &lt; 2018; yr++ ) {       for (m=1; m&lt;13; m++) {           if (m&lt;10) mo="0"+m; else mo=""+m;           var dt=new ISODate(""+yr+"-"+mo+"-01T00:00:00Z");           d.push(""+yr+"-"+mo);           /* wrap string in 'new ObjectId()' to convert OID rather than string type */           o.push(""+(dt.getTime()/1000).toString(16)+pad);      }  }makeLabeledSwitch = function(field, keys, values) {      var sw = {$switch:{           "branches":[ ],           default:"other"}      };      var 
br=[];      var maxPos=keys.length;      var first="&lt;" + keys[0];      br.push({case:{$lt:[field,values[0]]}, then:first});      for (pos = 0; pos &lt; maxPos-1; pos++) {           br.push({case:{$lt:[field,values[pos+1]]}, then: keys[pos] });      }      var last="&gt;" + keys[maxPos-1];      sw["$switch"]["default"] = last;      sw["$switch"]["branches"] = br;      return sw;}</pre></div></div><div class="paragraph">This syntax is more straightforward, and what it does is quite similar: for every "YYYY-MM" string in the range you're interested in, it maps the range of ObjectId string values (or technically, their first 4 bytes) to the year-month label. &nbsp; If you want to make this work with actual ObjectId type rather than string type, change the loop to populate the "o" array with ObjectId() of the corresponding string.</div>]]></content:encoded></item><item><title><![CDATA[How to do intra-array comparisons]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/how-to-do-intra-array-comparisons]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/how-to-do-intra-array-comparisons#comments]]></comments><pubDate>Fri, 05 May 2017 20:14:14 GMT</pubDate><category><![CDATA[Aggregation]]></category><guid isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/how-to-do-intra-array-comparisons</guid><description><![CDATA[A colleague asked me how to find documents where the array of objects called "trans" has the following properties: one element contains a:0 and it's immediately followed by an element where a:1 and s&gt;3.In other words, flag the document that has trans array with element { ..., a:0, ...} immediately followed by element with { ..., a:1, s: N, ... } where N is greater than 3 or one that looks like this:&nbsp;{&nbsp; &nbsp; ...&nbsp; &nbsp; "trans" : [..., {...}, {..., a:0, ...}, {.., a:1, s:4, .. [...] 
]]></description><content:encoded><![CDATA[<div class="paragraph">A colleague asked me how to find documents where the array of objects called "trans" has the following properties: one element contains a:0 and it's immediately followed by an element where a:1 and s&gt;3.<br><br>In other words, flag the document that has trans array with element { ..., a:0, ...} immediately followed by element with { ..., a:1, s: N, ... } where N is greater than 3 or one that looks like this:&nbsp;<br><br><font>{</font><br><font>&nbsp; &nbsp; ...</font><br><font>&nbsp; &nbsp; "trans" : [..., {...}, {..., a:0, ...}, {.., a:1, s:4, ...}, {...}, ...],</font><br><font>&nbsp; &nbsp; ...</font><br><font>}</font><br><br>Here's the aggregation stage that adds a true or false field that indicates whether such a pattern was found in the "trans" array:</div><div><div id="790855186635647437" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>{$addFields:     {bad:          {$in:[true,              {$map:{                  input: {$range:[0,{$subtract:[{$size:"$trans"},1]} ]},                  as: "z",                  in: {$let: {                       vars: {                          e: {$arrayElemAt:["$trans","$$z"]},                          e1: {$arrayElemAt:["$trans",{$add:[1,"$$z"]}]}                      },                      in: {$cond: {                          if: {$and:[ {$eq:["$$e.a",0] },{$eq:["$$e1.a",1]}, {$gt:["$$e1.s",3]} ]},                         then: true,                         else: false                      } }                  }}              }}         ]}    }}</pre></div></div><div class="paragraph">To elaborate: using <a href="https://docs.mongodb.com/manual/reference/operator/aggregation/range/" target="_blank">"$range"</a> we generate "z", an array of integers we'll <a href="https://docs.mongodb.com/manual/reference/operator/aggregation/map/" target="_blank">$map</a> to traverse "trans" and then we create two variables with <a 
href="https://docs.mongodb.com/manual/reference/operator/aggregation/let/" target="_blank">"$let"</a> which represent the array element at position "z" and at position "z+1". &nbsp;We then check our conditions and if all of them are true, we output "true" otherwise "false". &nbsp;The resultant array of booleans is checked using <a href="https://docs.mongodb.com/manual/reference/operator/aggregation/in/" target="_blank">"$in" expression</a> to see if "true" appears anywhere in it.<br><br>This stage uses several 3.4 features, the <a href="https://docs.mongodb.com/manual/reference/operator/aggregation/addFields/" target="_blank">"$addFields" stage</a>, as well as the $range and $in expressions. &nbsp; We could have used <a href="https://docs.mongodb.com/manual/reference/operator/aggregation/anyElementTrue/" target="_blank">"$anyElementTrue"</a> expression instead of "$in" and "$project" instead of "$addFields" (though then we would need to know all the fields we wanted to pass through) but there is no equivalent to "$range" before 3.4, so without it, we would need to do far more complex manipulation involving <a href="https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/" target="_blank">"$unwind" with "includeArrayIndex"</a> option (which was introduced in 3.2) followed by "$group". 
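<br><br>To make the logic easier to check outside the server, here is a minimal sketch of the same adjacent-pair test in plain JavaScript. It is not part of the pipeline: "hasBadPair" is a hypothetical helper name, and it assumes every element of the array is an object.<br><br>

```javascript
// Plain JavaScript mirror of the "$range" + "$map" + "$let" + "$in" stage.
// hasBadPair is a hypothetical name; it assumes every element is an object.
function hasBadPair(trans) {
    // positions 0 .. length-2, like {$range:[0,{$subtract:[{$size:"$trans"},1]}]}
    var positions = Array.from(Array(Math.max(trans.length - 1, 0)).keys());
    // one boolean per adjacent pair, like the "$map" over "z"
    var flags = positions.map(function (z) {
        var e = trans[z];       // element at position z
        var e1 = trans[z + 1];  // element at position z+1
        return [ e.a === 0, e1.a === 1, e1.s > 3 ].every(Boolean);
    });
    // like {$in:[true, flags]}
    return flags.includes(true);
}
```

<br><br>For a "trans" array of [ {a:0}, {a:1, s:4} ] this returns true, just as the "bad" field would be true in the stage above.<br><br>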
&nbsp;If at all possible, just upgrade to 3.4 if you need to do intra-array comparisons.</div>]]></content:encoded></item><item><title><![CDATA[Using 3.4 Aggregation Enhancements for Parallel Array Processing]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/using-34-aggregation-enhancements-for-parallel-array-processing]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/using-34-aggregation-enhancements-for-parallel-array-processing#comments]]></comments><pubDate>Wed, 30 Nov 2016 20:20:26 GMT</pubDate><category><![CDATA[Aggregation]]></category><guid isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/using-34-aggregation-enhancements-for-parallel-array-processing</guid><description><![CDATA[Now that 3.4 is out, I thought I'd publish some example aggregations I've shown to various folks over the last few months as we were testing new features. &nbsp;One thing that I've seen people store in MongoDB documents are "parallel arrays" - when there are two arrays that are somehow correlated, the first element in each array are related, so are the second ones, etc.Here's a simple pipeline to add up each Nth element from each array:db.example.find(){ "_id" : ObjectId("583f35399bb2f9300fd1eff [...] ]]></description><content:encoded><![CDATA[<div class="paragraph">Now that 3.4 is out, I thought I'd publish some example aggregations I've shown to various folks over the last few months as we were testing new features. 
&nbsp;One thing that I've seen people store in MongoDB documents is "parallel arrays" - two arrays that are somehow correlated, so that the first elements of each array are related, so are the second ones, etc.<br /><br />Here's a simple pipeline to add up each Nth element from each array:<br /><br />db.example.find()<br />{ "_id" : ObjectId("583f35399bb2f9300fd1effe"), "a" : [ 1, 2, 3, 4, 5 ], "b" : [ 10, 20, 30, 40, 50 ] }<br />{ "_id" : ObjectId("583f355a9bb2f9300fd1efff"), "a" : [ 6, 7, 8 ], "b" : [ 600, 700, 800 ] }<br /><br />db.example.aggregate( [ { "$project" : {&nbsp;<br />&nbsp; &nbsp; &nbsp; "aPlusb" : { "$map" : {<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; "input" : { "$zip" :{ "inputs" :["$a","$b"]}},&nbsp;<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; "as" &nbsp; &nbsp; &nbsp;: "zipped",&nbsp;<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; "in" &nbsp; &nbsp; &nbsp; : { "$sum":"$$zipped"}<br />&nbsp; &nbsp; &nbsp; &nbsp;}}<br />}} ] )<br />{ "_id" : ObjectId("583f35399bb2f9300fd1effe"), "aPlusb" : [ 11, 22, 33, 44, 55 ] }<br />{ "_id" : ObjectId("583f355a9bb2f9300fd1efff"), "aPlusb" : [ 606, 707, 808 ] }<br /><br />This is possible thanks to the&nbsp;<a href="https://docs.mongodb.com/manual/reference/operator/aggregation/zip/" target="_blank">new operator "$zip"</a> which serves the same purpose as the <a href="https://docs.python.org/3/library/functions.html#zip" target="_blank">Python zip function</a> and lets you combine multiple arrays into one.<br /><br />Is "$zip" only useful when you already have parallel arrays in your document? &nbsp; It turns out there are other cases where you may want to keep it in mind. 
&nbsp;One situation may be when you have an array and you would like to "enumerate" each element's index or location in the array, but you don't want or need to "$unwind" the array first (in previous versions you could "$unwind" with "includeArrayIndex" option but then to recreate the original array with indexes you would have to do a "$group" which is likely to be very inefficient.)<br /><br />Here's a simple way to use <a href="https://docs.mongodb.com/manual/release-notes/3.4/#aggregation" target="_blank">new "$range" operator</a> combined with "$zip" to generate array indexes along with original array elements.<br /><br />db.example.find()<br />{ "_id" : ObjectId("583f37859bb2f9300fd1f000"), "a" : [ "first", "second", "third" ] }<br />{ "_id" : ObjectId("583f37949bb2f9300fd1f001"), "a" : [ "pizza", "sushi" ] }<br /><br />db.example.aggregate( [&nbsp;{&nbsp;"$project" : {<br />&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;"aWithIx" : {<br />&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;"$zip" : {<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;"inputs" : [ &nbsp;"$a", {&nbsp;"$range" : [ 0, {&nbsp;"$size" : "$a" }&nbsp;] } ]<br />&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;}<br />&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;}<br />&nbsp; }&nbsp;} ] &nbsp;)<br />{ "_id" : ObjectId("583f37859bb2f9300fd1f000"), "aWithIx" : [ [ "first", 0 ], [ "second", 1 ], [ "third", 2 ] ] }<br />{ "_id" : ObjectId("583f37949bb2f9300fd1f001"), "aWithIx" : [ [ "pizza", 0 ], [ "sushi", 1 ] ] }<br /><br />I'm sure you noticed that I made my range 0 based and I used size of each array "a" as the end value. &nbsp;The default "step" (optional third argument) is 1 so that works fine for this simple example.<br /><br />There are many other great new aggregation features in 3.4. 
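<br /><br />As a rough illustration of what the "$zip" and "$range" combination above computes, here is a sketch in plain JavaScript. The "zip" and "range" functions are hypothetical stand-ins written for this example, not MongoDB shell functions; "zip" assumes at least one input array and mimics the default "$zip" behavior of stopping at the shortest input.<br /><br />

```javascript
// Plain JavaScript sketch of "$zip" (default options) and "$range" (step 1).
// zip and range are hypothetical helper names, not MongoDB shell functions.
function zip(inputs) {
    // by default "$zip" stops at the length of the shortest input array
    var shortest = Math.min.apply(null, inputs.map(function (arr) { return arr.length; }));
    return Array.from(Array(shortest).keys()).map(function (i) {
        return inputs.map(function (arr) { return arr[i]; });
    });
}

function range(start, end) {
    // integers from start up to, but not including, end
    return Array.from(Array(Math.max(end - start, 0)).keys()).map(function (i) {
        return start + i;
    });
}

var a = [ "first", "second", "third" ];
var aWithIx = zip([ a, range(0, a.length) ]);
// aWithIx is [ [ "first", 0 ], [ "second", 1 ], [ "third", 2 ] ]
```

<br /><br />The result has the same shape as the "aWithIx" field produced by the pipeline.<br /><br />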
&nbsp;In the future, I want to show examples with some of the <a href="https://docs.mongodb.com/manual/release-notes/3.4/#aggregation" target="_blank">new stages</a>: "$replaceRoot" and "$addFields", which allow you to manipulate the shape of your documents without having to know all the existing fields in them, as well as "$facet", which allows you to run several "parallel" aggregations on the same input stream of documents.</div>]]></content:encoded></item><item><title><![CDATA[Using 3.4 Aggregation to Return Documents in Same Order as "$in" Expression]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/using-34-aggregation-to-return-documents-in-same-order-as-in-expression]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/using-34-aggregation-to-return-documents-in-same-order-as-in-expression#comments]]></comments><pubDate>Mon, 24 Oct 2016 19:57:24 GMT</pubDate><category><![CDATA[Aggregation]]></category><category><![CDATA[Schema]]></category><guid isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/using-34-aggregation-to-return-documents-in-same-order-as-in-expression</guid><description><![CDATA[Sometimes when performing a MongoDB query with a long "$in" list, you might want to get return documents in the same order as the elements of the "$in" array are in. &nbsp; This request is Jira ticket SERVER-7528. &nbsp;Upcoming version 3.4 adds many cool new features, and some of the newly available aggregation stages and expressions make it pretty easy to do this.Example of our collection:{ "_id" : ObjectId("580e51fc87a0572ee623854f"), "name" : "Asya" }{ "_id" : ObjectId("580e520087a0572ee6238 [...] ]]></description><content:encoded><![CDATA[<div class="paragraph">Sometimes when performing a MongoDB query with a long "$in" list, you might want the returned documents to be in the same order as the elements of the "$in" array. 
&nbsp; This request is Jira ticket <a href="https://jira.mongodb.org/browse/SERVER-7528" target="_blank">SERVER-7528</a>. &nbsp;Upcoming version 3.4 adds many cool new features, and some of the newly available aggregation stages and expressions make it pretty easy to do this.</div><div class="paragraph">Example of our collection:</div><div><div id="380930411745885519" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>{ "_id" : ObjectId("580e51fc87a0572ee623854f"), "name" : "Asya" }
{ "_id" : ObjectId("580e520087a0572ee6238550"), "name" : "Charlie" }
{ "_id" : ObjectId("580e520587a0572ee6238551"), "name" : "Tess" }
{ "_id" : ObjectId("580e520887a0572ee6238552"), "name" : "David" }
{ "_id" : ObjectId("580e520c87a0572ee6238553"), "name" : "Kyle" }
{ "_id" : ObjectId("580e521287a0572ee6238554"), "name" : "Aly" }</pre></div></div><div class="paragraph">The query we want to run is one that will return all documents where name is one of "David", "Charlie" or "Tess" and we want them in that exact order.<br><br>&gt; db.people.find({"name":{"$in": ["David", "Charlie", "Tess"]}}).sort({ ??? 
})<br><br>Let's define a variable called "order" so we don't have to keep typing the names in the array:<br><br>&gt; order = [ "David", "Charlie", "Tess" ]<br><br>Here's how we can do this with the aggregation framework:<br><br>&nbsp; &nbsp; m = { "$match" : { "name" : { "$in" : order } } };<br>&nbsp; &nbsp; a = { "$addFields" : { "__order" : { "$indexOfArray" : [ order, "$name" ] } } };<br>&nbsp; &nbsp; s = { "$sort" : { "__order" : 1 } };<br>&nbsp; &nbsp; db.people.aggregate( [ m, a, s ] );<br><br>Our result:<br>{ "_id" : ObjectId("580e520887a0572ee6238552"), "name" : "David", "__order" : 0 }<br>{ "_id" : ObjectId("580e520087a0572ee6238550"), "name" : "Charlie", "__order" : 1 }<br>{ "_id" : ObjectId("580e520587a0572ee6238551"), "name" : "Tess", "__order" : 2 }<br><br>The <a target="_blank" href="https://docs.mongodb.com/master/reference/operator/aggregation/addFields/#pipe._S_addFields">"$addFields"</a> stage is new in 3.4 and it allows you to "$project" new fields to existing documents without knowing all the other existing fields. &nbsp;The new &nbsp;<a target="_blank" href="https://docs.mongodb.com/master/reference/operator/aggregation/indexOfArray/#exp._S_indexOfArray">"$indexOfArray"</a> expression returns the position of a particular element in a given array.<br><br>The result of this aggregation will be documents that match your condition, in the order specified in the input array "order", and the documents will include all original fields, plus an additional field called "__order". 
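<br><br>The three stages can also be wrapped in a small shell helper so that the field name and the list become parameters. This is only a sketch under that assumption: "inOrderPipeline" is a hypothetical name, and it simply rebuilds the "m", "a" and "s" stages shown above.<br><br>

```javascript
// Hypothetical helper: builds the "$match", "$addFields" and "$sort" stages
// shown above for any top-level field name and any list of values.
function inOrderPipeline(field, order) {
    var filter = {};
    filter[field] = { "$in": order };   // same filter as the "$match" stage "m"
    return [
        { "$match": filter },
        // position of the field's value in the list, as in stage "a"
        { "$addFields": { "__order": { "$indexOfArray": [ order, "$" + field ] } } },
        // sort by that position, as in stage "s"
        { "$sort": { "__order": 1 } }
    ];
}
```

<br><br>In the shell this would be used as db.people.aggregate( inOrderPipeline("name", order) ).<br><br>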
&nbsp;If we want to remove this field, 3.4 allows "$project" stage with just exclusion specification, so we would just add { "$project": {"__order":0}} at the end of our pipeline.<br><br>Lots of great new things coming in 3.4 - I'll post some more tricks soon.</div>]]></content:encoded></item><item><title><![CDATA[Converting ObjectId to dates in aggregation]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/converting-objectid-to-dates-in-aggregation]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/converting-objectid-to-dates-in-aggregation#comments]]></comments><pubDate>Wed, 17 Feb 2016 22:00:08 GMT</pubDate><category><![CDATA[Aggregation]]></category><category><![CDATA[Dates]]></category><guid isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/converting-objectid-to-dates-in-aggregation</guid><description><![CDATA[Someone just asked me how they can do reporting grouping on "year-month" when they only have the ObjectId generated by MongoDB to represent creation date.While ObjectId is very useful - its first four bytes are the timestamp when it was generated - there's a simple way to convert it to a full date in Javascript (like mongo shell) but there is no way to convert it to a timestamp in aggregation pipeline (although there is a request for such a feature).Since we can't do it in aggregation natively,  [...] 
]]></description><content:encoded><![CDATA[<div class="paragraph" style="text-align:left;">Someone just asked me how they can do reporting grouped by "year-month" when they only have the <a target="_blank" href="https://docs.mongodb.org/manual/reference/object-id/">ObjectId</a> generated by MongoDB to represent the creation date.<br><br>ObjectId is very useful here - its first four bytes are the timestamp of when it was generated. There's a simple way to convert it to a full date in JavaScript (as in the mongo shell), but there is no way to convert it to a timestamp in the aggregation pipeline (although there is a <a target="_blank" href="https://jira.mongodb.org/browse/SERVER-9406">request for such a feature</a>).<br><br>Since we can't do it in aggregation natively, we can use a stupid trick to generate "YEAR-MONTH" from ObjectId during the <a target="_blank" href="https://docs.mongodb.org/manual/reference/operator/aggregation/project/">$project stage</a> so that we can <a target="_blank" href="https://docs.mongodb.org/manual/reference/operator/aggregation/group/#pipe._S_group">group by</a> it. 
&nbsp; Here is how I did it.<br><br>Working in the shell, first I generated arrays covering all the months I want to report on (so I only generated a few years' worth of months):</div><div><div id="156195694550666849" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>var d = [];   // "YYYY-MM" labels
var o = [];   // ObjectId marking the start of each month
var pad = "f000000000000000";
for (yr = 2014; yr &lt; 2017; yr++) {
    for (m = 1; m &lt; 13; m++) {
        // zero-pad the month number
        if (m &lt; 10) mo = "0" + m; else mo = "" + m;
        var dt = new ISODate("" + yr + "-" + mo + "-01T00:00:00Z");
        d.push("" + yr + "-" + mo);
        // seconds since epoch in hex form the first 4 bytes of the ObjectId
        o.push(new ObjectId((dt.getTime() / 1000).toString(16) + pad));
    }
}</pre></div></div><div class="paragraph" style="text-align:left;"><span>This generated two arrays: "YYYY-MM" strings and their corresponding ObjectId() values.</span><br><br><span>Now I can create a shell function which takes a field and two arrays, and creates an expression we can use in the $project stage to map ranges in the second array to labels in the first array:</span></div><div><div id="193736815494979959" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>makeLabeledBuckets = function(field, keys, values) {
    var con = [];
    var maxPos = keys.length;
    // anything at or past the last boundary gets a "&gt;" label
    con[maxPos] = "&gt;" + keys[maxPos-1];
    // build the nested $cond from the innermost "else" outwards
    for (pos = maxPos-1; pos &gt; 0; pos--) {
        con[pos] = {"$cond": {
            if: {$lt: [field, values[pos]]},
            then: keys[pos-1],
            else: con[pos+1]
        }};
    }
    // anything before the first boundary gets a "&lt;" label
    var first = "&lt; " + keys[0];
    con[0] = {"$cond": {if: {$lt: [field, values[0]]}, then: first, else: con[1]}};
    return con[0];
}</pre></div></div><div class="paragraph" style="text-align:left;"><span>Now we can run our aggregation in the shell like this:</span></div><div><div id="629605675792369097" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"><pre>&gt; db.collection.aggregate( [   { $project: { yearMonth: makeLabeledBuckets("$_id", d, o) } } ] )
{ "_id" : 
ObjectId("55af2194cd214aaa0a5e3545"), "yearMonth" : "2015-08" }
{ "_id" : ObjectId("55af21b5cd214aaa0a5e3548"), "yearMonth" : "2015-08" }
{ "_id" : ObjectId("55aff3f78909abe4721284bc"), "yearMonth" : "2015-08" }
{ "_id" : ObjectId("55aff4bd8909abe4721284c0"), "yearMonth" : "2015-08" }
{ "_id" : ObjectId("56900c440172f6f5768fb249"), "yearMonth" : "2016-02" }
{ "_id" : ObjectId("56900d780172f6f5768fb24c"), "yearMonth" : "2016-02" }
{ "_id" : ObjectId("56900dc80172f6f5768fb24e"), "yearMonth" : "2016-02" }
{ "_id" : ObjectId("569014240172f6f5768fb251"), "yearMonth" : "2016-02" }</pre></div></div><div class="paragraph" style="text-align:left;"><span>As you can see, each ObjectId in the "_id" field was converted to the corresponding "year-month" string, which we can now use to aggregate other metrics.</span></div>]]></content:encoded></item><item><title><![CDATA[Determining Type of Field in MongoDB Aggregation]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/determining-type-of-field-in-mongodb-aggregation]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/determining-type-of-field-in-mongodb-aggregation#comments]]></comments><pubDate>Wed, 15 Jul 2015 19:53:19 GMT</pubDate><category><![CDATA[Aggregation]]></category><category><![CDATA[Schema]]></category><guid isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/determining-type-of-field-in-mongodb-aggregation</guid><description><![CDATA[It can be useful to determine what data type a particular field is during aggregation. &nbsp; It's most useful to determine whether something is an array or not - mainly so that you don't try to get its $size, for example, but it can be useful for other types -&nbsp;you may need to apply some sort of transformation, or conversion before the next stage.&nbsp;Aggregation does not (yet) have a $typeOf expression, otherwise you could just $project a new field with {"$typeOf" : "$field"} as its value [...] 
]]></description><content:encoded><![CDATA[<div class="paragraph" style="text-align:left;">It can be useful to determine what data type a particular field is during aggregation. &nbsp; It's most useful to determine whether something is an array or not - mainly so that you don't try to get its $size, for example, but it can be useful for other types -&nbsp;you may need to apply some sort of transformation, or conversion before the next stage.&nbsp;<br><br>Aggregation does not (yet) have a $typeOf expression, otherwise you could just $project a new field with {"$typeOf" : "$field"} as its value. &nbsp; So we have to be more tricky and starting with MongoDB 3.0 we can be, due to&nbsp;<a href="https://jira.mongodb.org/browse/SERVER-3304" style="font-size: 1em; line-height: 1.5; background-color: initial;" title="">SERVER-3304</a>&nbsp;which ensures consistent total ordering across all different types.<br><br>The <a href="http://docs.mongodb.org/manual/reference/bson-types/#comparison-sort-order" target="_blank" title="">ordering is documented</a> and you can always double-check <a href="https://github.com/mongodb/mongo/blob/master/src/mongo/bson/bsontypes.h#L124-L168" target="_blank" title="">the source code</a>&nbsp;just to be sure.<br><br>Let's look at an example collection:<br></div><div><div id="496468579567895816" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"></div></div><div class="paragraph" style="text-align:left;">How can we create an aggregation to output the type of each "f1" value? &nbsp; One thing that will help in the future will be the $isArray operator coming in 3.2 (now available as part of development version 3.1.5), but we don't really need it here. 
&nbsp;Knowing the total ordering across all types, we can figure out each type by comparing the value to lowest possible value of that data type, ordering the comparisons in such a way as to always get at most one type.<br><br>Because we don't want to write out long "if then else" type conditions, let's generate them in the mongo shell with a function:</div><div><div id="233625878842985424" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"></div></div><div class="paragraph" style="text-align:left;">When we include a call to this function with a string representing value of a field, it generates a very long string of "$cond" "if:, then:, else:" tests, which gives us a type. &nbsp;Let's now include it in our aggregation call and check out the results:<br></div><div><div id="134182702636575534" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"></div></div><div class="paragraph" style="text-align:left;">The $project stage passed through _id (default) and "f1" plus we added a new field "typeF1" which was equal to the long and nested conditional generated by getTypes js function we just created. &nbsp;&nbsp;<br><br>Until <a href="https://jira.mongodb.org/browse/SERVER-13447" target="_blank">SERVER-13447</a> gives us an operator/expression to get the same value simply, this will work just fine.</div>]]></content:encoded></item><item><title><![CDATA[Troubles with Stepdown?  
No problem...]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/troubles-with-stepdown-no-problem]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/troubles-with-stepdown-no-problem#comments]]></comments><pubDate>Mon, 23 Jun 2014 15:35:43 GMT</pubDate><category><![CDATA[Operations]]></category><guid isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/troubles-with-stepdown-no-problem</guid><description><![CDATA[Here is a stupid MongoDB Trick for someone who accidentally tells their primary to step down and not be eligible for election for some number of seconds - and that number is higher than they intended.How do you cut that time short?Normally when you run:rs.stepDown(120)&nbsp;your primary will step down (relinquish its Primary role) and it won't be eligible to be re-elected for two minutes. &nbsp;What if you realize that you didn't really mean to do this, or whatever you meant took about 5 seconds [...] ]]></description><content:encoded><![CDATA[<div class="paragraph" style="text-align:left;">Here is a stupid MongoDB Trick for someone who accidentally tells their primary to step down and not be eligible for election for some number of seconds - and that number is higher than they intended.<br /><br />How do you cut that time short?<br /><br />Normally when you run:<br /><br /><font color="#2a2a2a">rs.stepDown(120)&nbsp;</font><br /><br />your primary will step down (relinquish its Primary role) and it won't be eligible to be re-elected for two minutes. &nbsp;What if you realize that you didn't really mean to do this, or whatever you meant took about 5 seconds instead of 120? 
&nbsp; &nbsp;<br /><br />You can't do another <a href="http://docs.mongodb.org/manual/reference/method/rs.stepDown/" target="_blank">rs.stepDown</a> with a different time value, because it will rightfully give you an error saying it cannot step down, not being a primary.<br /><br />But what you can do is use the <a href="http://docs.mongodb.org/manual/reference/method/rs.freeze/" target="_blank">rs.freeze()</a> command - this would normally be run on a secondary to prevent it from being eligible for the election for some number of seconds, but it has a special treatment for being passed 0 seconds:<br /><br /><font color="#2a2a2a">rs.freeze(0)</font><br /><font color="#2a2a2a">{ "info" : "unfreezing", "ok" : 1 }</font><br /><br />Well, isn't that convenient!</div>]]></content:encoded></item><item><title><![CDATA[New "Little MongoDB Book" Updated for 2.6.]]></title><link><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/new-little-mongodb-book-updated-for-26]]></link><comments><![CDATA[http://www.kamsky.org/stupid-tricks-with-mongodb/new-little-mongodb-book-updated-for-26#comments]]></comments><pubDate>Sat, 31 May 2014 18:51:25 GMT</pubDate><category><![CDATA[Schema]]></category><guid isPermaLink="false">http://www.kamsky.org/stupid-tricks-with-mongodb/new-little-mongodb-book-updated-for-26</guid><description><![CDATA[ More than three years ago when MongoDB was newer (1.8) and not as well known as it is today, &nbsp;Karl Seguin wrote a free ebook called "The Little MongoDB Book".  I read it about two years ago - it was only slightly out of date - and I really enjoyed the high level introduction it gives people. &nbsp; While it may be addressing developers, architects or managers, and doesn't have &nbsp;as much for DBAs, it was still a great place to get a quick intro.  MongoDB (the company) frequently hands o [...] 
]]></description><content:encoded><![CDATA[<div class="paragraph" style="text-align:left;"> More than three years ago when MongoDB was newer (1.8) and not as well known as it is today, &nbsp;<a href="https://twitter.com/karlseguin" target="_blank" title="">Karl Seguin</a> <a href="http://openmymind.net/2011/3/28/The-Little-MongoDB-Book/" target="_blank" title="">wrote a free ebook</a> called <a href="https://github.com/karlseguin/the-little-mongodb-book" target="_blank" title="">"The Little MongoDB Book"</a>.<br> <br> I read it about two years ago - it was only slightly out of date - and I really enjoyed the high level introduction it gives people. &nbsp; While it may be addressing developers, architects or managers, and doesn't have &nbsp;as much for DBAs, it was still a great place to get a quick intro.<br> <br> MongoDB (the company) frequently hands out printed copies of this book at Meetups and MongoDB days. &nbsp; Over the last two years I was saddened that while still being a good intro, the technical details were getting out of date.<br> <br> The great thing about all things open is that you can fork a github repo, make updates and then create a pull request, which is just a fancy way of saying you can make updates yourself and then ask the owner to include them.<br> <br> And that's what I did. &nbsp; And yesterday, Karl announced the newly updated book is available to all. 
&nbsp;&nbsp;<br> </div>  <div> <div id="864569459463668232" align="left" style="width: 100%; overflow-y: hidden;" class="wcustomhtml"> <blockquote class="twitter-tweet" lang="en"> <p>The Little MongoDB Book has finally been updated thanks to <a href="https://twitter.com/asya999">@asya999</a>  <a href="http://t.co/KJNxA3Iks7">http://t.co/KJNxA3Iks7</a>  <a href="http://t.co/vUf96rbbtp">http://t.co/vUf96rbbtp</a></p>&mdash; karlseguin (@karlseguin) <a href="https://twitter.com/karlseguin/statuses/472457044092387328">May 30, 2014</a> </blockquote> </div> </div>  <div class="paragraph" style="text-align:left;"> Enjoy!<br> </div> ]]></content:encoded></item></channel></rss>