Easy Tutorial
❮ Mongodb Regular Expression Mongodb Remove ❯

MongoDB Aggregation

Aggregation in MongoDB is primarily used for processing data (such as calculating averages, sums, etc.) and returning the computed results.

It is somewhat similar to the count(*) statement in SQL.


aggregate() Method

The aggregation method in MongoDB uses aggregate().

Syntax

The basic syntax format for the aggregate() method is as follows:

>db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)

Example

The data in the collection is as follows:

{
   _id: ObjectId(7df78ad8902c)
   title: 'MongoDB Overview', 
   description: 'MongoDB is no sql database',
   by_user: 'tutorialpro.org',
   url: 'http://www.tutorialpro.org',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 100
},
{
   _id: ObjectId(7df78ad8902d)
   title: 'NoSQL Overview', 
   description: 'No sql database is very fast',
   by_user: 'tutorialpro.org',
   url: 'http://www.tutorialpro.org',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 10
},
{
   _id: ObjectId(7df78ad8902e)
   title: 'Neo4j Overview', 
   description: 'Neo4j is no sql database',
   by_user: 'Neo4j',
   url: 'http://www.neo4j.com',
   tags: ['neo4j', 'database', 'NoSQL'],
   likes: 750
},

Now, we calculate the number of articles written by each author using aggregate(), with the result as follows:

> db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}])
{
   "result" : [
      {
         "_id" : "tutorialpro.org",
         "num_tutorial" : 2
      },
      {
         "_id" : "Neo4j",
         "num_tutorial" : 1
      }
   ],
   "ok" : 1
}
>

The above example is similar to the SQL statement:

select by_user, count(*) from mycol group by by_user

In the example above, we grouped the data by the by_user field and calculated the sum of the by_user field with the same value.

The following table shows some aggregation expressions:

Expression Description Example
$sum Calculates the total sum. db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : "$likes"}}}])
$avg Calculates the average value. db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$avg : "$likes"}}}])
$min Gets the minimum value corresponding to all documents in the collection. db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$min : "$likes"}}}])
$max Gets the maximum value corresponding to all documents in the collection. db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$max : "$likes"}}}])
$push Adds values to an array without checking for duplicate values. db.mycol.aggregate([{$group : {_id : "$by_user", url : {$push: "$url"}}}])
$addToSet Adds a value to an array unless the value is already present, in which case it does nothing. db.mycol.aggregate([{$group : {_id : "$by_user", url : {$addToSet : "$url"}}}])
$first Gets the first document according to the sort order. db.mycol.aggregate([{$group : {_id : "$by_user", first_url : {$first : "$url"}}}])
$last Gets the last document according to the sort order. db.mycol.aggregate([{$group : {_id : "$by_user", last_url : {$last : "$url"}}}])

Concept of Pipelines

Pipelines in Unix and Linux are generally used to pass the output of the current command as the input to the next command.

MongoDB's aggregation pipeline passes the MongoDB documents through a series of pipeline stages that process the documents. Pipeline operations can be repeated.

Expressions: Process input documents and produce output. Expressions are stateless and only process the current aggregation pipeline's documents; they cannot process other documents.

Here we introduce a few commonly used operations in the aggregation framework:

Examples of Pipeline Operators

  1. $project Example
    db.article.aggregate(
        { $project : {
            title : 1 ,
            author : 1 ,
        }}
    );
    

This will result in documents containing only the _id, title, and author fields. By default, the _id field is included. To exclude the _id field:

db.article.aggregate(
    { $project : {
        _id : 0 ,
        title : 1 ,
        author : 1
    }}
);
  1. $match Example
    db.articles.aggregate( [
                            { $match : { score : { $gt : 70, $lte : 90 } } },
                            { $group: { _id: null, count: { $sum: 1 } } }
                           ] );
    

$match is used to retrieve records with a score greater than 70 and less than or equal to 90, then sends the matching records to the next stage, $group, for processing.

  1. $skip Example
    db.article.aggregate(
        { $skip : 5 });
    

After the $skip pipeline operator processes the documents, the first five documents are "filtered" out.

❮ Mongodb Regular Expression Mongodb Remove ❯