MongoDB Map Reduce

Map-Reduce is a computational model that, in simple terms, breaks down a large amount of work (data) into smaller tasks (MAP) and then combines the results into a final outcome (REDUCE).

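MongoDB implements this model in its mapReduce command. Because the map and reduce steps are written as ordinary JavaScript functions, it is flexible and practical for large-scale data analysis.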

MapReduce Command

The following is the basic syntax for MapReduce:

>db.collection.mapReduce(
   function() {emit(key,value);},  //map function
   function(key,values) {return reduceFunction},   //reduce function
   {
      out: collection,
      query: document,
      sort: document,
      limit: number
   }
)

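To use MapReduce, you need to implement two functions: the Map function and the Reduce function. MongoDB runs the Map function once for each document that matches the query, and every call to emit(key, value) inside it produces one key-value pair. The emitted values are then grouped by key, and the Reduce function receives each key together with the array of values emitted for it.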

The Map function must call emit(key, value) to return key-value pairs.
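The division of labor between the two functions can be imitated in plain JavaScript, without a MongoDB server. The following is only a minimal sketch: runMapReduce, the stand-in emit, and the sample documents are hypothetical names invented for this illustration, not part of the MongoDB API, and a real server may split one key's values across several reduce calls.

```javascript
// In-memory imitation of the map -> group -> reduce flow.
// Hypothetical helper for illustration; not MongoDB's implementation.
let emitted = [];

// Stand-in for the emit() function that MongoDB makes available inside map.
function emit(key, value) {
  emitted.push({ key, value });
}

function runMapReduce(docs, map, reduce) {
  emitted = [];

  // Map phase: call map once per document, with `this` bound to the document.
  for (const doc of docs) {
    map.call(doc);
  }

  // Group the emitted values by key.
  const groups = new Map();
  for (const { key, value } of emitted) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }

  // Reduce phase: collapse each key's array of values into a single value.
  // A key with only one value is passed through without calling reduce.
  const results = [];
  for (const [key, values] of groups) {
    const value = values.length > 1 ? reduce(key, values) : values[0];
    results.push({ _id: key, value });
  }
  return results;
}

// Hypothetical sample documents.
const items = [
  { category: "fruit", qty: 3 },
  { category: "fruit", qty: 2 },
  { category: "tool", qty: 7 },
];

const totals = runMapReduce(
  items,
  function () { emit(this.category, this.qty); },
  function (key, values) { return values.reduce((a, b) => a + b, 0); }
);

console.log(totals);
```

Here the map function emits one (category, qty) pair per document, and the reduce function adds up the quantities collected for each category.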

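Parameter Explanation:

map: the mapping function; it emits the key-value pairs that are passed on to the reduce function.

reduce: the aggregating function; it reduces the array of values emitted for one key to a single value.

out: the name of the collection that stores the results.

query: a filter condition; only documents that match it are passed to the map function.

sort: the order in which the input documents are passed to the map function; sorting on the emit key can reduce the number of reduce operations.

limit: the maximum number of documents passed to the map function.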

For example, a map-reduce job over an orders collection can select the documents with status: "A", group them by cust_id, and calculate the total amount for each customer.
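The three stages of that orders job can be traced by hand in plain JavaScript. This is a sketch under assumptions: the field names follow the description above, while the sample values and the variable names (orders, matched, pairs, orderTotals) are invented for illustration.

```javascript
// Hypothetical order documents; only the field names come from the text.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// query stage: keep only the documents with status "A".
const matched = orders.filter((order) => order.status === "A");

// map stage: each matching document contributes one (cust_id, amount) pair.
const pairs = matched.map((order) => [order.cust_id, order.amount]);

// reduce stage: sum the amounts collected for each cust_id.
const orderTotals = {};
for (const [custId, amount] of pairs) {
  orderTotals[custId] = (orderTotals[custId] || 0) + amount;
}

console.log(orderTotals); // { A123: 750, B212: 200 }
```

In the mongo shell, the same job would presumably be written with a map function that calls emit(this.cust_id, this.amount), a reduce function that returns Array.sum(values), and the option query: { status: "A" }.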


Using MapReduce

Consider the following document structure storing user posts, with documents containing the user's user_name and the post's status field:

>db.posts.insert({
   "post_text": "tutorialpro.org, the most comprehensive technical documentation.",
   "user_name": "mark",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "tutorialpro.org, the most comprehensive technical documentation.",
   "user_name": "mark",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "tutorialpro.org, the most comprehensive technical documentation.",
   "user_name": "mark",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "tutorialpro.org, the most comprehensive technical documentation.",
   "user_name": "mark",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "tutorialpro.org, the most comprehensive technical documentation.",
   "user_name": "mark",
   "status":"disabled"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "tutorialpro.org, the most comprehensive technical documentation.",
   "user_name": "tutorialpro",
   "status":"disabled"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "tutorialpro.org, the most comprehensive technical documentation.",
   "user_name": "tutorialpro",
   "status":"disabled"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "tutorialpro.org, the most comprehensive technical documentation.",
   "user_name": "tutorialpro",
   "status": "active"
})
WriteResult({ "nInserted" : 1 })

Now we will call mapReduce on the posts collection to select the published posts (status: "active"), group them by user_name, and count the number of posts per user:

>db.posts.mapReduce( 
   function() { emit(this.user_name, 1); }, 
   function(key, values) { return Array.sum(values); }, 
      {  
         query: { status: "active" },  
         out: "post_total" 
      }
)

The above mapReduce output is:

{
        "result": "post_total",
        "timeMillis": 23,
        "counts": {
                "input": 5,
                "emit": 5,
                "reduce": 1,
                "output": 2
        },
        "ok": 1
}

The results show that 5 documents matched the query condition (status: "active"), and the map function emitted 5 key-value pairs, one per document. The reduce function was called once, for the only key with more than one value ("mark"), and the output collection contains 2 documents, one for each distinct user_name.
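These counters can be reproduced by replaying the job in plain JavaScript. The sketch below is an in-memory imitation, not MongoDB's implementation: emit, counts, groups, and postTotal are stand-ins defined here, and it assumes that all values for a key are reduced in a single call and that a key with a single value skips reduce.

```javascript
// The eight sample posts, cut down to the two fields the job uses.
const posts = [
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "disabled" },
  { user_name: "tutorialpro", status: "disabled" },
  { user_name: "tutorialpro", status: "disabled" },
  { user_name: "tutorialpro", status: "active" },
];

const counts = { input: 0, emit: 0, reduce: 0, output: 0 };
const groups = new Map(); // key -> array of emitted values

// Stand-in for the emit() that MongoDB provides inside the map function.
function emit(key, value) {
  counts.emit += 1;
  if (!groups.has(key)) groups.set(key, []);
  groups.get(key).push(value);
}

const map = function () { emit(this.user_name, 1); };
const reduce = function (key, values) {
  // Array.sum is a mongo shell helper, so plain JavaScript sums by hand.
  return values.reduce((sum, v) => sum + v, 0);
};

// query: { status: "active" }
for (const post of posts.filter((p) => p.status === "active")) {
  counts.input += 1;
  map.call(post);
}

// Reduce each key; a key with a single value is written out unchanged.
const postTotal = [];
for (const [key, values] of groups) {
  let value = values[0];
  if (values.length > 1) {
    counts.reduce += 1;
    value = reduce(key, values);
  }
  postTotal.push({ _id: key, value });
}
counts.output = postTotal.length;

console.log(counts);    // { input: 5, emit: 5, reduce: 1, output: 2 }
console.log(postTotal); // [ { _id: 'mark', value: 4 }, { _id: 'tutorialpro', value: 1 } ]
```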

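Specific parameter explanations:

result: the name of the collection that stores the results.

timeMillis: the execution time, in milliseconds.

input: the number of documents that matched the query and were passed to the map function.

emit: the number of times emit was called in the map function.

reduce: the number of times the reduce function was called.

output: the number of documents in the result collection.

ok: whether the command succeeded; 1 means success.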

To view the mapReduce results, call find on the output collection:

> var map = function() { emit(this.user_name, 1); }
> var reduce = function(key, values) { return Array.sum(values); }
> var options = { query: { status: "active" }, out: "post_total" }
> db.posts.mapReduce(map, reduce, options)
{ "result": "post_total", "ok": 1 }
> db.post_total.find();

The above query displays the following results:

{ "_id": "mark", "value": 4 }
{ "_id": "tutorialpro", "value": 1 }

In a similar manner, MapReduce can be used to build large and complex aggregation queries.

The Map and Reduce functions can be implemented using JavaScript, making MapReduce very flexible and powerful.
