Aggregation
Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result
Aggregation Pipeline
MongoDB's aggregation framework is modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into an aggregated result. For example:
buildfire.publicData.aggregate({
pipelineStages: [
{ $match: { status: "A" } },
{ $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
],
skip: 0,
limit: 20
}, 'articals', (err, result) => {
})
First Stage: The $match
stage filters the documents by the status field and passes to the next stage those documents that have status equal to "A".
Second Stage: The $group
stage groups the documents by the cust_id field to calculate the sum of the amount for each unique cust_id.
The most basic pipeline stages provide filters that operate like queries and document transformations that modify the form of the output document.
Other pipeline operations provide tools for grouping and sorting documents by specific field or fields as well as tools for aggregating the contents of arrays, including arrays of documents. In addition, pipeline stages can use operators for tasks such as calculating the average or concatenating a string.
Pipeline Stages
Pipeline stages appear in an array.
Documents pass through the stages in sequence.
All except the
$out
,$merge
, and$geoNear
stages can appear multiple times in a pipeline.Read more about pipeline stages and how to use them.
In the following some of basic pipeline stages:
$match
Filters the documents to pass only the documents that match the specified condition(s) to the next pipeline stage.
- Should be first stage of pipeline .
- Should have at least one of buildfire indexes.
- The
$match
stage has the following prototype form:{ $match: { <query> } }
- You cannot use
$where
in$match
queries as part of the aggregation pipeline. - You cannot use
$near
or$nearSphere
in$match
queries as part of the aggregation pipeline. As an alternative, you can either:- Use
$geoNear
stage instead of the $match stage. - Use
$geoWithin
query operator with$center
or$centerSphere
in the$match
stage.
- Use
- To use
$text
in the$match
stage, the$match
stage has to be the first stage of the pipeline.
$geoNear
Outputs documents in order of nearest to farthest from a specified point.
If it exist in the pipeline stages should be the first stage of it.
The $geoNear stage has the following prototype form:
{ $geoNear: { <geoNear options> } }
The $geoNear operator accepts a document that contains the following $geoNear options. Specify all distances in the same units as those of the processed documents' coordinate system:
near
,key
,query
,distanceField
,distanceMultiplier
,includeLocs
,maxDistance
,minDistance
,spherical
anduniqueDocs
.Important
key
option is required and it should have the right buildfire geospatial indexed as value_buildfire.geo
check Geospatia Data and check the example below.- key specify the geospatial indexed field to use when calculating the distance.
$group
Groups input documents by the specified _id expression and for each distinct grouping, outputs a document. The _id field of each output document contains the unique group by value. The output documents can also contain computed fields that hold the values of some accumulator expression.
The $group stage has the following prototype form:
{
$group:
{
_id: <expression>, // Group By Expression
<field1>: { <accumulator1> : <expression1> },
...
}
}_id: Required. If you specify an _id value of null, or any other constant value, the $group stage calculates accumulated values for all the input documents as a whole.
field: Optional. Computed using the accumulator operators.
Accumulator operators:
$accumulator
,$addToSet
,$avg
,$count
,$first
,$last
,$max
,$mergeObjects
,$min
,$stdDevPop
,$stdDevSamp
and$sum
.
$sort
Sorts all input documents and returns them to the pipeline in sorted order.
The
$sort
stage has the following prototype form:{ $sort: { <field1>: <sort order>, <field2>: <sort order> ... } }
$sort
takes a document that specifies the field(s) to sort by and the respective sort order.sort order
can have one of the following values:
$project
Passes along the documents with the requested fields to the next stage in the pipeline. The specified fields can be existing fields from the input documents or newly computed fields.
The $project stage has the following prototype form:
{ $project: { field1: 1 or 0, field2: 1 or 0, ...} }
The $project takes a document that can specify the inclusion of fields, the suppression of the _id field, the addition of new fields, and the resetting of the values of existing fields. Alternatively, you may specify the exclusion of fields.
The $project specifications have the following forms:
<field>: <1 or true>
: Specifies the inclusion of a field. Non-zero integers are also treated as true._id: <0 or false>
: Specifies the suppression of the _id field.To exclude a field conditionally, use the REMOVE variable instead. For details, see Exclude Fields Conditionally.<field>: <expression>
: Adds a new field or resets the value of an existing field.<field>:<0 or false>
: Specifies the exclusion of a field.
Example
A document in the places collection resembles the following:
{
"_id" : 1,
"data": {
"_buildfire": {
"geo": {
"type" : "Point",
"coordinates" : [ -73.9375, 30.843 ]
}
}
}
}
{
"_id" : 2,
"data": {
"_buildfire": {
"geo": {
"type" : "Point",
"coordinates" : [ -73.945, 40.8303 ]
}
}
}
}
{
"_id" : 3,
"data": {
"_buildfire": {
"geo": {
"type" : "Point",
"coordinates" : [ -73.9375, 40.8303 ]
}
}
}
}
buildfire.appData.aggregate(
{
pipelineStages: [
{
$geoNear: {
near: { type: "Point", coordinates: [ -73.98142 , 40.71782 ] },
key: "_buildfire.geo",
distanceField: "dist.calculated",
query: { }
}
}
],
skip: 0,
limit: 10,
},
"places",
(err, result) => {
if (err) return console.error("there was a problem aggregating your data");
console.log(" aggregation results: ", result);
}
);
The aggregation returns the following:
[
{
"_id" : 1,
"data": {
"_buildfire": {
"geo": {
"type" : "Point",
"coordinates" : [ -73.9375, 30.843 ]
}
}
}
},
{
"_id" : 3,
"data": {
"_buildfire": {
"geo": {
"type" : "Point",
"coordinates" : [ -73.9375, 40.8303 ]
}
}
}
},
{
"_id" : 2,
"data": {
"_buildfire": {
"geo": {
"type" : "Point",
"coordinates" : [ -73.945, 40.8303 ]
}
}
}
}
]
For Aggregation with MongoDB you can read more here