Convert SQL GROUP BY with HAVING to MongoDB $match After $group

Learn how SQL HAVING clauses are converted to a MongoDB $match stage placed after $group in the aggregate pipeline for post-aggregation filtering.

Advanced

Detailed Explanation

HAVING to $match After $group

SQL's HAVING clause filters groups after aggregation, unlike WHERE which filters rows before. In MongoDB's aggregate pipeline, this translates to a $match stage placed after the $group stage.

Example SQL

SELECT category, COUNT(*) AS total
FROM products
GROUP BY category
HAVING COUNT(*) >= 5

Generated MongoDB Query

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      total: { $sum: 1 },
      category: { $first: "$category" }
    }
  },
  { $match: { total: { $gte: 5 } } }
])

Pipeline Order Matters

The order of stages in MongoDB's aggregate pipeline is critical:

  1. $match (WHERE) — filters documents before grouping
  2. $group — performs the aggregation
  3. $match (HAVING) — filters groups after aggregation
  4. $sort — orders the results
  5. $limit — limits output

WHERE + HAVING Combined

SELECT category, AVG(price) AS avg_price
FROM products
WHERE status = 'active'
GROUP BY category
HAVING AVG(price) > 25
db.products.aggregate([
  { $match: { status: "active" } },
  {
    $group: {
      _id: "$category",
      avg_price: { $avg: "$price" },
      category: { $first: "$category" }
    }
  },
  { $match: { avg_price: { $gt: 25 } } }
])

Notice the two $match stages at different positions — the first filters documents (WHERE), and the second filters groups (HAVING).

Referencing Aggregated Fields

In the HAVING $match stage, you reference the field names defined in the $group stage (like total or avg_price), not the original document fields. This is analogous to SQL where HAVING references alias names or aggregate expressions.

Performance Optimization

Always include a WHERE equivalent $match stage before $group when possible. This reduces the number of documents that need to be grouped, significantly improving performance for large collections.

Use Case

Business intelligence queries that identify high-volume categories, top customers by order count, or departments exceeding budget thresholds all rely on HAVING for post-aggregation filtering. This pattern is essential for building summary reports and alerting systems.

Try It — SQL to MongoDB Query

Open full tool