Hardik Kanajariya

Posted on Oct 24, 2025

Day 6: Aggregation Functions and GROUP BY

#postgres #beginners #learning #tutorial

Day 6: Aggregation Functions and GROUP BY

Welcome Back! 🚀

Today we'll explore one of PostgreSQL's most powerful features: aggregation functions and grouping data. These tools are essential for data analysis and reporting.

What You'll Learn Today

Understanding Aggregation Functions
Using GROUP BY clause
HAVING clause for filtering groups
Common aggregation patterns
Multi-level grouping

1. Understanding Aggregation Functions

Aggregation functions perform calculations on multiple rows and return a single result.

Common Aggregation Functions

-- COUNT: Count rows
SELECT COUNT(*) FROM employees;
SELECT COUNT(email) FROM employees; -- Ignores NULL values
SELECT COUNT(DISTINCT department) FROM employees;

-- SUM: Calculate total
SELECT SUM(salary) FROM employees;
SELECT SUM(salary) AS total_payroll FROM employees;

-- AVG: Calculate average
SELECT AVG(salary) FROM employees;
SELECT ROUND(AVG(salary), 2) AS average_salary FROM employees;

-- MIN and MAX: Find minimum and maximum
SELECT MIN(salary) AS lowest_salary FROM employees;
SELECT MAX(salary) AS highest_salary FROM employees;

-- Multiple aggregations in one query
SELECT 
    COUNT(*) AS total_employees,
    AVG(salary) AS avg_salary,
    MIN(salary) AS min_salary,
    MAX(salary) AS max_salary,
    SUM(salary) AS total_payroll
FROM employees;

2. GROUP BY Clause

The GROUP BY clause divides rows into groups and applies aggregation functions to each group.

Basic GROUP BY

-- Count employees by department
SELECT 
    department,
    COUNT(*) AS employee_count
FROM employees
GROUP BY department;

-- Average salary by department
SELECT 
    department,
    ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC;

-- Total sales by product
SELECT 
    product_id,
    COUNT(*) AS order_count,
    SUM(quantity) AS total_quantity,
    SUM(total_amount) AS total_revenue
FROM orders
GROUP BY product_id;

Multiple Column Grouping

-- Group by multiple columns
SELECT 
    department,
    job_title,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY department, job_title
ORDER BY department, job_title;

-- Sales by year and month
SELECT 
    EXTRACT(YEAR FROM order_date) AS year,
    EXTRACT(MONTH FROM order_date) AS month,
    COUNT(*) AS order_count,
    SUM(total_amount) AS revenue
FROM orders
GROUP BY EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date)
ORDER BY year, month;

3. HAVING Clause

The HAVING clause filters groups after aggregation (while WHERE filters rows before aggregation).

-- Departments with more than 5 employees
SELECT 
    department,
    COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 5
ORDER BY employee_count DESC;

-- Departments with average salary above $60,000
SELECT 
    department,
    AVG(salary) AS avg_salary,
    COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING AVG(salary) > 60000
ORDER BY avg_salary DESC;

-- Combining WHERE and HAVING
SELECT 
    department,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary
FROM employees
WHERE hire_date >= '2020-01-01'  -- Filter rows first
GROUP BY department
HAVING COUNT(*) >= 3              -- Then filter groups
ORDER BY avg_salary DESC;

4. Common Aggregation Patterns

Pattern 1: Finding Top N Groups

-- Top 5 departments by total payroll
SELECT 
    department,
    SUM(salary) AS total_payroll,
    COUNT(*) AS employee_count
FROM employees
GROUP BY department
ORDER BY total_payroll DESC
LIMIT 5;

Pattern 2: Percentage Calculations

-- Department salary as percentage of total
SELECT 
    department,
    SUM(salary) AS dept_total,
    ROUND(
        (SUM(salary) * 100.0 / (SELECT SUM(salary) FROM employees)), 
        2
    ) AS percentage_of_total
FROM employees
GROUP BY department
ORDER BY dept_total DESC;

Pattern 3: Grouping with Conditional Aggregation

-- Count employees by salary range
SELECT 
    department,
    COUNT(*) AS total_employees,
    COUNT(CASE WHEN salary < 50000 THEN 1 END) AS low_salary_count,
    COUNT(CASE WHEN salary BETWEEN 50000 AND 80000 THEN 1 END) AS mid_salary_count,
    COUNT(CASE WHEN salary > 80000 THEN 1 END) AS high_salary_count
FROM employees
GROUP BY department;

5. Advanced Grouping Techniques

ROLLUP - Hierarchical Grouping

-- Subtotals by department and grand total
SELECT 
    department,
    job_title,
    COUNT(*) AS employee_count,
    SUM(salary) AS total_salary
FROM employees
GROUP BY ROLLUP(department, job_title)
ORDER BY department NULLS LAST, job_title NULLS LAST;

GROUPING SETS

-- Multiple grouping combinations in one query
SELECT 
    department,
    job_title,
    COUNT(*) AS employee_count
FROM employees
GROUP BY GROUPING SETS (
    (department),
    (job_title),
    (department, job_title),
    ()
);

Real-World Example: Sales Analysis

-- Create sample sales table
CREATE TABLE sales (
    sale_id SERIAL PRIMARY KEY,
    product_name VARCHAR(100),
    category VARCHAR(50),
    sale_date DATE,
    quantity INTEGER,
    unit_price DECIMAL(10,2)
);

-- Comprehensive sales analysis
SELECT 
    category,
    COUNT(DISTINCT sale_id) AS total_orders,
    SUM(quantity) AS total_units_sold,
    ROUND(AVG(quantity), 2) AS avg_units_per_order,
    SUM(quantity * unit_price) AS total_revenue,
    ROUND(AVG(unit_price), 2) AS avg_unit_price,
    MIN(sale_date) AS first_sale,
    MAX(sale_date) AS last_sale
FROM sales
GROUP BY category
HAVING SUM(quantity * unit_price) > 1000
ORDER BY total_revenue DESC;

Important Notes

Column Selection Rule: Any column in SELECT that's not in an aggregation function must be in GROUP BY
WHERE vs HAVING:
- WHERE filters rows before grouping
- HAVING filters groups after aggregation
NULL Handling: Aggregation functions ignore NULL values (except COUNT(*))
Performance: Add indexes on columns used in GROUP BY for better performance

Common Mistakes to Avoid

-- ❌ WRONG: Column not in GROUP BY or aggregation
SELECT department, job_title, COUNT(*)
FROM employees
GROUP BY department;
-- Error: job_title must be in GROUP BY

-- ✅ CORRECT
SELECT department, COUNT(*)
FROM employees
GROUP BY department;

-- ❌ WRONG: Using WHERE with aggregation
SELECT department, COUNT(*)
FROM employees
WHERE COUNT(*) > 5
GROUP BY department;
-- Error: Use HAVING instead

-- ✅ CORRECT
SELECT department, COUNT(*)
FROM employees
GROUP BY department
HAVING COUNT(*) > 5;

Today's Challenge! 🎯

Create a comprehensive employee analysis query that:

Groups employees by department
Shows count, average salary, min/max salary for each department
Only includes departments with more than 3 employees
Filters to show only departments where average salary > $55,000
Orders results by average salary descending

-- Your solution here:
-- Write a query that meets all the requirements above

Summary

Today you learned:

✅ Essential aggregation functions (COUNT, SUM, AVG, MIN, MAX)
✅ GROUP BY clause for data grouping
✅ HAVING clause for filtering aggregated results
✅ Advanced grouping with ROLLUP and GROUPING SETS
✅ Real-world data analysis patterns

Coming Up Next!

Day 7: Subqueries and Nested Queries - Learn how to write queries within queries for complex data retrieval! 🔍

Questions? Drop them in the comments! Let's learn together! 💬

PostgreSQL #Database #SQL #DataAnalysis #15DaysOfPostgreSQL

DEV Community

Day 6: Aggregation Functions and GROUP BY

Day 6: Aggregation Functions and GROUP BY

Welcome Back! 🚀

What You'll Learn Today

1. Understanding Aggregation Functions

Common Aggregation Functions

2. GROUP BY Clause

Basic GROUP BY

Multiple Column Grouping

3. HAVING Clause

4. Common Aggregation Patterns

Pattern 1: Finding Top N Groups

Pattern 2: Percentage Calculations

Pattern 3: Grouping with Conditional Aggregation

5. Advanced Grouping Techniques

ROLLUP - Hierarchical Grouping

GROUPING SETS

Real-World Example: Sales Analysis

Important Notes

Common Mistakes to Avoid

Today's Challenge! 🎯

Summary

Coming Up Next!

PostgreSQL #Database #SQL #DataAnalysis #15DaysOfPostgreSQL

Top comments (0)