Day 5: Aggregation and Grouping
Date: January 23, 2025
Today, we focus on Aggregation and Grouping, two essential features of SQL that allow us to extract insightful summaries from large datasets. These concepts help us analyze data effectively, calculate metrics, and generate reports that are often required in real-world applications.
Concepts
1. Aggregate Functions
Aggregate functions perform calculations on a set of values and return a single value. Common aggregate functions include:
- COUNT: Counts the number of rows.
- SUM: Adds up the values in a column.
- AVG: Calculates the average of values in a column.
- MAX: Finds the maximum value.
- MIN: Finds the minimum value.
Examples of Aggregate Functions:
SELECT COUNT(*) FROM users; -- Count the total number of users.
SELECT SUM(total) FROM orders; -- Calculate the total sales.
SELECT AVG(total) FROM orders; -- Find the average order value.
SELECT MAX(total) FROM orders; -- Get the highest order value.
SELECT MIN(total) FROM orders; -- Get the lowest order value.
2. Grouping Data with GROUP BY
The GROUP BY clause organizes data into groups based on one or more columns, allowing us to apply aggregate functions to each group.
Syntax:
SELECT column, aggregate_function(column)
FROM table
GROUP BY column;
Example:
Group orders by user and calculate the total sales per user:
SELECT user_id, SUM(total) AS total_sales
FROM orders
GROUP BY user_id;
3. Filtering Groups with HAVING
The HAVING clause filters groups based on conditions. It is similar to the WHERE clause but is applied after grouping.
Syntax:
SELECT column, aggregate_function(column)
FROM table
GROUP BY column
HAVING condition;
Example:
Find users with total sales greater than 500:
SELECT user_id, SUM(total) AS total_sales
FROM orders
GROUP BY user_id
HAVING SUM(total) > 500;
Practice
Scenario: Users and Orders Tables
users table:
| id | name | email |
|----|-------|--------------------|
| 1 | Alice | alice@example.com |
| 2 | Bob | bob@example.com |
| 3 | Carol | carol@example.com |orders table:
| order_id | user_id | total | product_category |
|----------|---------|-------|------------------|
| 101 | 1 | 250 | Electronics |
| 102 | 2 | 400 | Furniture |
| 103 | 1 | 300 | Electronics |
| 104 | 3 | 150 | Clothing |
Practice Queries
1. Calculate Total Sales
SELECT SUM(total) AS total_sales
FROM orders;
Result:
| total_sales |
|-------------|
| 1100 |
2. Calculate Average Order Value
SELECT AVG(total) AS avg_order_value
FROM orders;
Result:
| avg_order_value |
|-----------------|
| 275 |
3. Group Orders by Product Category
SELECT product_category, SUM(total) AS category_sales
FROM orders
GROUP BY product_category;
Result:
| product_category | category_sales |
|------------------|----------------|
| Electronics | 550 |
| Furniture | 400 |
| Clothing | 150 |
4. Find the Highest-Selling Category
SELECT product_category, SUM(total) AS category_sales
FROM orders
GROUP BY product_category
ORDER BY category_sales DESC
LIMIT 1;
Result:
| product_category | category_sales |
|------------------|----------------|
| Electronics | 550 |
5. Show Total Sales Per User with HAVING Clause
SELECT user_id, SUM(total) AS total_sales
FROM orders
GROUP BY user_id
HAVING SUM(total) > 300;
Result:
| user_id | total_sales |
|---------|-------------|
| 1 | 550 |
| 2 | 400 |
Real-World Example
Scenario: Show Key Statistics for Your Project
- Total Number of Users:
SELECT COUNT(*) AS total_users
FROM users;
Result:
| total_users |
|-------------|
| 3 |
- Most Ordered Product Category:
SELECT product_category, SUM(total) AS total_sales
FROM orders
GROUP BY product_category
ORDER BY total_sales DESC
LIMIT 1;
Result:
| product_category | total_sales |
|------------------|-------------|
| Electronics | 550 |
- Average Order Value by Category:
SELECT product_category, AVG(total) AS avg_order_value
FROM orders
GROUP BY product_category;
Result:
| product_category | avg_order_value |
|------------------|-----------------|
| Electronics | 275 |
| Furniture | 400 |
| Clothing | 150 |
Interview Preparation
- What is the difference between WHERE and HAVING?
- Write a query to find the second highest-selling product category.
- How does GROUP BY handle NULL values?
- Explain the use of aggregate functions in reporting.
Outcome for the Day
By the end of Day 5, you should:
- Master aggregate functions and their applications.
- Group and summarize data using GROUP BY and HAVING.
- Write queries to analyze and visualize key metrics for your project.
In Day 6, weβll explore Database Relationships and Constraints, building a robust foundation for efficient and reliable database designs.
Would you like to explore visualization tools to represent these SQL results? Let me know!
Top comments (0)