DEV Community

Cover image for Aggregation and Grouping
arjun
arjun

Posted on

Aggregation and Grouping

Day 5: Aggregation and Grouping

Date: January 23, 2025

Today, we focus on Aggregation and Grouping, two essential features of SQL that allow us to extract insightful summaries from large datasets. These concepts help us analyze data effectively, calculate metrics, and generate reports that are often required in real-world applications.


Concepts

1. Aggregate Functions

Aggregate functions perform calculations on a set of values and return a single value. Common aggregate functions include:

  • COUNT: Counts the number of rows.
  • SUM: Adds up the values in a column.
  • AVG: Calculates the average of values in a column.
  • MAX: Finds the maximum value.
  • MIN: Finds the minimum value.

Examples of Aggregate Functions:

SELECT COUNT(*) FROM users; -- Count the total number of users.
SELECT SUM(total) FROM orders; -- Calculate the total sales.
SELECT AVG(total) FROM orders; -- Find the average order value.
SELECT MAX(total) FROM orders; -- Get the highest order value.
SELECT MIN(total) FROM orders; -- Get the lowest order value.
Enter fullscreen mode Exit fullscreen mode

2. Grouping Data with GROUP BY

The GROUP BY clause organizes data into groups based on one or more columns, allowing us to apply aggregate functions to each group.

Syntax:

SELECT column, aggregate_function(column)
FROM table
GROUP BY column;
Enter fullscreen mode Exit fullscreen mode

Example:

Group orders by user and calculate the total sales per user:

SELECT user_id, SUM(total) AS total_sales
FROM orders
GROUP BY user_id;
Enter fullscreen mode Exit fullscreen mode

3. Filtering Groups with HAVING

The HAVING clause filters groups based on conditions. It is similar to the WHERE clause but is applied after grouping.

Syntax:

SELECT column, aggregate_function(column)
FROM table
GROUP BY column
HAVING condition;
Enter fullscreen mode Exit fullscreen mode

Example:

Find users with total sales greater than 500:

SELECT user_id, SUM(total) AS total_sales
FROM orders
GROUP BY user_id
HAVING SUM(total) > 500;
Enter fullscreen mode Exit fullscreen mode

Practice

Scenario: Users and Orders Tables

  • users table:

    | id | name | email |

    |----|-------|--------------------|

    | 1 | Alice | alice@example.com |

    | 2 | Bob | bob@example.com |

    | 3 | Carol | carol@example.com |

  • orders table:

    | order_id | user_id | total | product_category |

    |----------|---------|-------|------------------|

    | 101 | 1 | 250 | Electronics |

    | 102 | 2 | 400 | Furniture |

    | 103 | 1 | 300 | Electronics |

    | 104 | 3 | 150 | Clothing |


Practice Queries

1. Calculate Total Sales

SELECT SUM(total) AS total_sales 
FROM orders;
Enter fullscreen mode Exit fullscreen mode

Result:

| total_sales |

|-------------|

| 1100 |

2. Calculate Average Order Value

SELECT AVG(total) AS avg_order_value 
FROM orders;
Enter fullscreen mode Exit fullscreen mode

Result:

| avg_order_value |

|-----------------|

| 275 |

3. Group Orders by Product Category

SELECT product_category, SUM(total) AS category_sales 
FROM orders
GROUP BY product_category;
Enter fullscreen mode Exit fullscreen mode

Result:

| product_category | category_sales |

|------------------|----------------|

| Electronics | 550 |

| Furniture | 400 |

| Clothing | 150 |

4. Find the Highest-Selling Category

SELECT product_category, SUM(total) AS category_sales 
FROM orders
GROUP BY product_category
ORDER BY category_sales DESC
LIMIT 1;
Enter fullscreen mode Exit fullscreen mode

Result:

| product_category | category_sales |

|------------------|----------------|

| Electronics | 550 |

5. Show Total Sales Per User with HAVING Clause

SELECT user_id, SUM(total) AS total_sales 
FROM orders
GROUP BY user_id
HAVING SUM(total) > 300;
Enter fullscreen mode Exit fullscreen mode

Result:

| user_id | total_sales |

|---------|-------------|

| 1 | 550 |

| 2 | 400 |


Real-World Example

Scenario: Show Key Statistics for Your Project

  1. Total Number of Users:
   SELECT COUNT(*) AS total_users 
   FROM users;
Enter fullscreen mode Exit fullscreen mode

Result:

| total_users |

|-------------|

| 3 |

  1. Most Ordered Product Category:
   SELECT product_category, SUM(total) AS total_sales 
   FROM orders
   GROUP BY product_category
   ORDER BY total_sales DESC
   LIMIT 1;
Enter fullscreen mode Exit fullscreen mode

Result:

| product_category | total_sales |

|------------------|-------------|

| Electronics | 550 |

  1. Average Order Value by Category:
   SELECT product_category, AVG(total) AS avg_order_value 
   FROM orders
   GROUP BY product_category;
Enter fullscreen mode Exit fullscreen mode

Result:

| product_category | avg_order_value |

|------------------|-----------------|

| Electronics | 275 |

| Furniture | 400 |

| Clothing | 150 |


Interview Preparation

  1. What is the difference between WHERE and HAVING?
  2. Write a query to find the second highest-selling product category.
  3. How does GROUP BY handle NULL values?
  4. Explain the use of aggregate functions in reporting.

Outcome for the Day

By the end of Day 5, you should:

  • Master aggregate functions and their applications.
  • Group and summarize data using GROUP BY and HAVING.
  • Write queries to analyze and visualize key metrics for your project.

In Day 6, we’ll explore Database Relationships and Constraints, building a robust foundation for efficient and reliable database designs.

Would you like to explore visualization tools to represent these SQL results? Let me know!

Billboard image

The Next Generation Developer Platform

Coherence is the first Platform-as-a-Service you can control. Unlike "black-box" platforms that are opinionated about the infra you can deploy, Coherence is powered by CNC, the open-source IaC framework, which offers limitless customization.

Learn more

Top comments (0)

Billboard image

Try REST API Generation for Snowflake

DevOps for Private APIs. Automate the building, securing, and documenting of internal/private REST APIs with built-in enterprise security on bare-metal, VMs, or containers.

  • Auto-generated live APIs mapped from Snowflake database schema
  • Interactive Swagger API documentation
  • Scripting engine to customize your API
  • Built-in role-based access control

Learn more

πŸ‘‹ Kindness is contagious

Discover a treasure trove of wisdom within this insightful piece, highly respected in the nurturing DEV Community enviroment. Developers, whether novice or expert, are encouraged to participate and add to our shared knowledge basin.

A simple "thank you" can illuminate someone's day. Express your appreciation in the comments section!

On DEV, sharing ideas smoothens our journey and strengthens our community ties. Learn something useful? Offering a quick thanks to the author is deeply appreciated.

Okay