SQL (Structured Query Language) remains the backbone of data retrieval and manipulation in the world of relational databases.
Whether you’re a data analyst, data engineer, or business intelligence professional, learning SQL unlocks the ability to explore data, uncover insights, and build analytics pipelines with ease.
In this article, part of our SQL Learning Series, you’ll go from SQL basics to advanced SQL techniques using real-world examples.
What is a Relational Database Management System (RDBMS)?
A relational database management system (RDBMS) is a program that allows you to create, update, and administer a relational database.
Relational databases store data in tables(rows and columns).
Relationships between tables are established using keys:
Primary Key (PK): Uniquely identifies each row in a table.
Foreign Key (FK): Refers to the primary key in another table to maintain relationships.
What the heck is SQL?
SQL stands for Structured Query Language.
It’s declarative, meaning you tell the database what you want, and it figures out how to get it.
Three core categories of SQL commands:
- DDL (Data Definition Language) → Create or modify database objects like tables(CREATE, ALTER).
- DML (Data Manipulation Language) → Insert, update, or delete data.
- DQL (Data Query Language) → Retrieve data with
SELECT
queries.
These form the foundation for both data exploration and advanced analytics tasks.
Writing Your First SQL Queries
Let’s look at some beginner-friendly SQL commands:
CREATE TABLE friends (
id INTEGER,
name TEXT,
birthday DATE,
)
INSERT INTO friends (id, name, birthday)
VALUES (1, 'Ororo Munroe', '1940-05-30');
SELECT *
FROM friends;
UPDATE friends
SET name = 'Storm'
WHERE id = 1;
ALTER TABLE friends
ADD COLUMN email TEXT;
UPDATE friends
SET email = storm@codecademy.com
WHERE id = 1;
DELETE FROM friends
WHERE id = 1;
SELECT *
FROM friends;
Ah, don't worry, I'm going to explain it all like you're five:
-
CREATE TABLE
is a clause that tells SQL you want to create a new table. -
friends
is the name of the table. - (id INTEGER, name TEXT, birthday DATE) is a list of parameters defining each column, or attribute in the table and its data type.
-
The
INSERT
statement inserts a new row into a table:-
(id, name, age)
is a parameter identifying the columns that data will be inserted into. -
VALUES
is a clause that indicates the data being inserted.
-
-
SELECT
statements are used to fetch data from a database.-
*
is a special wildcard character that allows you to select every column in a table without having to name each one individually.
-
-
The
ALTER TABLE
statement is used to add or modify table columns.-
ADD COLUMN
is a clause that lets you add a new column to a table.
-
-
The
UPDATE
statement edits a row in a table.- You can use the
UPDATE
statement when you want to change existing records. -
WHERE
is a clause that indicates which row(s) to update with the new column value.
- You can use the
-
The
DELETE FROM
statement deletes one or more rows from a table.-
WHERE
is a clause that lets you select which rows you want to delete.
-
Pro tip: SQL commands are not case-sensitive. select = SELECT
.
Adding Constraints for Data Integrity
Constraints define rules for your data.
Example: The statement below sets constraints on the awards
table.
CREATE TABLE awards (
id INTEGER PRIMARY KEY,
recipient TEXT NOT NULL,
award_name TEXT DEFAULT 'Grammy'
);
- PRIMARY KEY → Uniquely identifies each record.
- NOT NULL → Prevents empty values.
- DEFAULT → Sets a fallback value if none is provided.
This ensures data stays accurate and consistent.
Database Indexes: Speeding Things Up
Indexes are a powerful tool used in the background of a database to speed up querying. Simply put, an index is a pointer to data in a table.
How are indexes created?
In a database, data is stored in rows
which are organized into tables
. Each row
has a unique key which distinguishes it from all other rows and those keys are stored in an index for quick retrieval.
When a new row with a unique key is added, the index updates automatically.
However, sometimes we need to quickly look up data that isn’t stored as a key—for example, finding customers by telephone number
. Using a unique constraint isn’t suitable in such cases because multiple rows can share the same value. Instead, custom indexes can be created.
Creating an index:
CREATE INDEX <index_name>
ON <table_name> (column1, column2, ...)
This allows faster access to frequently queried columns even if they are not unique keys.
QUERIES
One of the core purposes of the SQL language is to retrieve information stored in a database. This is commonly referred to as querying.
We'll be using SQL commands to query a table of restaurants called 'quench' and generate insights out of it.
Start by getting a feel for the quench
table:
SELECT *
FROM quench;
Find Distinct Neighborhoods
SELECT DISTINCT neighborhood
FROM quench;
DISTINCT
removes duplicates, so each neighborhood appears only once.
Let’s say you’re craving Chinese food. To see all restaurants serving Chinese cuisine:
SELECT *
FROM quench
WHERE cuisine = 'Chinese';
The WHERE
clause filters rows based on a condition.
Filter by Review Score
Want restaurants rated 4 or higher?
SELECT *
FROM quench
WHERE review >= 4;
Conditions can use comparison operators like =, >=, <, or <> (not equal).
If you want to find Italian restaurants with exactly three dollar signs:
SELECT *
FROM quench
WHERE cuisine = 'Italian'
AND price = '$$$';
Both conditions must be true for rows to be included.
Search by Partial Name
Let's say you can’t remember the exact name of a restaurant you went to earlier but knows it contains the word ‘meatball’ in it.
How will you find it?
SELECT *
FROM quench
WHERE name LIKE '%meatball%';
LIKE
is a special operator used with the WHERE
clause to search for a specific pattern in a column.
%
is a wildcard character that matches zero or more missing characters in the pattern.
Filter by Multiple Locations
Find all the close by spots in Midtown
, Downtown
or Chinatown
.
SELECT name, cuisine, neighborhood
FROM quench
WHERE neighborhood = 'Midtown'
OR neighborhood = 'Downtown'
OR neighborhood = 'Chinatown';
Note: Operators like OR/AND
can be used more than once.
We selected only name
, cuisine
, and neighborhood
for a concise result set.
Find Missing Health Grades.
SELECT *
FROM quench
WHERE health IS NULL;
Note: IS NULL
checks for empty or missing values.
Top 10 Restaurants by Reviews
SELECT *
FROM quench
ORDER BY review DESC
LIMIT 10;
- ORDER BY sorts results.
- DESC = descending order (highest first).
- LIMIT restricts the number of rows returned.
Woah! That was a lot to digest—literally. Now I’m craving some actual food!
Aggregate Functions
When working with databases, you often need more than just raw data ;you need insights. That’s where SQL aggregate functions come in.
They allow you to perform calculations across multiple rows and return a single, meaningful result.
Some of the most commonly used aggregate functions include:
- COUNT() : count the number of rows
- SUM(): the sum of the values in a column
- MAX()/ MIN(): the largest/smallest value
- AVG(): the average of the values in a column
- ROUND(): round the values in the column
In addition, you can use GROUP BY
, ORDER BY
, and HAVING
clauses to organize and filter aggregated data effectively.
Example:
Imagine you own a chain of lemonade stands across different neighborhoods.
Every day, each stand sells some cups of lemonade. At the end of the month, you want answers to questions like:
- How much lemonade did we sell each day?
- Which days were really good (sold more than 100 cups)?
- Which days made the most sales so we can study what went right?
In SQL terms:
- Each stand’s daily sales = rows in our table
- Total cups sold per day = aggregate function (SUM)
- Filtering good days only = HAVING
- Sorting best days on top = ORDER BY
Step 1: Group the Sales Per Day (GROUP BY)
First, let’s sum up cups sold per day:
SELECT date, SUM(cups_sold) AS total_cups
FROM sales
GROUP BY date;
Here’s what happens:
GROUP BY
date → Puts all sales on the same day together
SUM(cups_sold)
→ Adds them up for each day
Step 2: Keep Only Good Days (HAVING)
Now we only want days with at least 100 cups sold:
SELECT
date,
SUM(cups_sold) AS total_cups
FROM sales
GROUP BY date
HAVING SUM(cups_sold) >= 100;
HAVING
is like a filter for groups.
WHERE
filters rows before grouping, but HAVING
filters after aggregates are calculated.
Step 3: Sort the Best Days (ORDER BY)
Finally, let’s sort days by total cups sold so the best days come first:
SELECT
date,
SUM(cups_sold) AS total_cups
FROM sales
GROUP BY date
HAVING SUM(cups_sold) >= 100
ORDER BY total_cups DESC;
Wrapping Up
In this post, we started with simple queries to explore data, then moved on to aggregate functions and powerful clauses like GROUP BY
, HAVING
, and ORDER BY
to summarize, filter, and sort results effectively.
- Mastering these tools gives you the ability to:
- Spot trends quickly
- Summarize large datasets in seconds
- Filter and organize information with precision
The next time you open a database, you’ll know exactly how to turn tables of numbers into actionable knowledge.
Top comments (1)
Great walkthrough! How would you incorporate JOINs, subqueries, and window functions into the examples here? Next, consider a post on query optimization—reading EXPLAIN plans, choosing indexes, and handling NULLs and date/time quirks.