DEV Community

Cover image for Introduction to Databases with SQL
Brian Muriithi
Brian Muriithi

Posted on

Introduction to Databases with SQL

A hands-on walkthrough of DDL, DML, DQL using a real school project

I recently completed my first SQL project assignment: building a working school database from scratch for Nairobi Academy (PS data used in this project is fictitious). That meant creating tables, populating them with data, running queries, and writing logic to label and categorize results.

This article walks through what I learned, the SQL concepts behind it, and code from the actual project. Whether you're just starting out or refreshing your fundamentals, I hope this makes things clear.

The Big Picture: DDL vs DML vs DQL

Before writing a single line of SQL, it helps to understand the different categories of commands you'll use constantly.

SQL categories

DDL (Data Definition Language) is for building and modifying the structure of a database. Think of it as the architecture work - laying the foundation before you move any furniture in. The main commands are CREATE, ALTER, DROP, and TRUNCATE.

DML (Data Manipulation Language) is for working with the data inside those structures. Once the rooms are built, DML is how you fill them, rearrange them, or clear them out. The main commands are INSERT, UPDATE, and DELETE.

Here's a simple way to remember the difference:

Category Question it answers Commands
DDL What does the database look like? CREATE, ALTER, DROP, TRUNCATE
DML What data is in the database? INSERT, UPDATE, DELETE

There's also DQL (Data Query Language) - specifically SELECT - which is how you read data. We also do have TCL (Transaction Control Language) which includes commands like COMMIT, SAVEPOINT, and ROLLBACK. Finally we do have DCL (Data Control Language) with commands like GRANT and REVOKE. We will not be covering these last two in this walkthrough.

Part 1 - Building the Database (DDL in Action)

Setting Up a Schema

A schema is a logical container that groups related tables together. For this project, everything lives inside nairobi_academy.

CREATE SCHEMA nairobi_academy;
SET search_path TO nairobi_academy;
Enter fullscreen mode Exit fullscreen mode

SET search_path tells PostgreSQL to look inside nairobi_academy by default, so you don't have to prefix every table name.

Creating the Tables

The school has three core tables: students, subjects, and exam_results.

CREATE TABLE students (
    student_id INT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name  VARCHAR(50) NOT NULL,
    gender     VARCHAR(1),
    date_of_birth DATE,
    class      VARCHAR(10),
    city       VARCHAR(50)
);
Enter fullscreen mode Exit fullscreen mode

A few design decisions worth noting here:

  • student_id INT PRIMARY KEY - every row needs a unique identifier. Primary keys cannot be NULL and must be unique across the entire table.
  • NOT NULL on first_name and last_name - a student record without a name is useless, so the database enforces this at the structural level.
  • gender VARCHAR(1) - a single character stores 'M' or 'F' efficiently.
  • Phone numbers (when they appeared later) are stored as VARCHAR, not a number type. This matters because phone numbers can start with a zero and may include symbols like +. Storing them as integers would break both of those cases.
CREATE TABLE subjects (
    subject_id   INT PRIMARY KEY,
    subject_name VARCHAR(100) NOT NULL UNIQUE,
    department   VARCHAR(50),
    teacher_name VARCHAR(100),
    credits      INT
);

CREATE TABLE exam_results (
    result_id  INT PRIMARY KEY,
    student_id INT NOT NULL,
    subject_id INT NOT NULL,
    marks      INT NOT NULL,
    exam_date  DATE,
    grade      VARCHAR(2)
);
Enter fullscreen mode Exit fullscreen mode

UNIQUE on subject_name means the database won't allow two subjects with the same name - good data hygiene.

Modifying Tables with ALTER

After the initial build, the school needed to add a phone number column, then rename a column, and then remove the phone number column entirely. This is a normal project requirement change.

-- Add a column
ALTER TABLE students ADD COLUMN phone_number VARCHAR(20);

-- Rename a column
ALTER TABLE subjects RENAME COLUMN credits TO credit_hours;

-- Remove a column
ALTER TABLE students DROP COLUMN phone_number;
Enter fullscreen mode Exit fullscreen mode

ALTER TABLE is the tool for any structural change to an existing table. The key thing to remember: dropping a column is permanent. The data in it is gone.

Part 2 - Filling the Database (DML in Action)

INSERT - Adding Data

Once the structure exists, it's time to populate it. Here's the student data going in:

INSERT INTO students (student_id, first_name, last_name, gender, date_of_birth, class, city)
VALUES
    (1,  'Amina',   'Wanjiku', 'F', '2008-03-12', 'Form 3', 'Nairobi'),
    (2,  'Brian',   'Ochieng', 'M', '2008-07-25', 'Form 4', 'Mombasa'),
    (3,  'Cynthia', 'Mutua',   'F', '2008-11-25', 'Form 3', 'Kisumu'),
    (4,  'David',   'Kamau',   'M', '2008-02-18', 'Form 4', 'Nairobi'),
    (5,  'Esther',  'Akinyi',  'F', '2008-06-30', 'Form 2', 'Nakuru'),
    (6,  'Felix',   'Otieno',  'M', '2008-09-14', 'Form 2', 'Eldoret'),
    (7,  'Grace',   'Mwangi',  'F', '2008-01-22', 'Form 3', 'Nairobi'),
    (8,  'Hassan',  'Adbi',    'M', '2008-04-09', 'Form 4', 'Mombasa'),
    (9,  'Ivy',     'Chebet',  'F', '2008-12-10', 'Form 2', 'Nakuru'),
    (10, 'James',   'Kariuki', 'M', '2008-08-17', 'Form 3', 'Nairobi');
Enter fullscreen mode Exit fullscreen mode

One INSERT statement can add multiple rows at once by separating each row's values with a comma. This is much cleaner than writing ten separate statements.

UPDATE - Correcting Records

Two things needed fixing after the initial data entry: a student moved cities, and an exam mark was recorded wrong.

-- Esther moved from Nakuru to Nairobi
UPDATE students
SET city = 'Nairobi'
WHERE student_id = 5;

-- Fix an incorrectly entered mark
UPDATE exam_results
SET marks = 59
WHERE result_id = 5;
Enter fullscreen mode Exit fullscreen mode

The WHERE clause here is critical. Without it, every single row in the table would be updated. Always double-check your WHERE condition before running an UPDATE.

DELETE - Removing a Record

One exam result was cancelled entirely:

DELETE FROM exam_results
WHERE result_id = 9;
Enter fullscreen mode Exit fullscreen mode

Same rule applies - WHERE is your safety net. DELETE FROM exam_results with no condition would wipe the entire table.

Part 3 - Querying the Data

With data in place, the real work begins: asking questions of the database.

Basic Filtering with WHERE

-- All Form 4 students
SELECT * FROM students
WHERE class = 'Form 4';

-- Female students only
SELECT * FROM students
WHERE gender = 'F';

-- Exam results with marks of 70 or above
SELECT * FROM exam_results
WHERE marks >= 70;
Enter fullscreen mode Exit fullscreen mode

WHERE filters which rows come back. Only rows where the condition is TRUE are returned.

Combining Conditions: AND / OR

-- Form 3 students who are specifically in Nairobi
SELECT *
FROM students
WHERE (city = 'Nairobi' AND class = 'Form 3');

-- Students in Form 2 or Form 4
SELECT *
FROM students
WHERE (class = 'Form 2' OR class = 'Form 4');
Enter fullscreen mode Exit fullscreen mode

AND means both conditions must be true. OR means at least one must be true.

One important tip: when mixing AND and OR in the same query, use parentheses to make your logic explicit. Without them, SQL's precedence rules may not do what you expect.

Part 4 - BETWEEN, IN, and LIKE

These three operators make filtering much easier.

BETWEEN - Range Queries

-- Exam results with marks between 50 and 80 (inclusive)
SELECT *
FROM exam_results
WHERE marks BETWEEN 50 AND 80;

-- Exams that took place over a specific date range
SELECT *
FROM exam_results
WHERE exam_date BETWEEN '2024-03-15' AND '2024-03-18';
Enter fullscreen mode Exit fullscreen mode

BETWEEN is inclusive on both ends - so BETWEEN 50 AND 80 includes both 50 and 80 themselves. It works on numbers and dates equally well.

IN and NOT IN - Membership Checks

-- Students from specific cities
SELECT *
FROM students
WHERE city IN ('Nairobi', 'Mombasa', 'Kisumu');

-- Students NOT in Form 2 or Form 3
SELECT *
FROM students
WHERE class NOT IN ('Form 2', 'Form 3');
Enter fullscreen mode Exit fullscreen mode

IN is cleaner than writing multiple OR conditions. Instead of WHERE city = 'Nairobi' OR city = 'Mombasa' OR city = 'Kisumu', you get one compact line.

LIKE - Pattern Matching

-- Students whose first name starts with 'A' or 'E'
SELECT *
FROM students
WHERE first_name LIKE 'A%'
    OR first_name LIKE 'E%';

-- Subjects containing the word 'Studies'
SELECT subject_name
FROM subjects
WHERE subject_name LIKE '%Studies%';
Enter fullscreen mode Exit fullscreen mode

The % is a wildcard that matches any sequence of characters. So:

  • 'A%' means "starts with A"
  • '%Studies%' means "contains the word Studies anywhere"
  • '%s' means "ends with s"

Part 5 - COUNT

Sometimes you don't need the rows themselves - you just need to know how many there are.

-- How many students are in Form 3?
SELECT COUNT(class) AS form_3_students
FROM students
WHERE class = 'Form 3';

-- How many exam results scored 70 or above?
SELECT COUNT(*) AS above_70_students
FROM exam_results
WHERE marks >= 70;
Enter fullscreen mode Exit fullscreen mode

COUNT(*) counts every row that matches the condition. COUNT(column_name) counts rows where that column is not NULL. For most counting purposes, COUNT(*) is what you want.

Part 6 - CASE WHEN: Adding Logic to Queries

CASE WHEN lets you create new columns based on conditions - like an if/else block inside a SQL query.

Labeling Exam Performance

SELECT
    result_id,
    marks,
    grade,
    CASE
        WHEN marks >= 80 THEN 'Distinction'
        WHEN marks >= 60 THEN 'Merit'
        WHEN marks >= 40 THEN 'Pass'
        ELSE 'Fail'
    END AS performance
FROM exam_results;
Enter fullscreen mode Exit fullscreen mode

SQL evaluates each WHEN condition from top to bottom and stops at the first one that's true. So a student with 85 marks hits marks >= 80 first and gets 'Distinction', even though marks >= 60 is also technically true for them.

Classifying Students by Year Group

SELECT
    student_id,
    first_name,
    class,
    CASE
        WHEN (class = 'Form 3' OR class = 'Form 4')
        THEN 'Senior'
        ELSE 'Junior'
    END AS student_level
FROM students;
Enter fullscreen mode Exit fullscreen mode

Notice the parentheses around class = 'Form 3' OR class = 'Form 4'. Without them inside a CASE WHEN that might have other conditions, the logic can go wrong. It's a small thing that can cause subtle bugs, so wrapping OR conditions in parentheses is a good habit.

I hope you learned a thing or two from this walkthrough. If you like to learn more about Data Science and Analytic, please give me a follow.

This article was written as part of a SQL assignment for the Nairobi Academy project. The full SQL script is available on GitHub. Link on my bio

Top comments (0)