DEV Community

Collins Njeru
Collins Njeru

Posted on

SQL IS NOT BAD - It's Running the World And Most Developers Don't Know Half of It....

"Every database, every app, every swipe of your card — SQL is there."

The language that outlived every trend since 1974

Written for developers, analysts, and data enthusiasts at every level

From zero to advanced — in one article
Last updated: 2026 | PostgreSQL-first, broadly compatible

"In an era obsessed with the newest JavaScript framework or the latest AI buzzword, SQL has done something remarkable , it has outlived every trend thrown at it since 1974."*


NoSQL was supposed to kill it.

Big Data was supposed to replace it.

Pandas, Spark, and cloud warehouses were going to make it irrelevant.

Yet here we are. SQL is baked into everything from PostgreSQL and MySQL to Snowflake, BigQuery, and DuckDB.

SQL is dying. ← False. SQL is thriving.

If you work with data — in any capacity — SQL fluency is not optional. It's oxygen.


Table of Contents

  1. The Anatomy of a SQL Query
  2. JOINs Demystified
  3. Aggregation and GROUP BY
  4. Window Functions — SQL's Hidden Superpower
  5. Subqueries vs CTEs
  6. Indexing — Why Your Queries Are Slow
  7. Transactions and ACID
  8. 10 SQL Mistakes Hurting Your Performance
  9. Advanced SQL Patterns
  10. The Future of SQL

1. The Anatomy of a SQL Query

Before we run, let's walk. A SQL query has a very specific logical execution order that trips up even experienced developers.

The order you write a query:

SELECT name, COUNT(*) AS total_orders
FROM customers
JOIN orders ON customers.id = orders.customer_id
WHERE customers.country = 'Kenya'
GROUP BY name
HAVING COUNT(*) > 5
ORDER BY total_orders DESC
LIMIT 10;
Enter fullscreen mode Exit fullscreen mode

The order the database executes it:

1. FROM       → Identify the tables
2. JOIN       → Combine related data
3. WHERE      → Filter rows (before grouping)
4. GROUP BY   → Collapse rows into groups
5. HAVING     → Filter groups (after aggregation)
6. SELECT     → Choose columns to return
7. ORDER BY   → Sort the result
8. LIMIT      → Cut result to N rows
Enter fullscreen mode Exit fullscreen mode

Key insight: WHERE runs before GROUP BY. This means you cannot filter on an aggregated column using WHERE. That's what HAVING is for.

Nested tip: Get this order wrong, and you'll spend hours debugging a query that looks perfectly fine.

What You Should Know Before Writing Any Query

  • Always understand the shape of your data first
  • Never assume column types — check them
  • Use EXPLAIN before running on large tables
  • Limit your results while exploring

Your SQL Query Checklist

  • [x] Understand the execution order
  • [x] Know which table is the "driving" table
  • [x] Add a LIMIT when exploring
  • [ ] Run EXPLAIN ANALYZE to check the query plan
  • [ ] Check indexes on JOIN and WHERE columns
  • [ ] Peer-review complex queries before production

2. JOINs Demystified

JOINs are the heart of relational databases. If you understand them deeply, you understand SQL.

The Main JOIN Types

JOIN Type What It Returns Use Case
INNER JOIN Rows matching in both tables Most common join
LEFT JOIN All left rows + matched right rows Find unmatched records
RIGHT JOIN All right rows + matched left rows Rare — prefer LEFT JOIN
FULL OUTER JOIN All rows from both tables Complete data comparison
CROSS JOIN Cartesian product of both tables Combinations/permutations
SELF JOIN A table joined to itself Hierarchies, org charts

Visual Mental Model

Think of two overlapping circles:

  • INNER JOINonly the overlap
  • LEFT JOIN → entire left circle + overlap
  • FULL OUTER JOIN → both circles entirely

A Common Real-World Example

-- Find customers who have NEVER placed an order
SELECT c.name, c.email
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
WHERE o.customer_id IS NULL;
Enter fullscreen mode Exit fullscreen mode

The trick of using LEFT JOIN ... WHERE right_table.id IS NULL to find non-matching rows is one of the most elegant patterns in SQL. Learn it. Use it often.

The SELF JOIN — More Useful Than You Think

-- Find employees and their managers (from the same table)
SELECT 
    e.name AS employee,
    m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
Enter fullscreen mode Exit fullscreen mode

3. Aggregation and GROUP BY

Aggregation is where SQL transforms raw rows into business intelligence.

Core Aggregate Functions

SELECT
    department,
    COUNT(*)                    AS total_staff,
    AVG(salary)                 AS avg_salary,
    MAX(salary)                 AS highest_salary,
    MIN(salary)                 AS lowest_salary,
    SUM(salary)                 AS total_payroll,
    COUNT(DISTINCT job_title)   AS unique_roles
FROM employees
GROUP BY department
ORDER BY total_payroll DESC;
Enter fullscreen mode Exit fullscreen mode

The Golden Rule of GROUP BY

Every column in SELECT must either be in GROUP BY or be inside an aggregate function.

Wrong vs Right

--  WRONG — 'name' is neither grouped nor aggregated
SELECT department, name, COUNT(*)
FROM employees
GROUP BY department;

--  CORRECT
SELECT department, COUNT(*) AS headcount
FROM employees
GROUP BY department;
Enter fullscreen mode Exit fullscreen mode

Filtering Groups with HAVING

-- Only departments with more than 10 employees earning above $50,000
SELECT department, COUNT(*) AS headcount
FROM employees
WHERE salary > 50000          -- filters rows BEFORE grouping
GROUP BY department
HAVING COUNT(*) > 10          -- filters groups AFTER aggregation
ORDER BY headcount DESC;
Enter fullscreen mode Exit fullscreen mode

4. Window Functions — SQL's Hidden Superpower

If there is one feature that separates SQL beginners from SQL professionals, it is window functions.

Window functions let you perform calculations across related rows — without collapsing the result like GROUP BY does.

The Syntax

function_name() OVER (
    PARTITION BY column     -- Optional: group rows into windows
    ORDER BY column         -- Optional: define order within the window
    ROWS BETWEEN ...        -- Optional: define the window frame
)
Enter fullscreen mode Exit fullscreen mode

Ranking Functions

SELECT
    name,
    department,
    salary,
    RANK()        OVER (PARTITION BY department ORDER BY salary DESC) AS rank,
    DENSE_RANK()  OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank,
    ROW_NUMBER()  OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees;
Enter fullscreen mode Exit fullscreen mode

Difference Between Ranking Functions

Function Handles Ties Example Output
ROW_NUMBER() Always unique 1, 2, 3, 4
RANK() Skips after tie 1, 1, 3, 4
DENSE_RANK() No skip after tie 1, 1, 2, 3

Running Totals and Moving Averages

-- Running total of sales with a 7-day moving average
SELECT
    sale_date,
    amount,
    SUM(amount) OVER (ORDER BY sale_date) AS running_total,
    AVG(amount) OVER (
        ORDER BY sale_date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS moving_7day_avg
FROM sales;
Enter fullscreen mode Exit fullscreen mode

LAG and LEAD — Comparing to Neighboring Rows

-- Month-over-month revenue growth
SELECT
    month,
    revenue,
    LAG(revenue, 1)  OVER (ORDER BY month) AS prev_month,
    LEAD(revenue, 1) OVER (ORDER BY month) AS next_month,
    revenue - LAG(revenue, 1) OVER (ORDER BY month) AS growth
FROM monthly_revenue;
Enter fullscreen mode Exit fullscreen mode

Window functions are the reason SQL is still king for analytics. No other tool lets you express this kind of computation as cleanly.


5. Subqueries vs CTEs

The Subquery — Powerful but Messy at Scale

A query nested inside another query:

SELECT name, total_spent
FROM (
    SELECT customer_id, SUM(amount) AS total_spent
    FROM orders
    GROUP BY customer_id
) AS customer_totals
JOIN customers ON customers.id = customer_totals.customer_id
WHERE total_spent > (
    SELECT AVG(total_spent) FROM (
        SELECT SUM(amount) AS total_spent
        FROM orders
        GROUP BY customer_id
    ) AS sub
);
Enter fullscreen mode Exit fullscreen mode

This works — but grows unreadable fast.

The CTE (Common Table Expression) — Clean, Named, Reusable

CTEs let you name intermediate results using the WITH keyword:

WITH customer_totals AS (
    SELECT customer_id, SUM(amount) AS total_spent
    FROM orders
    GROUP BY customer_id
),
avg_spend AS (
    SELECT AVG(total_spent) AS avg_val FROM customer_totals
)
SELECT 
    c.name, 
    ct.total_spent
FROM customers c
JOIN customer_totals ct ON c.id = ct.customer_id
CROSS JOIN avg_spend
WHERE ct.total_spent > avg_spend.avg_val
ORDER BY ct.total_spent DESC;
Enter fullscreen mode Exit fullscreen mode

Prefer CTEs over nested subqueries for anything beyond one level. Your teammates — and your future self — will thank you.

Recursive CTEs — Conquering Hierarchical Data

-- Traverse an employee org chart tree
WITH RECURSIVE org_chart AS (
    -- Base case: the CEO (no manager)
    SELECT id, name, manager_id, 0 AS depth
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive case: employees joined to their managers
    SELECT e.id, e.name, e.manager_id, oc.depth + 1
    FROM employees e
    JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT depth, name FROM org_chart ORDER BY depth, name;
Enter fullscreen mode Exit fullscreen mode

When to Use Each

  1. Subquery — one-off, simple, single-level nesting
  2. CTE — multi-step logic, readable complex queries
  3. Recursive CTE — trees, graphs, hierarchies

6. Indexing — Why Your Queries Are Slow

No SQL article is complete without talking about performance. And no performance topic matters more than indexing.

What Is an Index?

An index is a separate data structure (usually a B-tree) that lets the database find rows without scanning every record. Think of it as the index in the back of a textbook.

The Speed Difference Is Staggering

-- Without index — scanning 50 million rows
SELECT * FROM customers WHERE email = 'alice@example.com';
-- ⏱ Time: 8.4 seconds

-- Create the index
CREATE INDEX idx_customers_email ON customers(email);

-- With index — direct lookup
SELECT * FROM customers WHERE email = 'alice@example.com';
-- ⏱ Time: 0.003 seconds
Enter fullscreen mode Exit fullscreen mode

When to Create an Index

  • Columns used frequently in WHERE clauses
  • Columns used in JOIN conditions
  • Columns used in ORDER BY or GROUP BY on large tables
  • Foreign key columns

When Indexes Hurt

  • Columns with very low cardinality (e.g., is_active = TRUE/FALSE only)
  • Tables written to far more often than they're read
  • Too many indexes slow down INSERT, UPDATE, and DELETE

EXPLAIN — Your Best Debugging Friend

EXPLAIN ANALYZE
SELECT c.name, COUNT(o.id) AS order_count
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'Kenya'
GROUP BY c.name;
Enter fullscreen mode Exit fullscreen mode

Use EXPLAIN ANALYZE to see whether the database does a full table scan (Seq Scan) or uses an index (Index Scan). Never optimize blindly.


7. Transactions and ACID

Every time you move money, book a seat, or submit an order — you're trusting transactions to make sure nothing breaks mid-operation.

The ACID Properties

Property Meaning Real-World Guarantee
Atomicity All-or-nothing No partial transfers
Consistency Valid state → valid state Rules always enforced
Isolation Concurrent transactions don't interfere No dirty reads
Durability Committed data survives crashes Your data is safe

A Bank Transfer — The Classic Example

BEGIN TRANSACTION;

-- Step 1: Deduct from sender
UPDATE accounts SET balance = balance - 1000 WHERE user_id = 'alice';

-- Step 2: Add to receiver
UPDATE accounts SET balance = balance + 1000 WHERE user_id = 'bob';

-- If both succeed:
COMMIT;

-- If anything fails:
-- ROLLBACK;
Enter fullscreen mode Exit fullscreen mode

Without transactions: a server crash between Step 1 and Step 2 means Alice loses money that Bob never receives.

With transactions: either both steps happen, or neither does.


8. 10 SQL Mistakes Hurting Your Performance

SELECT * — Never in Production

Always name the columns you need. SELECT * fetches unnecessary data, breaks when schemas change, and prevents index-only scans.

The Top 10 Offenders

  1. SELECT * in production code — fetch only what you need
  2. Not filtering before joiningWHERE early, not after millions of rows are joined
  3. Functions on indexed columns in WHERE
   --  Kills the index
   WHERE YEAR(created_at) = 2024

   --  Preserves the index
   WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'
Enter fullscreen mode Exit fullscreen mode
  1. HAVING for non-aggregated filters — use WHERE before grouping
  2. OR on indexed columns — can prevent index use; try UNION ALL
  3. NOT IN with NULLs — returns no rows if subquery has a NULL; use NOT EXISTS
  4. Missing indexes on foreign keys — always index FK columns used in JOINs
  5. N+1 queries — never query inside a loop; use JOINs or batch queries
  6. Implicit type conversions — comparing VARCHAR to integers forces a cast on every row
  7. Forgetting LIMIT while exploring — 3 rows in dev can be 30 million in prod

9. Advanced SQL Patterns

UPSERT — Insert or Update in One Statement

-- PostgreSQL syntax
INSERT INTO users (id, email, login_count)
VALUES (42, 'alice@example.com', 1)
ON CONFLICT (id) DO UPDATE
SET login_count = users.login_count + 1,
    last_seen   = NOW();
Enter fullscreen mode Exit fullscreen mode

Pivot with CASE WHEN

-- Turn rows into columns
SELECT
    product,
    SUM(CASE WHEN quarter = 'Q1' THEN revenue ELSE 0 END) AS Q1,
    SUM(CASE WHEN quarter = 'Q2' THEN revenue ELSE 0 END) AS Q2,
    SUM(CASE WHEN quarter = 'Q3' THEN revenue ELSE 0 END) AS Q3,
    SUM(CASE WHEN quarter = 'Q4' THEN revenue ELSE 0 END) AS Q4
FROM sales
GROUP BY product;
Enter fullscreen mode Exit fullscreen mode

Find Duplicates

SELECT email, COUNT(*) AS occurrences
FROM users
GROUP BY email
HAVING COUNT(*) > 1
ORDER BY occurrences DESC;
Enter fullscreen mode Exit fullscreen mode

Gap Detection — Finding Missing IDs

SELECT id + 1 AS gap_start
FROM orders o
WHERE NOT EXISTS (
    SELECT 1 FROM orders WHERE id = o.id + 1
)
AND id < (SELECT MAX(id) FROM orders);
Enter fullscreen mode Exit fullscreen mode

Percentile Calculation

SELECT
    department,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) AS median_salary,
    PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY salary) AS p90_salary
FROM employees
GROUP BY department;
Enter fullscreen mode Exit fullscreen mode

10. The Future of SQL

Far from fading, SQL is evolving aggressively:

  • DuckDBin-process analytics SQL that runs on your laptop with serious performance
  • Snowflake / BigQuery / Redshift — cloud data warehouses where SQL scales to petabytes
  • dbt (data build tool) — SQL as software engineering, with version control and testing
  • SQL on JSON — modern databases now query deeply nested JSON natively
  • AI-assisted SQL — LLMs generate SQL, but knowing it deeply makes you 10x better at validating that output

SQL will be the last programming language to die.

And it won't.


Your SQL Learning Roadmap

Beginner

  • [ ] SELECT, FROM, WHERE basics
  • [ ] INNER JOIN and LEFT JOIN
  • [ ] GROUP BY and aggregate functions
  • [ ] ORDER BY and LIMIT

Intermediate

  • [ ] All JOIN types
  • [ ] Subqueries and CTEs
  • [ ] HAVING vs WHERE
  • [ ] NULL handling

Advanced

  • [ ] Window functions (RANK, LAG, LEAD, running totals)
  • [ ] Recursive CTEs
  • [ ] Indexing strategy and EXPLAIN
  • [ ] Transactions and isolation levels
  • [ ] Query optimization techniques

Conclusion

SQL is not just a tool. It's a way of thinking about data — relationally, declaratively, and powerfully.

The developers and analysts who master SQL don't just write better queries. They:

  • Think more clearly about data problems
  • Debug faster
  • Communicate better across engineering, product, and business teams

The language is 50+ years old and still deeply relevant. That is not an accident — it's proof.

Start writing better SQL today. The data is waiting.


Top comments (0)