๐ Table of Contents โ Database Optimizations: Indexing
- Introduction
- What is Indexing?
- How Queries Work Without Index
- How Indexing Works
- Types of Indexes
- When Indexing Helps
- When Indexing Hurts
- Real World Example: E commerce Search
- Best Practices
- Summary
๐ Database Optimizations: Indexing
Databases are the backbone of modern applications, powering everything from e-commerce websites to banking systems. But as the volume of data grows, retrieving records efficiently becomes challenging. One of the most fundamental optimization techniques is Indexing.
This article will walk you through indexing step by step โ starting with naive concepts and gradually moving toward real-world usage with SQL examples.
What is Indexing?
Think of a database as a giant book and each row as a page. If you want to find a word in the book without an index, youโd have to scan page by page โ this is called a full table scan.
But if the book has an index at the back, you can jump directly to the page number.
๐ Similarly, a database index is a data structure that helps the database quickly locate rows matching a query condition.
How Queries Work Without Index
Letโs say we have a table of employees:
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(100),
Department VARCHAR(50),
Salary INT
);
INSERT INTO Employees VALUES
(1, 'Alice', 'HR', 50000),
(2, 'Bob', 'IT', 60000),
(3, 'Charlie', 'Finance', 55000),
(4, 'David', 'IT', 65000),
(5, 'Eva', 'HR', 52000);
Now, if we run:
SELECT * FROM Employees WHERE Department = 'IT';
Without an index, the database will:
- Scan each row in
Employees
. - Compare the
Department
column with"IT"
. - Return matching rows.
If Employees
has 10 million rows, this is slow!
How Indexing Works
By creating an index on the Department
column:
CREATE INDEX idx_department ON Employees(Department);
Now, the database creates a separate structure (often a B-Tree or Hash Map) that maps each department to the row locations.
When we query for "IT"
, the DB jumps directly to the index โ fetches rows โ skips scanning unnecessary rows.
Types of Indexes (With Examples)
Single Column Index
Indexes created on one column.
CREATE INDEX idx_salary ON Employees(Salary);
Used for:
SELECT * FROM Employees WHERE Salary > 60000;
Composite (MultiColumn) Index
Indexes created on multiple columns.
CREATE INDEX idx_dept_salary ON Employees(Department, Salary);
This helps queries like:
SELECT * FROM Employees WHERE Department = 'IT' AND Salary > 60000;
โ ๏ธ Rule: Order matters!
- Index
(Department, Salary)
works for queries filtering byDepartment
or both. - But if you only filter by
Salary
, the index may not be used effectively.
Unique Index
Ensures all values in a column are unique.
CREATE UNIQUE INDEX idx_email ON Employees(Name);
If you try to insert a duplicate value, it fails.
Full Text Index
Used for searching large text fields.
CREATE FULLTEXT INDEX idx_name ON Employees(Name);
Query:
SELECT * FROM Employees WHERE MATCH(Name) AGAINST('Alice');
Clustered vs Non Clustered Index
- Clustered Index: Rearranges the actual table rows to match the index order. (Only one per table, usually the primary key).
- Non-Clustered Index: Creates a separate index structure that points to the row location.
Example in SQL Server:
-- Clustered Index on EmployeeID
CREATE CLUSTERED INDEX idx_empid ON Employees(EmployeeID);
-- Non-Clustered Index on Department
CREATE NONCLUSTERED INDEX idx_dept ON Employees(Department);
When Indexing Helps
โ
Fast searches with WHERE
clauses
โ
Speeding up JOIN
operations
โ
Efficient sorting with ORDER BY
โ
Quick lookups for unique constraints
When Indexing Hurts
โ Takes extra disk space
โ Slows down INSERT
, UPDATE
, DELETE
(because indexes must also be updated)
โ Too many indexes โ query optimizer confusion
Real World Example: E commerce Search
Imagine an Orders table with millions of rows:
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
ProductID INT,
OrderDate DATE,
Status VARCHAR(20)
);
Query:
SELECT * FROM Orders WHERE CustomerID = 101 AND Status = 'Shipped';
๐ Without indexes: full table scan.
๐ With composite index:
CREATE INDEX idx_customer_status ON Orders(CustomerID, Status);
Now, results are instant.
Best Practices
- Index columns frequently used in WHERE, JOIN, ORDER BY, GROUP BY.
- Donโt index small tables (full scan is faster).
- Avoid indexing columns with high update frequency.
- Monitor with
EXPLAIN
(MySQL/PostgreSQL) orEXPLAIN PLAN
(Oracle) to see if indexes are used.
Summary
- Indexing improves query performance by avoiding full table scans.
- Different types of indexes (single-column, composite, clustered, full-text) suit different needs.
- Over-indexing can hurt performance, so balance is key.
๐ Think of indexing like a library catalog โ itโs not free to maintain, but it saves you tons of time when searching.
More Details:
Get all articles related to system design
Hastag: SystemDesignWithZeeshanAli
Git: https://github.com/ZeeshanAli-0704/SystemDesignWithZeeshanAli
Top comments (0)