DEV Community

Cover image for Clustered vs Non-Clustered Index in SQL(Complete Guide with Examples)
Mohammad Atif
Mohammad Atif

Posted on

Clustered vs Non-Clustered Index in SQL(Complete Guide with Examples)

1. Introduction: Why Do We Need Indexes?

Speed and efficiency are key when it comes to querying a database. An index is a data structure that acts as a guide for your database, giving you quick access to a particular data set to speed up your queries significantly.

To understand indexes, let’s consider a couple of analogies.

  • The Book: You don’t want to go through each page of a book to find a particular chapter or data; instead, you look in the back of the book to find that chapter quickly.
  • The Hotel: You don’t want to knock on every hotel room on every floor to find room 5001; instead, you look at a map in the lobby to quickly locate that room.

2. The Foundation: Data Pages and The Heap Structure

Before we move on to indexes, we should know how SQL stores data logically as well as physically.

Data Pages:

When you create a new table in your SQL database and start adding data to it, SQL doesn’t store data as a spreadsheet; instead, it stores data in data files on your hard disk, divided into fixed-size 8-Kilobyte blocks called “Pages.”

A Data Page Consists Of:

  • Page Header: This contains metadata related to that particular page.
  • Data Rows: This contains your actual data. The amount of data that you can store here is determined by your data size.
  • Offset Array: This is a small internal map that quickly tracks where your data starts on that particular page.

Data Page Structure. Diagram of a SQL Server data page showing page header, multiple data rows, free space area, and offset array at the bottom indicating row positions within an 8KB page.

The Heap Structure (No Index)

If you don't create an index on your table, your data is stored in a Heap Structure.

Write Logic: When new data is added to your table, SQL just throws it onto the next available Data Page in the order it was written. They're not sorted.

Multiple Data Pages (Heap Example). Illustration of multiple unsorted SQL data pages (1:100, 1:101, 1:102, 1:103) containing customer records stored sequentially without logical ordering, representing heap storage.

Read Logic (Full Table Scan): Because your data is unsorted, when you want to read a particular record, say Customer ID 14, SQL doesn't know where it is. It has to perform a Full Table Scan, reading every single Data Page until it finds your record.

Trade-off: Fast write speed (no sorting necessary), terrible read speed.


3. The Clustered Index

A Clustered Index changes everything about how your data is stored on your hard disk.

Physical Sorting: 

A Clustered Index on a particular column (e.g., User_ID) causes SQL to physically sort ALL existing data pages on that column from lowest to highest.

Sequential Data Pages Example. Illustration of four SQL Server data pages (1:100, 1:101, 1:102, 1:103) containing sequentially stored records (IDs 1–20) across multiple pages, demonstrating how rows are distributed across physical data pages in storage.

The B-Tree Logic: 

SQL uses a default data structure called a B-Tree (Balanced Tree) to navigate this sorted data:

  • Root Node (Top): An index page that contains a pointer to intermediate pages.
  • Intermediate Nodes: Index pages that contain a pointer to a particular set of data (e.g., "IDs 1-10 go left, IDs 11-20 go right").
  • Leaf Nodes (Bottom): In a Clustered Index, this is your actual Data Page.

B-Tree Hierarchical Structure. Diagram of a balanced B-Tree used in SQL indexing, showing a root node at the top, multiple intermediate nodes branching below, and leaf nodes at the bottom level representing final data access points.

Clustered Index Navigation Example. Illustration of a clustered index B-Tree where index pages (1:300, 1:200, 1:201) route value ranges (1–10, 11–20) down to sorted data pages (1:100, 1:101, 1:102, 1:103), demonstrating how SQL Server navigates through index pages to retrieve ordered records efficiently.

Characteristics: Limit: You can only have ONE Clustered Index per table because your data can only be sorted in ONE way.

Ideal Candidate: 

Primary Keys are perfect for Primary Keys. They are unique, and they are never updated. (Updating a clustered index column requires the database to move the row to maintain the sort order, which is very resource intensive).


4. The Non-Clustered Index

A Non-Clustered Index makes reads faster but never changes the physical order of the underlying table.

The Pointer Structure

When you create a non-clustered index, SQL leaves the actual Data Pages exactly as they are, whether they are a Heap or sorted by a clustered index. It simply creates an entirely separate B-Tree structure.

The B-Tree Logic

Leaf Nodes: Unlike the clustered index, these leaf nodes are Index Pages, not Data Pages. These leaf nodes are simply a list of the actual index values, along with a Row Identifier (RID) [00:17:42]. This RID is an exact address, like File ID, Page Number, and then an actual Offset, which points to where the actual row lives in the actual table.

Navigation: To retrieve your actual row, SQL uses the non-clustered index B-Tree, navigates down to the leaf node, finds the RID, and then makes one "jump" to the actual Data Page to retrieve the actual row.

Non-Clustered Index Structure. Non-clustered index B-Tree showing index pages containing key values and row identifiers (RIDs) that point to separate data pages.

Simplified Non-Clustered Pointer Flow. Diagram illustrating how index pages reference multiple data pages via pointers, demonstrating logical separation between index structure and physical data storage.

Characteristics

  • Limit: You may have multiple non-clustered indexes on one single table.
  • Ideal Candidate: Columns you often use in your WHERE clause, like "Last_Name," or often use in your JOIN clause.

5. Direct Comparison: Clustered vs. Non-Clustered

Clustered vs Non-Clustered Comparison Table. Side-by-side comparison chart of clustered and non-clustered indexes highlighting definition, number of indexes allowed, read performance, write performance, storage efficiency, and use cases.


6. SQL Syntax and Implementation

Creating indexes in SQL Server uses straightforward syntax. By default, if Primary Keys are defined, SQL automatically creates a clustered index for them and uses B-Tree as default data structure.

Creating a Clustered Index:

CREATE CLUSTERED INDEX idx_customers_id ON sales.customers (customer_id);

Creating a Non-Clustered Index:

CREATE NONCLUSTERED INDEX idx_customers_lastname ON sales.customers (last_name);

(Note: If you do not specify "NONCLUSTERED", SQL Server defaults to Non-Clustered.)


I have another post in which I have talked about how to achieve best unique identifier key for your table's cluster index for the fastest Read/Write. Click Me to redirect to the relevant post.

Top comments (0)