DEV Community

Cover image for SQL Performance - composite indexes
Rafal Hofman
Rafal Hofman

Posted on • Originally published at brightinventions.pl

2

SQL Performance - composite indexes

Recently I have been reading SQL Performance Explained by Markus Winand and I wanted to share with you what I have learned about the composite index.

You are probably familiar with the concept of the index in the Database. In very simple terms, you can imagine that it is similar to telephone directory (B-tree structure in DB) - instead of traversing through the whole book (DB), you are using directory (DB index) to find it faster. For more about indexes, you can read here.

Imagine you have table employees with columns id, department, name.

As you have some SELECT queries accessing those fields, you have created index for department and index for name:

CREATE INDEX department_index ON employees(department);

CREATE INDEX name_index ON employees(name);

It works well for queries like:

SELECT * FROM "employees" WHERE "department" = 'IT';

or

SELECT * FROM "employees" WHERE "name" = 'Rafal';

Now, imagine you have a query that selects both department and surname:

SELECT * FROM "employees" WHERE "department" = 'IT' AND "name" = 'Rafal';

You already have an index on those fields, right? So what is the problem?

DB engine will use only one of the indexes you have created. Going back to the example with telephone directory - you have created two separate directories - those are two separate structures and you either look in one or another. When two fields are selected, one of the fields will be selected to choose index structure and then found in this index to select the results. The second field will be traversed based on the selected results. You can imagine with department index example: department (IT) index will be selected and then the name (Rafal) entry will be searched for in the leaf nodes of index. So if the name column will consit of "Agata, Tomek, Rafal, Zenek, (...)" it will traverse through several entries to find a correct entry (Rafal).

A Solution for that would be to use a composite index. You can create it like:
CREATE INDEX composite_index_name on employees(department, name)

It will then create an index on two columns. With telephone directory example - you will have a directory that consists of both of those informations and directs to the right place. So query:

SELECT * FROM "employees" WHERE "department" = 'IT' AND "name" = 'Rafal';

will hit exactly one entry in the index with one leaf node.

Might be useful if you are using heavy queries used at hot spots in the system and performance there is crucial.

DISCLAIMER: Be aware indexes also do use storage and every INSERT/UPDATE needs to update the index structure as well :)

Image of Datadog

The Future of AI, LLMs, and Observability on Google Cloud

Datadog sat down with Google’s Director of AI to discuss the current and future states of AI, ML, and LLMs on Google Cloud. Discover 7 key insights for technical leaders, covering everything from upskilling teams to observability best practices

Learn More

Top comments (1)

Collapse
 
dansilcox profile image
Dan Silcox

Nice summary, not sure if this applies to every database server but certainly for MySQL be aware that the optimiser uses the composite index fields from left to right e.g. if you have a composite (department,name) like your example, but then in your WHERE, only filter with name (i.e. not department), it will not use the index.

Billboard image

Create up to 10 Postgres Databases on Neon's free plan.

If you're starting a new project, Neon has got your databases covered. No credit cards. No trials. No getting in your way.

Try Neon for Free →

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay