DEV Community

Cover image for Speed up your queries with indexes
Helen Anderson
Helen Anderson

Posted on • Updated on • Originally published at helenanderson.co.nz

Speed up your queries with indexes

The classic analogy for indexes goes ... databases are like libraries.

Tables are like books stored in a library.

Rows are stored on pages of a book.

Flipping through a textbook page by page looking for that one page you need is going to take time. The same way scanning millions of rows is going be time-consuming and tedious. That's where indexes come in.

library


Why do I need an index?
Are there different types of indexes?
Where can I find my index once it's created?
What are good candidates for indexes?
Can't I do this later?
Do I need to do this at all?
How many indexes are too many?
This is no good to me, I use Oracle
Why didn't you explain everything about b-trees?
Is this the answer to all my performance problems?


Why do I need an index?

Indexes speed up performance by either ordering the data on disk so it's quicker to find your result or telling the SQL engine where to go to find your data. If you don't apply an index, the SQL engine will scan through every row one by one.

While this isn't necessarily a bad thing, as your database grows things could start to slow down.


Are there different types of indexes?

There are two main types in SQL Server:

Clustered Index - the contents page

  • Physically arranges the data on disk in a way that makes it faster to get to.
  • You can only apply one per table because the data can only be ordered one way.

indexes1

create clustered index [id_idx] --name of the index
on [dbo].[actor_registration](actor_id)
Enter fullscreen mode Exit fullscreen mode

Non-clustered Index - the index at the back of a book.

  • These create a lookup that points to where the data is.
  • You can create up to 999 but as each index carries overhead and maintenance, you'll probably want to stick to just a few.

indexes2

create nonclustered index [last_name_idx] --name of the index
on [dbo].[actor_registration](last_name)
Enter fullscreen mode Exit fullscreen mode

Where can I find my index once it's created?

You can find the indexes on a table by expanding the table where the index is, then expanding indexes.

indexes3

This is also where you can create an index using the wizard. In the example below the option for clustered index is now greyed out because we have one on this table already.

indexes4


What are good candidates for indexes?

ID columns, names, account numbers and others that have lots of changes. Ideally something unique, sequential that you are using in SELECTs and JOINs frequently.


Can't I do this later?

Sure, no problem. Sometimes it's good to get a handle on how you are querying the data and then add them later.

If you have no indexes on your table the data is stored in the order it comes in. This is called Heaped Storage and is effectively an expensive way of storing a spreadsheet.

Be aware indexes take time to apply if your tables are large by the time you get to this task it may take some time.


Do I need to do this at all?

There's no rule saying you should or shouldn't. The advantages of not adding indexes are that your INSERTs and UPDATEs will be faster and your database will be physically smaller.

If you do notice things getting slow, check out the Execution Plan for any suggestions and more information on where the effort is going to execute your query.

indexes5


How many indexes are too many?

As always, it depends. Too many indexes may slow down performance. Once you've created an index for your Primary Key and Unique Keys it will be up to you, the Execution Plan and perhaps your friendly DBA as to what you do next.


This is no good to me, I use Oracle

Lucky for you there's a great resource for you right here on Dev.to


Why didn't you explain everything about b-trees?

Because there's already some excellent content here on Dev.to that goes further on this topic and I'm looking to provide a beginners overview.


Is this the answer to all my performance problems?

Indexes need maintenance. They may improve performance initially, but need to be reviewed, updated and maintained as your database grows. They aren't a 'set and forget' magic bullet and should be reviewed, and deleted, as your requirements change.

Your best bet at first is to use the Execution Plan to view its suggestions or ask your friendly DBA to lend a hand.


Let me know what other key concepts you think would be useful for complete beginners and junior data analysts.


Read more


This post first appeared on helenanderson.co.nz

Top comments (37)

Collapse
 
nhh profile image
Niklas

Keep in mind, that indexes are only useful if you have a high variance in your data, like username or firstname, lastname. Something that is limited in its variabce, like a enum field - category for example which holds only 5 possible categories isnt the best choice to add an index 😊👍

Collapse
 
andytower_rus profile image
Andrey Alferov

It's not always true. Selective index is good but if you not index on column in foreign key you may can get big problem with locks.

Collapse
 
nhh profile image
Niklas

Can you explain this a little further? :)

Thread Thread
 
andytower_rus profile image
Andrey Alferov

Of course. For example table2 have column colF and foreign key to table1. If on colF not indexing and while table1 (any data) change we have exclusive table lock on table2. It's true for Oracle.
I have this problems many times.

Thread Thread
 
nhh profile image
Niklas

Cool, thank for sharing! I guess its kind if database agnostic, good to know for oracle! ✌️😊

Collapse
 
helenanders26 profile image
Helen Anderson

Great advice! That’s certainly what I do in the real world and more often than not it will be two or three columns not just one

Collapse
 
say_whaaaaaattt profile image
Hypertext

yeah for that you can use look up table

Collapse
 
wget profile image
William Gathoye • Edited

That execution plan sounds like a great feature especially as it is nicely expressed as time. This would allow me to check whether my schema is optimized for some SQL requests, or simply to see if the SQL request structures I use are not too time consuming (that would be fine for my lab assessment, good grades come in handy!).

Do you know if this is available in other DBMS? Even if you wrote Oracle users weren't lucky here, I tried to check for that DBMS (because we are forced to use Oracle at my university) and I wasn't able to find a solution as easy as with SQL Server (expressed as seconds). Any solution for MySQL/MariaDB? If that matter, I'm used to use the CLI, but also GUI tools like DBeaver and SQL Developer.

Collapse
 
kspeakman profile image
Kasey Speakman

Been a while, but I used to get the query execution plan in MySQL by simply putting EXPLAIN in front of it. I believe the same keyword is used in Postgres as well.

Collapse
 
ashleyjsheridan profile image
Ashley Sheridan

Presumably it's the same for MSSQL, but MySQL allows you to create indexes across multiple columns. This is very useful in a lot of situations where the identifying feature of the data is a combination of two or more things, e.g. user ID and a friend ID, where you'll likely have a row for each of the ID links.

Collapse
 
helenanders26 profile image
Helen Anderson

Nice catch! In the real world I’ll create indexes across more than one column, as you’ve described. But in the interests of simplicity kept it to just one in the examples

Collapse
 
dangolant profile image
Daniel Golant

Great post, Helen. Really missing Execution Plan now that I use Postgres. EXPLAIN is... fine, but EP was so nice.

Collapse
 
helenanders26 profile image
Helen Anderson

Good to know... I’m migrating to Postgres so it’ll be a bit of a change

Collapse
 
dangolant profile image
Daniel Golant

It’s more an aspect of your tools, I know data grip can do some impressive stuff, if you can afford it

Collapse
 
bengreenberg profile image
Ben Greenberg

This is great! One of the first things I did at my job was create indexes and that dramatically sped up our SSRS queries. It felt like a good low hanging fruit, which unfortunately gets overlooked.

Collapse
 
helenanders26 profile image
Helen Anderson

That’s great to hear. Did you have a friendly DBA to talk it through with?

Collapse
 
bengreenberg profile image
Ben Greenberg

In this case, I also became a bit of the friendly DBA. It's a small in house team, so everyone does a bit of everything!

Thread Thread
 
helenanders26 profile image
Helen Anderson

I feel like there are a lot of 'accidental DBAs' out there :)

Thread Thread
 
bengreenberg profile image
Ben Greenberg

That's probably true, and we're all just trying to be extra careful not to drop the database!

Thread Thread
 
helenanders26 profile image
Helen Anderson

Having all that power is a beautiful... and terrifying thing :)

Collapse
 
computersmiths profile image
ComputerSmiths

So if I’m (typically) putting measurements in a database with the Unix epoch time, and then pulling out the last 24 hours to plot, do I want to add an index on epoch? Is there an SQL “query” that’ll do that? I’m mostly using mysql and mariadb on Raspberry Pi (Debian-ish) if it matters. Thanks!

Collapse
 
katiekodes profile image
Katie

There's hardly anything to add to this awesome post, except maybe:

"I don't maintain the database; I just query it!" folks, don't be scared off!

Even if you don't have tools that show you the indexes, it's always good to ask your DBA (or some other expert) where the indexes are in tables you plan to join with SQL (or a drag-and-drop tool that generates SQL behind the scenes), so that you can potentially write faster queries.

I say "potentially" because the index won't always make your query run faster (it depends on the data and on your query). But that's the subject of a 3-hour university lecture and exercises doing "explain plan" with pencil and paper ... :-)

A good guess, if you're in a rush, is that the "primary key" for a table is often indexed to make "joining" it to other tables against "foreign keys" faster. In other words, join tables the way they seem intuitively related to each other, those joins may run reasonably quickly.

But ask your DBA or other database expert to be sure.

Collapse
 
helenanders26 profile image
Helen Anderson

Thanks Katie :)

Great advice to check in with your DBA before making those kinds of decisions. There's often a lot more going on behind the scenes and other indexes may have been applied for different reasons. An index may make things go faster, but there could be another way to solve the problem.

Collapse
 
vdedodev profile image
Vincent Dedo

What is the program you're using? I've never used anything that resembles those screenshots.

Collapse
 
helenanders26 profile image
Helen Anderson

Just good old SQL Server standard

Collapse
 
andy_preston profile image
Andy Preston

To be precise, the actual app in the screenshots looks like SQL Server Management Studio.

Collapse
 
mraubreycodes profile image
Aubrey Fletcher

Thank you so much for this insight into Indexes. It's well written and I am going to be using this as a resource when I need to come back to it!

Collapse
 
helenanders26 profile image
Helen Anderson

Thanks so much Aubrey :)

Collapse
 
vjnvisakh profile image
Visakh Vijayan

Cool

Collapse
 
yaser profile image
Yaser Al-Najjar

I'm definitely coming back to read this... I've read many articles about indexing and they just seem too technical forgetting the main point :\

Thankx Helen!

Collapse
 
helenanders26 profile image
Helen Anderson

You’re most welcome :)

Collapse
 
johand profile image
Johan

Another tip is make sargable queries to ensure that the DBMS engine will use indexes

Collapse
 
kingleo10 profile image
Niroj Dahal

Loved the way you used an example of library . That paragraph clarified overall purpose of article.

Collapse
 
helenanders26 profile image
Helen Anderson

Thanks! Glad you found it useful

Collapse
 
scottishross profile image
Ross Henderson

I use Oracle and was reading this just for reference, and then at the bottom, there's an Oracle resource. Great article Helen!

Collapse
 
helenanders26 profile image
Helen Anderson

Thanks Ross! There’s so much good stuff here. I’m glad I can build on the posts that have been written already.

Collapse
 
geshan profile image
Geshan Manandhar

Good post but indexes make the writes slow too :)