loading...
Cover image for Speed up your queries with indexes

Speed up your queries with indexes

helenanders26 profile image Helen Anderson Updated on ・4 min read

The classic analogy for indexes goes ... databases are like libraries.

Tables are like books stored in a library.

Rows are stored on pages of a book.

Flipping through a textbook page by page looking for that one page you need is going to take time. The same way scanning millions of rows is going be time-consuming and tedious. That's where indexes come in.

library


Why do I need an index?
Are there different types of indexes?
Where can I find my index once it's created?
What are good candidates for indexes?
Can't I do this later?
Do I need to do this at all?
How many indexes are too many?
This is no good to me, I use Oracle
Why didn't you explain everything about b-trees?
Is this the answer to all my performance problems?


Why do I need an index?

Indexes speed up performance by either ordering the data on disk so it's quicker to find your result or telling the SQL engine where to go to find your data. If you don't apply an index, the SQL engine will scan through every row one by one.

While this isn't necessarily a bad thing, as your database grows things could start to slow down.


Are there different types of indexes?

There are two main types in SQL Server:

Clustered Index - the contents page

  • Physically arranges the data on disk in a way that makes it faster to get to.
  • You can only apply one per table because the data can only be ordered one way.

indexes1

create clustered index [id_idx] --name of the index
on [dbo].[actor_registration](actor_id)

Non-clustered Index - the index at the back of a book.

  • These create a lookup that points to where the data is.
  • You can create up to 999 but as each index carries overhead and maintenance, you'll probably want to stick to just a few.

indexes2

create nonclustered index [last_name_idx] --name of the index
on [dbo].[actor_registration](last_name)

Where can I find my index once it's created?

You can find the indexes on a table by expanding the table where the index is, then expanding indexes.

indexes3

This is also where you can create an index using the wizard. In the example below the option for clustered index is now greyed out because we have one on this table already.

indexes4


What are good candidates for indexes?

ID columns, names, account numbers and others that have lots of changes. Ideally something unique, sequential that you are using in SELECTs and JOINs frequently.


Can't I do this later?

Sure, no problem. Sometimes it's good to get a handle on how you are querying the data and then add them later.

If you have no indexes on your table the data is stored in the order it comes in. This is called Heaped Storage and is effectively an expensive way of storing a spreadsheet.

Be aware indexes take time to apply if your tables are large by the time you get to this task it may take some time.


Do I need to do this at all?

There's no rule saying you should or shouldn't. The advantages of not adding indexes are that your INSERTs and UPDATEs will be faster and your database will be physically smaller.

If you do notice things getting slow, check out the Execution Plan for any suggestions and more information on where the effort is going to execute your query.

indexes5


How many indexes are too many?

As always, it depends. Too many indexes may slow down performance. Once you've created an index for your Primary Key and Unique Keys it will be up to you, the Execution Plan and perhaps your friendly DBA as to what you do next.


This is no good to me, I use Oracle

Lucky for you there's a great resource for you right here on Dev.to


Why didn't you explain everything about b-trees?

Because there's already some excellent content here on Dev.to that goes further on this topic and I'm looking to provide a beginners overview.


Is this the answer to all my performance problems?

Indexes need maintenance. They may improve performance initially, but need to be reviewed, updated and maintained as your database grows. They aren't a 'set and forget' magic bullet and should be reviewed, and deleted, as your requirements change.

Your best bet at first is to use the Execution Plan to view its suggestions or ask your friendly DBA to lend a hand.


Let me know what other key concepts you think would be useful for complete beginners and junior data analysts.


Read more


This post first appeared on helenanderson.co.nz

Posted on by:

helenanders26 profile

Helen Anderson

@helenanders26

Making applications go faster at Raygun, AWS Data Hero, and tag moderator on Dev.to.

Discussion

markdown guide
 

Keep in mind, that indexes are only useful if you have a high variance in your data, like username or firstname, lastname. Something that is limited in its variabce, like a enum field - category for example which holds only 5 possible categories isnt the best choice to add an index 😊👍

 

It's not always true. Selective index is good but if you not index on column in foreign key you may can get big problem with locks.

 

Can you explain this a little further? :)

Of course. For example table2 have column colF and foreign key to table1. If on colF not indexing and while table1 (any data) change we have exclusive table lock on table2. It's true for Oracle.
I have this problems many times.

Cool, thank for sharing! I guess its kind if database agnostic, good to know for oracle! ✌️😊

 

Great advice! That’s certainly what I do in the real world and more often than not it will be two or three columns not just one

 

That execution plan sounds like a great feature especially as it is nicely expressed as time. This would allow me to check whether my schema is optimized for some SQL requests, or simply to see if the SQL request structures I use are not too time consuming (that would be fine for my lab assessment, good grades come in handy!).

Do you know if this is available in other DBMS? Even if you wrote Oracle users weren't lucky here, I tried to check for that DBMS (because we are forced to use Oracle at my university) and I wasn't able to find a solution as easy as with SQL Server (expressed as seconds). Any solution for MySQL/MariaDB? If that matter, I'm used to use the CLI, but also GUI tools like DBeaver and SQL Developer.

 

Been a while, but I used to get the query execution plan in MySQL by simply putting EXPLAIN in front of it. I believe the same keyword is used in Postgres as well.

 

Great post, Helen. Really missing Execution Plan now that I use Postgres. EXPLAIN is... fine, but EP was so nice.

 

Good to know... I’m migrating to Postgres so it’ll be a bit of a change

 

It’s more an aspect of your tools, I know data grip can do some impressive stuff, if you can afford it

 

Presumably it's the same for MSSQL, but MySQL allows you to create indexes across multiple columns. This is very useful in a lot of situations where the identifying feature of the data is a combination of two or more things, e.g. user ID and a friend ID, where you'll likely have a row for each of the ID links.

 

Nice catch! In the real world I’ll create indexes across more than one column, as you’ve described. But in the interests of simplicity kept it to just one in the examples

 

There's hardly anything to add to this awesome post, except maybe:

"I don't maintain the database; I just query it!" folks, don't be scared off!

Even if you don't have tools that show you the indexes, it's always good to ask your DBA (or some other expert) where the indexes are in tables you plan to join with SQL (or a drag-and-drop tool that generates SQL behind the scenes), so that you can potentially write faster queries.

I say "potentially" because the index won't always make your query run faster (it depends on the data and on your query). But that's the subject of a 3-hour university lecture and exercises doing "explain plan" with pencil and paper ... :-)

A good guess, if you're in a rush, is that the "primary key" for a table is often indexed to make "joining" it to other tables against "foreign keys" faster. In other words, join tables the way they seem intuitively related to each other, those joins may run reasonably quickly.

But ask your DBA or other database expert to be sure.

 

Thanks Katie :)

Great advice to check in with your DBA before making those kinds of decisions. There's often a lot more going on behind the scenes and other indexes may have been applied for different reasons. An index may make things go faster, but there could be another way to solve the problem.

 

So if I’m (typically) putting measurements in a database with the Unix epoch time, and then pulling out the last 24 hours to plot, do I want to add an index on epoch? Is there an SQL “query” that’ll do that? I’m mostly using mysql and mariadb on Raspberry Pi (Debian-ish) if it matters. Thanks!

 

This is great! One of the first things I did at my job was create indexes and that dramatically sped up our SSRS queries. It felt like a good low hanging fruit, which unfortunately gets overlooked.

 

That’s great to hear. Did you have a friendly DBA to talk it through with?

 

In this case, I also became a bit of the friendly DBA. It's a small in house team, so everyone does a bit of everything!

I feel like there are a lot of 'accidental DBAs' out there :)

That's probably true, and we're all just trying to be extra careful not to drop the database!

Having all that power is a beautiful... and terrifying thing :)

 

What is the program you're using? I've never used anything that resembles those screenshots.

 

Just good old SQL Server standard

 

To be precise, the actual app in the screenshots looks like SQL Server Management Studio.

 

Good post but indexes make the writes slow too :)

 

Thank you so much for this insight into Indexes. It's well written and I am going to be using this as a resource when I need to come back to it!

 
 

Loved the way you used an example of library . That paragraph clarified overall purpose of article.

 

Thanks! Glad you found it useful

 
 

I'm definitely coming back to read this... I've read many articles about indexing and they just seem too technical forgetting the main point :\

Thankx Helen!

 
 

I use Oracle and was reading this just for reference, and then at the bottom, there's an Oracle resource. Great article Helen!

 

Thanks Ross! There’s so much good stuff here. I’m glad I can build on the posts that have been written already.

 

Another tip is make sargable queries to ensure that the DBMS engine will use indexes