DEV Community

SQL Database Index Basics

Lawrence Cooke on January 04, 2024

If you are just starting out on your SQL database journey, you may be wondering what indexes are and why we care so much about them. While the su...
Collapse
 
joolsmcfly profile image
Julien Dephix

Nice post, Lawrence!

Adding too many indices can impact performance and disk space too.

I do have one question about this query and why its matching rows column in EXPLAIN is 10:

SELECT * FROM employees 
WHERE first_name = 'Mary'
ORDER BY emp_no
LIMIT 10
Enter fullscreen mode Exit fullscreen mode

rows is an estimated number of rows needed to be scanned so it could be off by a large margin.
Nothing tells the DB engine that employees named Mary are in the first 10 rows when sorted by emp_no. You could have 10000 John's or Steve's or Sarah's in which case the DB engine will have to scan them all.

Any insights?

Collapse
 
artxe2 profile image
Yeom suyun

In the actual execution plan, the "select type" is set to SIMPLE.
In this case, it scans until there are 10 rows passing through the WHERE clause, sorted by the primary key.
However, since the data in the database is already sorted by the primary key, the ORDER BY clause is unnecessary.

Collapse
 
joolsmcfly profile image
Julien Dephix • Edited

Hmmm, your comment is valid if emp_no is indeed the primary key. Unless I am mistaken this post doesn't tell us about the table structure. So emp_no could just be a unique index.

Anyhow, I'd like to know where that rows=10 comes from. I understand it scans until it finds 10 rows that match the condition but why does it think it needs to scan only 10 rows? It could need to scan 900 rows cos emp_no is unrelated to first_name.

Thread Thread
 
mrpercival profile image
Lawrence Cooke • Edited

The 10 would be the minimum possible rows it can look at to get the results you are after. In the case of the LIMIT 10,100 its shown as 110 as it knows it needs to find at least 110 rows before it will find enough rows to show you. In reality the number of rows will be larger than that, its unlikely that the first 10 rows are going to match the criteria, but it is possible and its that "possible" that EXPLAIN is returning.

Collapse
 
artxe2 profile image
Yeom suyun

Using an index doesn't always guarantee an improvement in query performance.
In fact, it can even be slower than a full scan, even when the optimizer expects the index to be faster.

Collapse
 
mrpercival profile image
Lawrence Cooke

yes, you always need to weigh up adding an index. its not always the right answer, its always worth testing both ways and see which way is more performant.