Sql Server Multi Column Index Improvements

akashkava profile image Akash Kava ・2 min read

Multi Column Indexes

Multi column indexes are great way to improve queries that reduces join costs, but ideally more than 3 columns performs poor as it leads to more Data IO cost because each level increases size of row exponentially.

Consider an Index with (A,B,C) columns, if you see the statistics, you will notice that there are three different rows types, A , A,B and A,B,C and size of Row also increases drastically.

For very large table, more than three columns is really slow to update and search, as most queries end up seeking different pages.

Multi column with Included columns

If SQL Server suggests you to create index of more than 3 columns, you can put other columns in included part, included columns do not increase size as they are stored as it is, it is not stored as indexed columns. If you look at query plan, filter on included column will be performed in one step without performing join on clustered index.

Sql server will still recommend creating index on all columns, but you will find that average cost of updating index and querying index turns out to be better with more included and less indexed columns.

How many columns?

3 Columns is not ideal, You must run your tests against high CPU/Data cost queries and find out the best optimum number of columns. Some queries must perform best with just 2 and some may need 4.

Which 3 columns?

Limited value range such as single bit (2 values), nullable bit (3 values), enum columns such as limited set of (open/closed.. any strings) can be ignored and can be put in included columns, reason being, for any combination of indexed columns, there will only be very few rows to fetch and test. Only extremely large range of items should be put in indexed columns. Rest can be easily moved to indexed columns. Again, test it against your queries to find out the best combination. You must understand the kind of values stored in columns to find best combination as well. For example, if SQL Server suggest building index of A,B,C and C is a boolean type (max 2 possibilities), you can safely ignore C if combination of A,B leads to only 1 - 20 rows. The reason it is safe as sql server will not need to span additional pages as all rows will be on one single page to run filter.

Index Storage

Smaller the indexes, easier it is to update and indexes use more storage than actual table. Sql server's bacpac format does not backup indexed, so upon testing I found out that for 100GB of database size, bacpac is only of size 3GB.

Also index fragmentation may add up to total size of database, these days with nearly unlimited storage, we don't care about size, but the size impacts many other operations. As CPU has limited RAM, and SSD storage is also limited in size, reducing index size will greatly improve overall speed of database.

It is necessary to rebuild indexes over 30% fragmentation.


Editor guide