Tuesday, January 13, 2015

I have Non- Clustered Index on My Table then Why My Query is Running Slow??

Yesterday, I was working on a database that was having some performance issue. I asked to one of team member that what type of problem they are facing in their database. He told me that some time query takes a lot of time to give result. Although we have already created indexes but still it takes much time, looks like there is something else which is degrading the performance of these queries. I asked him to expand a table and then index related to same in SSMS. He expanded a table and said see We have non-clustered index for this table on ID field. But when we try to run any query related to this table, it takes so much time. He ran a query and it was taking time. 

I told him that it’s not what you are thinking about Non-clustered Index. I explained him about Non-clustered index using execution plans that how non-clustered index work. And that’s where I got a topic for my blog post:

Although there are so many differences between clustered index and Non-Clustered Index but the point on which I’m going to focus is:

For a clustered index, the leaf page of the index is the same as the data page of the table. Therefore, when reading the values of the clustered index key columns, the database engine can also read the values of other columns without any navigation from the index row.  
But in case of Non-clustered index, it has only information about the column on which it is created. So, if a query refers a column which is not part of non-clustered index then it has to navigate from a non-clustered index row to the corresponding data row in the clustered index or the heap. This process is known as Bookmark Lookup or RID lookups.
So this is all about lookups which makes query slower in case of non-clustered index. If we have smaller number of rows in result then it will not having much impact but non-clustered index becomes ineffective as the number of rows retrieved increases.

Let’s create the environment then we will examine the same:
Use master
Go
Create database NCI
GO
use NCI
GO
CREATE TABLE TEST1 (ID INT IDENTITY ,NAME CHAR(40), AGE INT)
GO
INSERT INTO TEST1 VALUES ('VIMAL',34)
GO 199000

CREATE NONCLUSTERED INDEX ID ON TEST1 (ID)

Now let’s execute the below query then we will see the number of logical read and execution plan for same: Include the actual execution plan by pressing ctrl+M

SET STATISTICS IO ON -- Will display the number of logical and physical reads
SELECT * FROM TEST1 WHERE ID=5

 Number of logical reads: 6

Execution Plan:


You can see the output columns of these lookups. Just put your mouse cursor over this lookup as below:


So it is clear from the screenshot that the output column from this lookup is Name and Age (From total of 3 columns). As we have NON-CLUSTERED INDEX on column ID so it has the information only about ID. For other column it has to navigate to heap/cluster. So the 50% of total cost is used in just getting the other columns except ID. i.e. Name and Age. This is just an example as there is 50-50 % cost. But in production you can see the situation where all the cost is taken by these lookups.

Let’s create a cluster index on this table then we will see that this lookup gets changed or not.
CREATE  CLUSTERED INDEX Index_Age ON TEST1 (Age)

Now let’s execute the same query again:
SET STATISTICS IO ON -- Will display the number of logical and physical reads
SELECT * FROM TEST1 WHERE ID=5

Let’s see the execution plan:


From above screen shot we can see that RID lookup has been changed to Key Lookup. So, it is clear now that if we do not have any cluster index on table then it will do RID lookup.
Now run the below query. Please note that now we are only selecting the ID column

SET STATISTICS IO ON -- Will display the number of logical and physical reads
SELECT ID FROM TEST1 WHERE ID=5

Number of Logical reads: 3

Execution Plan:


Index seek with no look-up operator. Here we need only ID and this NON-CLUSTERED INDEX has information about this. So this does not required any look-up.

Resolving Lookups: 
There are different ways in which you can resolve these lookups which are as follows:

Using a Clustered Index:  You can read about this on MSDN. But just want to mention here that there may be situation where we have already clustered index and we cannot change as it’s very critical.
Using a Covering Index 
Using a Covering Index Using Include
Using an Index Join

You can go through above methods to resolve lookups. I will post about same in future post. For now in our example we will use INCLUDE columns in our NON-CLUSTERED INDEX and then will see the plan for same:

DROP INDEX [Index_id] ON [dbo].[TEST1] WITH ( ONLINE = OFF )
GO
USE [NCI]
GO
CREATE NONCLUSTERED INDEX [Index_id] ON [dbo].[TEST1]
(
      [ID] ASC
)
INCLUDE ( [NAME],
[AGE])

GO
Now lets execute the first query and see number of logical read and the query plan :

SET STATISTICS IO ON -- Will display the number of logical and physical reads
SELECT * FROM TEST1 WHERE ID=5


Number of logical read is reduced to 3 from 6. It was a test environment, in production these number will be so different.

Execution Plan:

 So, this was the answer to my friend who was asking "why my query is running slow although I’m having Non-Clustered index on that table". This all was because of lookups.  The lookup step associated with a non-clustered index can make data retrieval through a non-clustered index very costly. The SQL Server optimizer takes this into account when generating an execution plan, and if it finds the overhead cost of using a non-clustered index to be very high, it discards the index and performs a table scan (or a clustered index scan if the table is stored as a clustered index).


Therefore, to improve the effectiveness of a non-clustered index, it makes sense to analyze the cause of a lookup and consider whether you can avoid it completely by adding fields to the index key or to the INCLUDE column (or index join) and creating a covering index.

                             Thanks For Reading This Blog Post!!!

4 comments: