Written by 07:00 Database development, Stored Procedures

Parameter Sniffing Primer

Introduction

Developers are often told to use stored procedures in order to avoid the so-called ad hoc queries which can result in unnecessary bloating of the plan cache. You see, when recurrent SQL code is written inconsistently or when there’s code that generates dynamic SQL on the fly, SQL Server has a tendency to create an execution plan for each individual execution. This may decrease overall performance by:

  1. Demanding a compilation phase for every code execution.

  2. Bloating the Plan Cache with too many plan handles that may not be reused.

Optimize for Ad Hoc Workloads

One way this problem was handled in the past is Optimizing the instance for Ad Hoc Workloads. Doing this can only be helpful if most databases or most significant databases on the instance are predominantly executing Ad Hoc SQL.

Optimize For Ad Hoc Workloads

Fig. 1 Optimize for Ad Hoc Workloads

--Enable OFAW Using T-SQL 

EXEC sys.sp_configure N'show advanced options', N'1'  RECONFIGURE WITH OVERRIDE
GO
EXEC sys.sp_configure N'optimize for ad hoc workloads', N'1'
GO
RECONFIGURE WITH OVERRIDE
GO
EXEC sys.sp_configure N'show advanced options', N'0'  RECONFIGURE WITH OVERRIDE
GO

Essentially, this option tells SQL Server to save a partial version of the plan known as the compiled plan stub. The stub occupies much less space than the entire plan.

As an alternative to this method, some people approach the issue rather brutally and flush the plan cache every now and then. Or, in a more careful way, flush “single-use plans” by using DBCC FREESYSTEMCACHE. Flushing the entire plan cache has its downsides, as you may already know.

Using Stored Procedures and Parameters

By using stored procedures, one can virtually eliminate the problem caused by Ad Hoc SQL. A stored procedure is compiled only once and the same plan is reused for subsequent executions of the same or similar SQL queries. When stored procedures are used to implement business logic, the key difference in the SQL queries that will be eventially executed by SQL Server lies in the parameters passed at execution time. Since the plan is already in place and ready for use, SQL Server will use the same plan no matter what parameter is passed.

Skewed Data

In certain scenarios, the data we are dealing with is not distributed evenly. We can demonstrate this – first, we will need to create a table:

--Create Table with Skewed Data
use Practice2017
go
create table Skewed (
ID int identity (1,1)
, FirstName varchar(50)
, LastName varchar(50)
, CountryCode char(2)
);

insert into Skewed values ('Kwaku','Amoako','GH')
go 10000
insert into Skewed values ('Kenneth','Igiri','NG')
go 10
insert into Skewed values ('Steve','Jones','US')
go 2

create clustered index CIX_ID on Skewed(ID);
create index IX_CountryCode on Skewed (CountryCode);

Our table contains data of club members from different countries. A large number of club members are from Ghana, while two other nations have ten and two members respectively. To keep focused on the agenda and for simplicity’s sake, I only used three countries and the same name for members coming from the same country country. Also, I added a clustered index in the ID column and a non-clustered index in the CountryCode column to demonstrate the effect of different execution plans for different values.

Execution Plan For Two Queries

Fig. 2 Execution plans for two queries

When we query the table for records where CountryCode is NG and GH, we find that SQL Server uses two different execution plans in these cases. This happens because the expected number of rows for CountryCode=NG’ is 10, while that for CountryCode=’GH’ is 10000. SQL Server determines the preferable execution plan based on table statistics. If the expected number of rows is high compared with the total number of rows in the table, SQL Server decides that it is better to simply do a full table scan rather than referring to an index. With a much smaller estimated number of rows, the index becomes useful.

Estimated Number Of Rows 1

Fig. 3 Estimated number of rows for CountryCode=’NG’

Estimated Number Of Rows 2

Fig. 4 Estimated number of Rows for CountryCode=GH

Enter Stored Procedures

We can create a stored procedure to fetch the records we want by using the very same query. The only difference this time is that we pass CountryCode as a parameter (see Listing 3). When doing this, we discover that the execution plan is the same no matter what parameter we pass. The Execution plan that will be used is determined by the execution plan returned at the first time the stored procedure is invoked. For example, If we run the procedure with CountryCode=’GH’ first, it will use a full table scan from that point on. If we then clear the procedure cache and run the procedure with CountryCode=’NG’ first, it will use index-based scans in future.

--Create a Stored Procedure to Fetch the Data
use Practice2017
go
select * from Skewed where CountryCode='NG';
select * from Skewed where CountryCode='GH';

create procedure FetchMembers 
(
@countrycode char(2)
)
as 
begin
select * from Skewed where CountryCode=@countrycode
end;


exec FetchMembers 'NG';
exec FetchMembers 'GH';

DBCC FREEPROCCACHE
exec FetchMembers 'GH';
exec FetchMembers 'NG';

Index-based Execution Plan

Fig. 5 Index seek execution plan when ‘NG’ is used first

Clustered Index Scan

Fig. 6 Clustered index scan execution plan when GHis used first

Execution of the stored procedure is behaving as designed – the required execution plan is used consistently. However, this can be an issue because one execution plan is not suited for all queries if the data is skewed. Using an index to retrieve a collection of rows almost as large as the entire table is not efficient – neither is using a full scan to retrieve only a small number of rows. This is the Parameter Sniffing problem.

Possible Solutions

One common way to manage the Parameter Sniffing problem is to deliberately invoke recompilation whenever the stored procedure is executed. This is much better than flushing the Plan Cache – except if you want to flush the cache of this specific SQL query, which is entirely possible. Take a look at an updated version of the stored procedure. This time, it uses OPTION (RECOMPILE) to manage the problem. Fig.6 shows us that, whenever the new stored procedure is executed, it uses a plan appropriate to the parameter we are passing.

--Create a New Stored Procedure to Fetch the Data
create procedure FetchMembers_Recompile
(
@countrycode char(2)
)
as 
begin
select * from Skewed where CountryCode=@countrycode OPTION (RECOMPILE)
end;

exec FetchMembers_Recompile 'GH';
exec FetchMembers_Recompile 'NG';

Behavior With OPTION(RECOMPILE)

Fig. 7 Behaviour of the stored procedure with OPTION (RECOMPILE)

Conclusion

In this article, we have looked at how consistent execution plans for stored procedures can become a problem when the data we are dealing with is skewed. We have also demonstrated this in practice and learned about a common solution to the problem. I dare say this knowledge is invaluable to developers who use SQL Server. There are a number of other solutions to this problem – Brent Ozar went deeper into the subject and highlighted some more profound details and solutions at SQLDay Poland 2017. I have listed the corresponding link in the reference section.

References

Plan Cache and Optimizing for Adhoc Workloads

Identifying and Fixing Parameter Sniffing Issues

Tags: , , Last modified: September 21, 2021
Close