Tuesday 12 March 2013

Remove a Cursor From a Stored Procedure - Part 2

Why is it so slow?!?!?
In my last post, I used a stored procedure from a database I manage to illustrate a real world example of converting a cursor into set based SQL. In this post I will look into a the mechanics behind the scenes and explain at least one of the reasons why it is generally unwise to use cursors when a set based query will provide the same results.

For those that are new to cursors let me briefly explain what happens within a cursor. The cursor is broken up into four main parts: DECLARE, OPEN, FETCH and PROCESS . In the DECLARE section you define the result set that will be looped through to provide values for the PROCESS section. When you OPEN the cursor you are 'running' the select statement defined in the DECLARE section and mapping the variables used in the PROCESS section. The FETCH grabs the next row from the opened result set and sets the variables. And finally, the PROCESS section performs all the various tasks needed within your batch, function, stored procedure or trigger ('process' is not a T-SQL command, I'm just using it as a convenient keyword).

In the databases I manage, virtually all cursors are STATIC cursors. So I will limit my discussion to those. When you open a static cursor the database engine retrieves the result set and places it into memory and/or TempDB. Systems with limited memory resources will struggle and end up using TempDB more, since if the result set being used is larger than available memory it will be dumped into TempDB. This is very similar behaviour to a large table variable.

But the main issue I have with cursors is the pressure placed on all the resources due to its row-by-row nature. SQL Server has been designed to optimise its performance for fetching sets of data from tables. It develops execution plans that it deems the most efficient for grabbing those sets of data with the least amount of resource (CPU, memory and disk) overhead.

What the cursor does is prevent SQL Server from being able to do that. The cursor forces the database engine to execute the query's logic (the PROCESS section) by feeding it (the FETCH section) the source data (the DECLARE and OPEN sections) one row at a a time. This then translates to many more trips to the disk or memory than if SQL Server was allowed to process the whole result set in one go.

This is best described by looking at the results of the STATISTICS IO and TIME options (disk, memory activity and CPU, elapsed time respectively). I've put together a spreadsheet comparing the two metrics side by side: Cursor Replacement comparison.xls
 
There are two things to notice in the comparison. The first, and rather obvious, is that the cursor version of the stored procedure requires more disk access and CPU time than the set based version. You'll also notice that the scan count is higher in the cursor version. And while scan count by itself doesn't define good or bad performance. I think in this example it helps support the idea that a cursor requires a greater amount of resource to perform the same job as a set based query.

Fortunately for the system users, but a bit unfortunately for this example, the database is well tuned for this query as there are no physical reads occurring. In the case that there were physical reads, i.e. the disk was accessed, and the pattern followed the logical reads (memory reads), then the performance difference between the two query versions would be much more significant.

The example I've used is not drastically different between the two versions. But if you imagine an increasing data set, the divide will keep expanding, with the cursor taking longer and requiring more and more disk and CPU resource to complete.

The second, and somewhat surprising to me, is that as the cursor iterates through the logical reads increase for the same operation. If you look at the spreadsheet you'll see that with each iteration the logical reads for a specific table (starting at the spreadsheet row 232) start at 70 and increase by 2 for each iteration. Finally requiring 90 logical reads for the last iteration. Once again, imagine the data set is 10 or even 100 times larger and you can easily see how performance will plummet!

There are lots of articles out there discussing the pros and cons of cursors and, hopefully, mine has added to the overall discussion. Please feel free to leave comments and let me know if I've missed something or if anything I've said is incorrect or unclear.

No comments: