Thursday, 30 April 2015

Tracking Query Plan Changes

I was reading a blog post from Brent Ozar (SQL Server Query Store), which detailed Microsoft's announced future feature in SQL Server that aims to store a history of cached execution plans. The main purpose, it seems, of such a feature is to aid in performance troubleshooting and tuning. If something has changed recently and a query or store procedure begins behaving badly, there will be another tool in the DBA toolbelt to aid in finding what happened. Beyond that I'll let others explain it further.

I was inspired by a comment under the post that suggested one could set up such a store of his/her own. Aha, I thought! Why don't I try that. Combine that with the fact that April is quickly coming to a close and I have another blog post to write for the #SQLNewBlogger challenge... and voila!

Now, to caveat this: It is a work in progress and I welcome any and all input into how this could be developed - including any query improvements.

Now, you might ask, why would I want to do this if Microsoft has plans to release it. Well, for a few reasons: 1. They haven't committed to a release date, I think; 2. Chances are that I and many others won't get our hands on that new release for a long while; 3. I thought it would be a fun challenge.

The Solution

There are a few components involved in this mini-solution:
  1. A table to store the query plan data
  2. A query and scheduled job to collect the data
  3. A query to report on the changes
  4. A cleanup script to keep the query store table manageable

Step 1 - The Table

I've created the table below to store what I felt might be the relevant data for future execution plan investigation. I hope it is self explanatory.

Step 2 - Data Collection

The queries below do two things: 1. Collect the execution plans and upsert them to the above defined table; 2. Update the table with the object name (where available) of the object collected.

A few notes as to the logic of my data collection:
  • In order to be able to more easily self-join the queries, without having to compare long text strings, there are two checksum columns: one to represent the query string; the other to represent the execution plan xml. Here I ran into a curious issue in that often, even if the execution plan logic is the same, the checksum is calculated differently. This is down to the fact that compilation cpu and elapsed times may differ. Therefore, I have found it hard to accurately differentiate between a recompiled plan vs a truly different plan. 
  • Since the execution plans are held centrally for the whole server and the object names are referenced at the scope of the database, I needed to do this in multiple steps. The dynamic update query was modified from examples from Naomi Nosonovsky.
  • I have filtered my results to only collect user database plans and non-system objects. This is easily customised, so you can collect as many or as few query plans as you like. For instance, you may want to only track certain stored procedures or a subset of databases.

Step 3 - Report Query

This query will compare the rows within the QueryPlans table and display the following*:
  • Basic data
    • db id
    • Object ID if applicable
    • Object name if applicable
    • Query text
  • The three most recent:
    • Execution plans
    • Execution plan capture date (when the data collector query was run)
    • Plan_handles
  • Details of a changed object - in the case that a stored proc or function is modified
    • Date of change - modified date from sys.objects
    • Changed query text
    • Execution plan of changed object
*I am not 100% confident that all the data is displayed in the correct sequence. However, I feel that there is enough information to aid in highlighting any potential changes that may have occurred.

Step 4 - Cleanup Script

Since the reporting query above is written to display the most recent 3 execution plans for a given query, the cleanup script will delete the oldest record for any query that has more than three rows.

Conclusion

I hope that this has been an interesting post. If nothing else, writing it it has forced me to look deeper into how execution plans are held in SQL Server.

Please feel free to give me any feedback and tips for improvements. As I mentioned at the beginning, this is a first draft and a work in progress.


No comments: