My Humble SQL Tips: Querying the Execution Plan for Subtree Cost

A few weeks ago I shared a query for probing the execution plan for the Estimated Subtree Cost value.

That query was rather simple but it got the job done. In that post I also explained how, although there are several ways of finding missing index suggestions, they were not always very helpful. After posting I was thinking that it would be helpful to combine the two, or at least display the results so that it is easy to determine which execution plans include missing index warnings.

I found the following query in a post by Jonathan Kehayias on sqlskills.com. It is a good way of not only finding execution plans with missing index warnings, but displaying the index recommendations within.

WITH XMLNAMESPACES
   (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
 
SELECT query_plan,
       n.value('(@StatementText)[1]', 'VARCHAR(4000)') AS sql_text,
       n.value('(//MissingIndexGroup/@Impact)[1]', 'FLOAT') AS impact,
       DB_ID(REPLACE(REPLACE(n.value('(//MissingIndex/@Database)[1]', 'VARCHAR(128)'),'[',''),']','')) AS database_id,
       OBJECT_ID(n.value('(//MissingIndex/@Database)[1]', 'VARCHAR(128)') + '.' +
           n.value('(//MissingIndex/@Schema)[1]', 'VARCHAR(128)') + '.' +
           n.value('(//MissingIndex/@Table)[1]', 'VARCHAR(128)')) AS OBJECT_ID,
       n.value('(//MissingIndex/@Database)[1]', 'VARCHAR(128)') + '.' +
           n.value('(//MissingIndex/@Schema)[1]', 'VARCHAR(128)') + '.' +
           n.value('(//MissingIndex/@Table)[1]', 'VARCHAR(128)')
       AS statement,
       (   SELECT DISTINCT c.value('(@Name)[1]', 'VARCHAR(128)') + ', '
           FROM n.nodes('//ColumnGroup') AS t(cg)
           CROSS APPLY cg.nodes('Column') AS r(c)
           WHERE cg.value('(@Usage)[1]', 'VARCHAR(128)') = 'EQUALITY'
           FOR  XML PATH('')
       ) AS equality_columns,
        (  SELECT DISTINCT c.value('(@Name)[1]', 'VARCHAR(128)') + ', '
           FROM n.nodes('//ColumnGroup') AS t(cg)
           CROSS APPLY cg.nodes('Column') AS r(c)
           WHERE cg.value('(@Usage)[1]', 'VARCHAR(128)') = 'INEQUALITY'
           FOR  XML PATH('')
       ) AS inequality_columns,
       (   SELECT DISTINCT c.value('(@Name)[1]', 'VARCHAR(128)') + ', '
           FROM n.nodes('//ColumnGroup') AS t(cg)
           CROSS APPLY cg.nodes('Column') AS r(c)
           WHERE cg.value('(@Usage)[1]', 'VARCHAR(128)') = 'INCLUDE'
           FOR  XML PATH('')
       ) AS include_columns
INTO #MissingIndexInfo
FROM
(
   SELECT query_plan
   FROM (
           SELECT DISTINCT plan_handle
           FROM sys.dm_exec_query_stats WITH(NOLOCK)
         ) AS qs
       OUTER APPLY sys.dm_exec_query_plan(qs.plan_handle) tp
   WHERE tp.query_plan.exist('//MissingIndex')=1
) AS tab (query_plan)
CROSS APPLY query_plan.nodes('//StmtSimple') AS q(n)
WHERE n.exist('QueryPlan/MissingIndexes') = 1;
 
-- Trim trailing comma from lists
UPDATE #MissingIndexInfo
SET equality_columns = LEFT(equality_columns,LEN(equality_columns)-1),
   inequality_columns = LEFT(inequality_columns,LEN(inequality_columns)-1),
   include_columns = LEFT(include_columns,LEN(include_columns)-1);
 
SELECT *
FROM #MissingIndexInfo;
 
DROP TABLE #MissingIndexInfo;

Obviously, I wanted to combine the above with my desire to find execution plans with an Estimated Subtree Cost of greater than x. So, I came up with the stored procedure below. I decided to make the query a stored procedure. As I was testing it I noticed that this query was actually becoming one of the most resource intensive queries. I obviously wanted to exclude it and the easiest way was to create it as a stored procedure and then delete any rows with this stored procedure in the ObjectName column of the temp table. Regular readers of my blog (are there any?) may remember that my team maintains a DBA database. In my current role we call it SYSDBA. It holds a number of stored procedures, functions and tables that are used for maintenance tasks and other regularly used queries that the DBA team needs. This stored proc will now enter our arsenal.

Let me restate that my XML query skills are virtually non-existent. The query below was cobbled together with a lot of trial and error. I welcome any tips and feedback on making it tidier and more efficient. This query will take a long time to run on large and/or busy databases.

Finally, let me describe how to run the stored procedure and the result set returned.
The stored procedure has only 1 parameter: @Cost which is an integer. This is the lower threshold for the EstimatedSubtreeCost value. The stored procedure will find all execution plans that contain an EstimatedTotalSubtreeCost value of greater than @Cost.

The stored procedure's result set is as follows:

Column	Description
query_plan	The full XML execution plan.
plan_handle	The identity of the execution plan.
cost	The Estimated Subtree Cost of at least one section of the plan.
DBName	The database name from where this query is run. It may not be the database where that the query affects. For example, if the stored proc is stored on DB1 and a statement within it access DB2, DB1 will be displayed.
objectid	If the query is a stored procedure an id will be displayed otherwise it will be null.
param_list	Declare statements for all the parameters listed within the execution plan.
sql_text	The sql statement or stored procedure within the execution plan.
Index_impact	Impact of any missing index. The value will be NULL if no index is suggested.
Index_database_id	The database id where for the missing index. The value will be NULL if no index is suggested.
Index_object_id	The object id (Table) for any missing index. The value will be NULL if no index is suggested.
Index_statement	The SQL statement that for the missing index. The value will be NULL if no index is suggested. The value will most probably be the same as the sql_text column value.
Index_equality_columns	The list of equality columns for any missing index. The value will be NULL if no index is suggested.
Index_inequality_columns	The list of inequality columns of any missing index. The value will be NULL if no index is suggested or if there are no inequality columns.
Index_include_columns	The list of included columns of any missing index. The value will be NULL if no index is suggested or if there are no included columns.

Finally, here is the stored procedure:

USE [SYSDBA];
GO
IF EXISTS
(
    SELECT 1
    FROM [sys].[procedures]
    WHERE [name] = 'FindExpensiveExecutionPlans'
)
    DROP PROC [dbo].[FindExpensiveExecutionPlans];
GO
/***********************************************************************************
Created by: Thomas Mucha
Created: 15/03/2018
Modified by:
Modified:
Description: Returns the XML execution plan for any plan that has a estimated subtree cost
    higher than the set @Cost parameter value. The default is 100
***********************************************************************************/
CREATE PROC [dbo].[FindExpensiveExecutionPlans]
(@Cost INT = 100)
AS
BEGIN
    SET NOCOUNT ON;
    WITH XMLNAMESPACES
    (
        DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'
    )
    SELECT [tab].[query_plan],
           [tab].[plan_handle],
           DB_NAME([dbid]) AS [DBName],
           OBJECT_NAME([objectid]) AS [ObjectName],
           [objectid],
           [n].[value]('(//StmtSimple/@StatementText)[1]', 'VARCHAR(4000)') AS [sql_text],
           [n].[value]('(//MissingIndexGroup/@Index__impact)[1]', 'FLOAT') AS [Index_impact],
           DB_ID(REPLACE(REPLACE([n].[value]('(//MissingIndex/@Database)[1]', 'VARCHAR(128)'), '[', ''), ']', '')) AS [Index_database_id],
           (
               SELECT DISTINCT
                      'DECLARE ' + ISNULL([c].[value]('(@Column)[1]', 'VARCHAR(128)'), '') + ' ['
                      + ISNULL([c].[value]('(@ParameterDataType)[1]', 'VARCHAR(128)'), '') + ']' + ' = '
                      + ISNULL([c].[value]('(@ParameterCompiledValue)[1]', 'VARCHAR(MAX)'), '') + '; '
               FROM [n].nodes('//ParameterList') AS [t]([cg])
                   CROSS APPLY [cg].nodes('ColumnReference') AS [r]([c])
               FOR XML PATH('')
           ) AS [param_list],
           OBJECT_ID([n].[value]('(//MissingIndex/@Database)[1]', 'VARCHAR(128)') + '.'
                     + [n].[value]('(//MissingIndex/@Schema)[1]', 'VARCHAR(128)') + '.'
                     + [n].[value]('(//MissingIndex/@Table)[1]', 'VARCHAR(128)')
                    ) AS [Index_OBJECT_ID],
           [n].[value]('(//MissingIndex/@Database)[1]', 'VARCHAR(128)') + '.'
           + [n].[value]('(//MissingIndex/@Schema)[1]', 'VARCHAR(128)') + '.'
           + [n].[value]('(//MissingIndex/@Table)[1]', 'VARCHAR(128)') AS [Index_statement],
           (
               SELECT DISTINCT
                      [c].[value]('(@Name)[1]', 'VARCHAR(128)') + ', '
               FROM [n].nodes('//ColumnGroup') AS [t]([cg])
                   CROSS APPLY [cg].nodes('Column') AS [r]([c])
               WHERE [cg].[value]('(@Usage)[1]', 'VARCHAR(128)') = 'EQUALITY'
               FOR XML PATH('')
           ) AS [Index_equality_columns],
           (
               SELECT DISTINCT
                      [c].[value]('(@Name)[1]', 'VARCHAR(128)') + ', '
               FROM [n].nodes('//ColumnGroup') AS [t]([cg])
                   CROSS APPLY [cg].nodes('Column') AS [r]([c])
               WHERE [cg].[value]('(@Usage)[1]', 'VARCHAR(128)') = 'INEQUALITY'
               FOR XML PATH('')
           ) AS [Index_inequality_columns],
           (
               SELECT DISTINCT
                      [c].[value]('(@Name)[1]', 'VARCHAR(128)') + ', '
               FROM [n].nodes('//ColumnGroup') AS [t]([cg])
                   CROSS APPLY [cg].nodes('Column') AS [r]([c])
               WHERE [cg].[value]('(@Usage)[1]', 'VARCHAR(128)') = 'INCLUDE'
               FOR XML PATH('')
           ) AS [Index_include_columns],
           [n].[value]('(@EstimatedTotalSubtreeCost)[1]', 'FLOAT') AS [Cost]
    INTO [#MissingIndexInfo]
    FROM
    (
        SELECT [query_plan],
               [plan_handle],
               [dbid],
               [objectid]
        FROM
        (
            SELECT DISTINCT
                   [plan_handle]
            FROM [sys].[dm_exec_query_stats] WITH (NOLOCK)
        ) AS [qs]
            OUTER APPLY [sys].dm_exec_query_plan([qs].[plan_handle]) [tp]
        WHERE [tp].[query_plan].[exist]('//RelOp[
         @EstimatedTotalSubtreeCost > sql:variable("@Cost")
           ]') = 1
    ) AS [tab]([query_plan], [plan_handle], [dbid], [objectid])
        CROSS APPLY [query_plan].nodes('//RelOp') AS [q]([n])
    WHERE [n].[value]('(@EstimatedTotalSubtreeCost)[1]', 'FLOAT') > @Cost;

    CREATE CLUSTERED INDEX [IX_MissingIndexInfo_planhandle]
    ON [#MissingIndexInfo] ([plan_handle]);

    DELETE FROM [#MissingIndexInfo]
    WHERE [ObjectName] = 'FindExpensiveExecutionPlans';

    -- Trim trailing comma from lists
    UPDATE [#MissingIndexInfo]
    SET [Index_equality_columns] = LEFT([Index_equality_columns], LEN([Index_equality_columns]) - 1),
        [Index_inequality_columns] = LEFT([Index_inequality_columns], LEN([Index_inequality_columns]) - 1),
        [Index_include_columns] = LEFT([Index_include_columns], LEN([Index_include_columns]) - 1);

    SELECT CONVERT(XML, [b].[query_plan]) AS [query_plan],
           [a].[plan_handle],
           [cost],
           [DBName],
           [objectid],
           [param_list],
           [sql_text],
           [Index_impact],
           [Index_database_id],
           [Index_OBJECT_ID],
           [Index_statement],
           [Index_equality_columns],
           [Index_inequality_columns],
           [Index_include_columns]
    FROM
    (
        SELECT [DBName],
               [objectid],
               [plan_handle],
               [sql_text],
               [Index_impact],
               [param_list],
               [Index_database_id],
               [Index_OBJECT_ID],
               [Index_statement],
               [Index_equality_columns],
               [Index_inequality_columns],
               [Index_include_columns],
               MAX([Cost]) AS [cost]
        FROM [#MissingIndexInfo]
        GROUP BY [DBName],
                 [objectid],
                 [plan_handle],
                 [sql_text],
                 [Index_impact],
                 [param_list],
                 [Index_database_id],
                 [Index_OBJECT_ID],
                 [Index_statement],
                 [Index_equality_columns],
                 [Index_inequality_columns],
                 [Index_include_columns]
    ) AS [a]
        JOIN
        (
            SELECT DISTINCT
                   CONVERT(VARCHAR(MAX), [query_plan]) AS [query_plan],
                   [plan_handle]
            FROM [#MissingIndexInfo]
        ) AS [b]
            ON [a].[plan_handle] = [b].[plan_handle]
    ORDER BY [cost] DESC;

    DROP TABLE [#MissingIndexInfo];

END;

You can download it from my Google Drive as well.

My Humble SQL Tips

Thursday 15 March 2018

Querying the Execution Plan for Subtree Cost - Take 2

1 comment: