Posts belonging to Category DMO



Searching for Plans

70999185_58c1ad9d35

A while back I wrote a post on a procedure I created for the DBADiagnostics database that I talk about from time to time.  This procedure allowed users to search the procedure cache to find plans for procedures by database name and object name.  This had worked out pretty good until I noticed something the other day.

The procedure is using the sys.dm_exec_query_stats as the based table for the query to build itself upon.  The trouble with using this DMV is that it doesn’t have all of the plans listed from the procedure cache.  Only those that currently have stats stored for sql_handles.

About the same time I started writing this post, I noticed a post by Adam Machanic (twitter | blog) in which he puts out a warning to those using sys.dm_exec_query_stats.  In that post, he discusses how the ALTER TABLE statement can result in a batch missing this DMV.

Considering all this lends itself to the question, is there a better DMV that can be used to find plans in the procedure cache?

Use sys.dm_exec_cached_plans Instead

Another DMV that has this information in it is the DMV sys.dm_exec_cached_plans.  This DMV is designed to provide a list of all cached plans in the procedure cache.  To illustrate the difference, lets take a look at the following two queries.


SELECT COUNT(DISTINCT plan_handle)
FROM sys.dm_exec_cached_plans
GO 

SELECT COUNT(DISTINCT plan_handle)
FROM sys.dm_exec_query_stats
GO

On my system, the following results are returned:

image

The first value is the number of plan_handles in sys.dm_exec_cached_plans.  The second is the number of plan_handles in sys.dm_exec_query_stats.  There is a substantial difference between the two.  This is something you’d be expecting based on the information at the start of this post.

Query to Find Cached Plans

Now that I’ve outlined the issues with my original post, lets take a look at a script that will query the procedure cache to look for plans stored for stored procedures and other database objects.


SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
GO

DECLARE @DatabaseName sysname
    ,@ObjectName sysname 

SELECT @DatabaseName = 'msdb'
    ,@ObjectName = 'sp_jobhistory_row_limiter';

WITH PlanSearch
AS (
      SELECT qp.dbid
            ,qp.objectid
            ,DB_NAME(qp.dbid) as DatabaseName
            ,OBJECT_NAME(qp.objectid, qp.dbid) as ObjectName
            ,cp.usecounts
            ,cp.plan_handle
      FROM sys.dm_exec_cached_plans cp
            CROSS APPLY sys.dm_exec_text_query_plan(cp.plan_handle, DEFAULT, DEFAULT) qp
    WHERE cp.cacheobjtype = 'Compiled Plan'
      AND (DB_NAME(qp.dbid) = @DatabaseName OR NULLIF(@DatabaseName,'') IS NULL)
    AND (OBJECT_NAME(qp.objectid, qp.dbid) = @ObjectName OR NULLIF(@ObjectName, '') IS NULL)
)
,PlansAndStats
AS (
      SELECT ps.DatabaseName
        ,ps.ObjectName
        ,ps.usecounts -- Use in place of qs.execution_count for whole plan count
        ,CAST(SUM(qs.total_worker_time)/(ps.usecounts*1.) as decimal(12,2)) AS avg_cpu_time
        ,CAST(SUM(qs.total_logical_reads + qs.total_logical_writes)/(ps.usecounts*1.) as decimal(12,2)) AS avg_io
        ,SUM(qs.total_elapsed_time)/(ps.usecounts)/1000 as avg_elapsed_time_ms
        ,ps.plan_handle
    FROM PlanSearch ps
            LEFT OUTER JOIN sys.dm_exec_query_stats qs ON ps.plan_handle = qs.plan_handle
    GROUP BY ps.DatabaseName
        ,ps.ObjectName
        ,ps.usecounts
            ,ps.plan_handle
)
SELECT ps.DatabaseName
    ,ps.ObjectName
      ,ps.usecounts
      ,ps.avg_cpu_time
      ,ps.avg_io
      ,ps.avg_elapsed_time_ms
      ,qp.query_plan
      ,ps.plan_handle
FROM PlansAndStats ps
      CROSS APPLY sys.dm_exec_query_plan(ps.plan_handle) qp 

Executing the script above provides the following output:

image

The output includes the name of the object being sought and a link to the plan from the procedure cache.  Included with this is some basic performance information from sys.dm_exec_query_stats that can be useful in determining performance variations between the plans returned.

Procedure to Find Plans

For those that like to take these scripts and build stored procedures for them.  I’ve also included a script below that includes this information.

Really Search Cache for Execution Plans


hide

Today’s stored procedure for the DBADiagnostics database is a re-hash of a post I did a couple months ago.  The reason for re-hashing the procedure is that when I used it last week I found out there were some serious issues with its results.  Some plans were being duplicated and the performance statistics weren’t accurate because of this.  And why not fix the issues and make a DBADiagnostics post out of it.

In the previous version, I was using the execution count from sys.dm_exec_query_stats to determine how often a plan was being executed.  In the DMV though the SQL statements in the may execute in varying counts depending on conditional logic.

Instead of using the execution count from sys.dm_exec_query_stats the procedure now uses usecount from sys.dm_exec_cached_plans; which represents how often the plan was used.  The execution count represented how often the statements in the plan were executed.

The stored procedure accepts the following parameters:

  • @Database: The database to search for plans within.  This value can be NULL.
  • @ObjectName: The name of the procedure to search for plans for.  This value can be NULL.

The procedure can use either or none of the variables.  Obviously with both variables empty the results will be for all cached plans on the server.

USE [DBADiagnostics]
GO

IF NOT EXISTS(SELECT * FROM sys.schemas WHERE name = 'Utility')
    EXEC('CREATE SCHEMA [Utility] AUTHORIZATION [dbo]')
GO

IF EXISTS(SELECT * FROM sys.procedures WHERE object_id = OBJECT_ID('Utility.CachedPlanSearch'))
    DROP PROCEDURE [Utility].[CachedPlanSearch]
GO

/*============================================================
Procedure:    [Utility].[CachedPlanSearch]
Author:       Jason Strate
Date:         June 1, 2009

Synopsis:
    Searches cache for all occurances of a plan based on an
    procedure and or database name.  Results assumes that average
    execution time for returned results are less than 24 hours.

Exec Utility.CachedPlanSearch
    @DatabaseName = 'msdb'
    ,@ObjectName = 'sp_sqlagent_get_startup_info'

============================================================
Revision History:
Date:         By                Description
------------------------------------------------------------

============================================================*/
CREATE PROCEDURE Utility.CachedPlanSearch
    (
    @DatabaseName sysname = NULL
    ,@ObjectName sysname = NULL
    )
AS

;WITH cteExecInfo
AS (
    SELECT DB_NAME(st.dbid) AS database_name
        ,OBJECT_NAME(st.objectid, st.dbid) AS object_name
        ,cp.usecounts -- Use in place of qs.execution_count for whole plan count
        ,CAST(SUM(qs.total_worker_time)/(cp.usecounts*1.) as decimal(12,2)) AS avg_cpu_time
        ,CAST(SUM(qs.total_logical_reads + qs.total_logical_writes)/(cp.usecounts*1.) as decimal(12,2)) AS avg_io
        ,SUM(qs.total_elapsed_time)/(cp.usecounts)/1000 as avg_elapsed_time_ms
        ,st.text AS sql_text
        ,qs.plan_handle
    FROM sys.dm_exec_query_stats qs
        INNER JOIN sys.dm_exec_cached_plans cp ON qs.plan_handle = cp.plan_handle
        CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
    WHERE (DB_NAME(st.dbid) = @DatabaseName OR NULLIF(@DatabaseName,'') IS NULL)
    AND (OBJECT_NAME(st.objectid, st.dbid) = @ObjectName OR NULLIF(@ObjectName, '') IS NULL)
    GROUP BY st.dbid, st.objectid, cp.usecounts, st.text, qs.plan_handle
)
SELECT cte.database_name
    ,cte.object_name
    ,cte.usecounts
    ,cte.avg_cpu_time
    ,cte.avg_io
    ,CONVERT(varchar, DATEADD(ms, cte.avg_elapsed_time_ms, 0), 114) AS avg_elapsed_time
    ,qp.query_plan
    ,cte.sql_text
FROM cteExecInfo cte
    OUTER APPLY sys.dm_exec_query_plan(cte.plan_handle) qp
ORDER BY cte.usecounts DESC

I’ve often found this procedure useful when people complain about the execution of a procedure. Instead of pulling out performance tools and running test versions of the procedure this will provide the last plan that was used for the procedure.

Querying for Parallelism

While working on some parallelism issues today I started using the query that detects whether there are queries currently running utilizing parallelism.  It’s the same query that can be found in the white paper Troubleshooting Performance Problems in SQL Server 2005.

Which I’ve included below:

SELECT 

    r.session_id,

    r.request_id,

    MAX(ISNULL(exec_context_id, 0)) as number_of_workers,

    r.sql_handle,

    r.statement_start_offset,

    r.statement_end_offset,

    r.plan_handle

FROM 

    sys.dm_exec_requests r

    JOIN sys.dm_os_tasks t ON r.session_id = t.session_id

    JOIN sys.dm_exec_sessions s ON r.session_id = s.session_id

WHERE 

    s.is_user_process = 0x1

GROUP BY 

    r.session_id, r.request_id, 

    r.sql_handle, r.plan_handle, 

    r.statement_start_offset, r.statement_end_offset

HAVING MAX(ISNULL(exec_context_id, 0)) > 0

It sure is to know the queries currently executing using parallelism.  But the query doesn’t provide enough information when it executes to take some immediate action without further research.  The query below provides this additional information.

WITH CurrentParallelism

AS (

    SELECT 

        r.session_id

        ,r.request_id

        ,r.database_id

        ,MAX(ISNULL(exec_context_id, 0)) as number_of_workers

        ,r.sql_handle

        ,r.statement_start_offset

        ,r.statement_end_offset

        ,r.plan_handle

    FROM sys.dm_exec_requests r

        INNER JOIN sys.dm_os_tasks t on r.session_id = t.session_id

        INNER JOIN sys.dm_exec_sessions s on r.session_id = s.session_id

    WHERE s.is_user_process = 0x1

    GROUP BY 

        r.session_id, r.request_id, r.database_id,

        r.sql_handle, r.plan_handle, 

        r.statement_start_offset, r.statement_end_offset

    HAVING MAX(ISNULL(exec_context_id, 0)) > 0

)

SELECT

    cp.session_id

    ,cp.request_id

    ,db_name(cp.database_id) AS database_name

    ,cp.number_of_workers

    ,CASE statement_end_offset WHEN -1 THEN q.text

        ELSE SUBSTRING(q.text, statement_start_offset/2, (statement_end_offset-statement_start_offset)/2) END as sql_statement

    ,p.query_plan

    ,q.text

    ,cp.plan_handle

FROM CurrentParallelism cp

    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) p

    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) as q

The information that I like to have on hand is the exact statement and query plan that is executing so that I know what it is that I am dealing with.  Hopefully others feel the same and find some use in this.

Snapshots of Index Contention

Occasionally, I need to determine where index hotspots are located within databases.  To do this, I use the DMV sys.dm_db_index_operational_stats.  I’ve talked about this DMV before while I was going through the index analysis blogs posts.

This DMV returns locking and blocking information for tables and indexes in the database.  This information is aggregated since the last time the SQL Server service was started or when the metadata cache for the index is reset.  Inactivity of an index or DDL operations are the types of a activities that can cause the metadata cache to be reset.

Sometimes though index contention may not have occurred when you are currently knee deep in performance issues.  And as such, finding the hotspot at the time of the issue is more important that knowing that the contention happened at some point since the statistics were last reset.  So just go out and restart the SQL Server service because that’s makes it easy.

To get a snapshot of the index contention without forcing the metadata cache to reset, it is necessary to baseline the current state of contention and then compare it to a future state after a period of time has past.  In the script below, the query provides a 5 minute snapshot of index contention.

DECLARE @dbid INT;

 

SET @dbid=DB_ID();

 

DECLARE @IndexOperationalStats table

    (

    database_id int

    ,object_id int

    ,index_id int

    ,partition_number int

    ,row_lock_count bigint

    ,row_lock_wait_count bigint

    ,row_lock_wait_in_ms bigint

    )

 

--INSERT INTO @IndexOperationalStats

SELECT database_id

    ,object_id

    ,index_id

    ,partition_number

    ,row_lock_count

    ,row_lock_wait_count

    ,row_lock_wait_in_ms

FROM sys.dm_db_index_operational_stats (@dbid, NULL, NULL, NULL) s

 

WAITFOR DELAY '00:05:00.000'

 

;WITH IndexOperationalStats_Delta

AS (

    SELECT s.database_id

        ,s.object_id

        ,s.index_id

        ,s.partition_number

        ,s.row_lock_count - COALESCE(t.row_lock_count, 0) as row_lock_count

        ,s.row_lock_wait_count - COALESCE(t.row_lock_wait_count, 0) as row_lock_wait_count

        ,s.row_lock_wait_in_ms - COALESCE(t.row_lock_wait_in_ms, 0) as row_lock_wait_in_ms

    FROM sys.dm_db_index_operational_stats (@dbid, NULL, NULL, NULL) s

        LEFT OUTER JOIN @IndexOperationalStats t 

            ON s.database_id = t.database_id

            AND s.object_id = t.object_id 

            AND s.index_id = t.index_id

            AND s.partition_number = t.partition_number

    WHERE OBJECTPROPERTY(s.OBJECT_ID,'IsUserTable') = 1

)

SELECT 

    DB_NAME(database_id) AS DbName

    ,OBJECT_NAME(ios.[object_id]) AS ObjName

    ,i.name AS IndexName

    ,i.index_id

    ,partition_number

    ,row_lock_count

    ,row_lock_wait_count 

    ,CAST (100.0 * row_lock_wait_count / (1 + row_lock_count) AS NUMERIC(15,2)) AS [block %]

    ,row_lock_wait_in_ms

    ,CAST (1.0 * row_lock_wait_in_ms / (1 + row_lock_wait_count) AS NUMERIC(15,2)) AS [Avg Row Lock Waits (ms)]

FROM IndexOperationalStats_Delta ios

    INNER JOIN sys.indexes i ON i.object_id = ios.object_id AND i.index_id = ios.index_id

ORDER BY row_lock_wait_count DESC

Analyze This – Analyze Your Indexes – Part 6

This is the sixth and last for now post in which I’m going through a query that can be used to analyze indexes.  Thus far the following has been added to the query:

  1. Combine existing index statistics and missing indexes into a single output (blog)
  2. Find existing duplicate and overlapping indexes (blog)
  3. Find the relationship between indexes and foreign keys (blog)
  4. Added size and current memory utilization (blog)
  5. Add in index analysis guidelines (blog)

If you’ve used the last few versions of this query you may have noticed that performance against databases with a large number of indexes wasn’t all that great.  Also, the query didn’t account for partitions which I had planned to resolve at a later date.  Oh, look – it’s a later date.

In the first few versions of the index analysis query, new features to the query were added through the use of CTEs.  The trouble with the way the query was put together was that it caused it to choke when performing as the volume of indexes being reviewed increased.  In a few sample executions, the performance has improved by an order of magnitude.  In one case the execution went from almost 3 minutes to 20 seconds.

This version of the index analysis query also now works with SQL Server 2005 RTM.  There was a few functions being used that were added in SQL Server 2005 SP 2.  Those have been removed to allow for compatibility back to SQL Server 2005 RTM.

A new column that flags whether of not a table has a unique index on it was added.  This was added based on a request in a comment a couple posts ago.

Besides fixing the items listed above, there was a need to review indexes to check for blocking.  To do this the DMV sys.dm_db_index_operational_stats should be used.  This DMV provides I/O, locking, latching, and access method activity for each partition of a table or index.  Only blocking statistics have been with the index analysis at this time.

The last thing that could have been added to the query was fragmentation information from ys.dm_db_index_physical_stats.  This was not included because the execution of this DMV is invasive even when using SIMPLE mode.

The new version of the query is as follows:

DECLARE @ObjectID int

    ,@DB_ID int

 

SELECT @ObjectID = OBJECT_ID('')

    ,@DB_ID = db_id()

 

IF OBJECT_ID('tempdb..#IndexBaseLine') IS NOT NULL

    DROP TABLE #IndexBaseLine

 

CREATE TABLE #IndexBaseLine

    (

    row_id int IDENTITY(1,1)

    ,index_action varchar(10)

    ,schema_id int

    ,schema_name sysname

    ,object_id int

    ,table_name sysname

    ,index_id int

    ,index_name nvarchar(128)

    ,is_unique bit DEFAULT(0)

    ,has_unique bit DEFAULT(0)

    ,type_desc nvarchar(67)

    ,partition_number int

    ,reserved_page_count bigint

    ,size_in_mb decimal(12, 2)

    ,buffered_page_count int

    ,buffer_mb decimal(12, 2)

    ,pct_in_buffer decimal(12, 2)

    ,table_buffer_mb decimal(12, 2)

    ,row_count bigint

    ,impact int

    ,existing_ranking bigint

    ,user_total bigint

    ,user_total_pct decimal(6, 2)

    ,estimated_user_total_pct decimal(6, 2)

    ,user_seeks bigint

    ,user_scans bigint

    ,user_lookups bigint

    ,row_lock_count bigint

    ,row_lock_wait_count bigint

    ,row_lock_wait_in_ms bigint

    ,row_block_pct decimal(6, 2)

    ,avg_row_lock_waits_ms bigint

    ,indexed_columns nvarchar(max)

    ,included_columns nvarchar(max)

    ,indexed_columns_compare nvarchar(max)

    ,included_columns_compare nvarchar(max)

    ,duplicate_indexes nvarchar(max)

    ,overlapping_indexes nvarchar(max)

    ,related_foreign_keys nvarchar(max)

    ,related_foreign_keys_xml xml

    )

 

IF OBJECT_ID('tempdb..#ForeignKeys') IS NOT NULL

    DROP TABLE #ForeignKeys

 

CREATE TABLE #ForeignKeys

    (

    foreign_key_name sysname

    ,object_id int

    ,fk_columns nvarchar(max)

    ,fk_columns_compare nvarchar(max)

    )

 

;WITH AllocationUnits

AS (

    SELECT p.object_id

        ,p.index_id

        ,p.partition_number 

        ,au.allocation_unit_id

    FROM sys.allocation_units AS au

        INNER JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND (au.type = 1 OR au.type = 3)

    UNION ALL

    SELECT p.object_id

        ,p.index_id

        ,p.partition_number 

        ,au.allocation_unit_id

    FROM sys.allocation_units AS au

        INNER JOIN sys.partitions AS p ON au.container_id = p.partition_id AND au.type = 2

),MemoryBuffer

AS (

    SELECT au.object_id

        ,au.index_id

        ,au.partition_number

        ,COUNT(*)AS buffered_page_count

        ,CONVERT(decimal(12,2), CAST(COUNT(*) as bigint)*CAST(8 as float)/1024) as buffer_mb

    FROM sys.dm_os_buffer_descriptors AS bd 

        INNER JOIN AllocationUnits au ON bd.allocation_unit_id = au.allocation_unit_id

    WHERE bd.database_id = db_id()

    GROUP BY au.object_id, au.index_id, au.partition_number

)

INSERT INTO #IndexBaseLine

    (schema_id, schema_name, object_id, table_name, index_id, index_name, is_unique, type_desc, partition_number, reserved_page_count, size_in_mb, buffered_page_count, buffer_mb, pct_in_buffer, row_count, existing_ranking, user_total, user_total_pct, user_seeks, user_scans, user_lookups, row_lock_count, row_lock_wait_count, row_lock_wait_in_ms, row_block_pct, avg_row_lock_waits_ms, indexed_columns, included_columns, indexed_columns_compare, included_columns_compare)

SELECT s.schema_id

    ,s.name as schema_name

    ,t.object_id

    ,t.name as table_name

    ,i.index_id

    ,COALESCE(i.name, 'N/A') as index_name

    ,i.is_unique

    ,CASE WHEN i.is_unique = 1 THEN 'UNIQUE ' ELSE '' END + i.type_desc as type_desc

    ,ps.partition_number

    ,ps.reserved_page_count 

    ,CAST(reserved_page_count * CAST(8 as float) / 1024 as decimal(12,2)) as size_in_mb

    ,mb.buffered_page_count

    ,mb.buffer_mb

    ,CAST(100*buffer_mb/NULLIF(CAST(reserved_page_count * CAST(8 as float) / 1024 as decimal(12,2)),0) AS decimal(12,2)) as pct_in_buffer

    ,row_count

    ,ROW_NUMBER() 

        OVER (PARTITION BY i.object_id ORDER BY i.is_primary_key desc, ius.user_seeks + ius.user_scans + ius.user_lookups desc) as existing_ranking

    ,ius.user_seeks + ius.user_scans + ius.user_lookups as user_total

    ,COALESCE(CAST(100 * (ius.user_seeks + ius.user_scans + ius.user_lookups)

        /(NULLIF(SUM(ius.user_seeks + ius.user_scans + ius.user_lookups) 

        OVER(PARTITION BY i.object_id), 0) * 1.) as decimal(6,2)),0) as user_total_pct

    ,ius.user_seeks

    ,ius.user_scans

    ,ius.user_lookups

    ,ios.row_lock_count 

    ,ios.row_lock_wait_count 

    ,ios.row_lock_wait_in_ms 

    ,CAST(100.0 * ios.row_lock_wait_count/NULLIF(ios.row_lock_count, 0) AS decimal(12,2)) AS row_block_pct 

    ,CAST(1. * ios.row_lock_wait_in_ms /NULLIF(ios.row_lock_wait_count, 0) AS decimal(12,2)) AS avg_row_lock_waits_ms 

    ,STUFF((SELECT ', ' + QUOTENAME(c.name)

            FROM sys.index_columns ic

                INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id

            WHERE i.object_id = ic.object_id

            AND i.index_id = ic.index_id

            AND is_included_column = 0

            ORDER BY key_ordinal ASC

            FOR XML PATH('')), 1, 2, '') AS indexed_columns

    ,STUFF((SELECT ', ' + QUOTENAME(c.name)

            FROM sys.index_columns ic

                INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id

            WHERE i.object_id = ic.object_id

            AND i.index_id = ic.index_id

            AND is_included_column = 1

            ORDER BY key_ordinal ASC

            FOR XML PATH('')), 1, 2, '') AS included_columns

    ,(SELECT QUOTENAME(ic.column_id,'(')

            FROM sys.index_columns ic

            WHERE i.object_id = ic.object_id

            AND i.index_id = ic.index_id

            AND is_included_column = 0

            ORDER BY key_ordinal ASC

            FOR XML PATH('')) AS indexed_columns_compare

    ,COALESCE((SELECT QUOTENAME(ic.column_id, '(')

            FROM sys.index_columns ic

            WHERE i.object_id = ic.object_id

            AND i.index_id = ic.index_id

            AND is_included_column = 1

            ORDER BY key_ordinal ASC

            FOR XML PATH('')), SPACE(0)) AS included_columns_compare

FROM sys.tables t

    INNER JOIN sys.schemas s ON t.schema_id = s.schema_id

    INNER JOIN sys.indexes i ON t.object_id = i.object_id

    INNER JOIN sys.dm_db_partition_stats ps ON i.object_id = ps.object_id AND i.index_id = ps.index_id

    LEFT OUTER JOIN sys.dm_db_index_usage_stats ius ON i.object_id = ius.object_id AND i.index_id = ius.index_id AND ius.database_id = db_id()

    LEFT OUTER JOIN sys.dm_db_index_operational_stats(@DB_ID, NULL, NULL, NULL) ios ON ps.object_id = ios.object_id AND ps.index_id = ios.index_id AND ps.partition_number = ios.partition_number

    LEFT OUTER JOIN MemoryBuffer mb ON ps.object_id = mb.object_id AND ps.index_id = mb.index_id AND ps.partition_number = mb.partition_number

WHERE t.object_id = @ObjectID OR @ObjectID IS NULL

 

INSERT INTO #IndexBaseLine

    (schema_id, schema_name, object_id, table_name, index_name, type_desc, impact, existing_ranking, user_total, user_seeks, user_scans, user_lookups, indexed_columns, included_columns)

SELECT s.schema_id

    ,s.name AS schema_name

    ,t.object_id

    ,t.name AS table_name

    ,'--MISSING--' AS index_name

    ,'--NONCLUSTERED--' AS type_desc

    ,(migs.user_seeks + migs.user_scans) * migs.avg_user_impact as impact

    ,0 AS existing_ranking

    ,migs.user_seeks + migs.user_scans as user_total

    ,migs.user_seeks 

    ,migs.user_scans

    ,0 as user_lookups

    ,COALESCE(equality_columns + ', ', SPACE(0)) + COALESCE(inequality_columns, SPACE(0)) as indexed_columns

    ,included_columns

FROM sys.tables t

    INNER JOIN sys.schemas s ON t.schema_id = s.schema_id

    INNER JOIN sys.dm_db_missing_index_details mid ON t.object_id = mid.object_id

    INNER JOIN sys.dm_db_missing_index_groups mig ON mid.index_handle = mig.index_handle

    INNER JOIN sys.dm_db_missing_index_group_stats migs ON mig.index_group_handle = migs.group_handle

WHERE mid.database_id = db_id()

AND (mid.object_id = @ObjectID OR @ObjectID IS NULL)

 

INSERT INTO #ForeignKeys

    (foreign_key_name, object_id, fk_columns, fk_columns_compare)

SELECT fk.name + '|PARENT' AS foreign_key_name

    ,fkc.parent_object_id AS object_id

    ,STUFF((SELECT ', ' + QUOTENAME(c.name)

            FROM sys.foreign_key_columns ifkc

                INNER JOIN sys.columns c ON ifkc.parent_object_id = c.object_id AND ifkc.parent_column_id = c.column_id

            WHERE fk.object_id = ifkc.constraint_object_id

            ORDER BY ifkc.constraint_column_id

            FOR XML PATH('')), 1, 2, '') AS fk_columns

    ,(SELECT QUOTENAME(ifkc.parent_column_id,'(')

            FROM sys.foreign_key_columns ifkc

            WHERE fk.object_id = ifkc.constraint_object_id

            ORDER BY ifkc.constraint_column_id

            FOR XML PATH('')) AS fk_columns_compare

FROM sys.foreign_keys fk

    INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id

WHERE fkc.constraint_column_id = 1

AND (fkc.parent_object_id = @ObjectID OR @ObjectID IS NULL)

UNION ALL

SELECT fk.name + '|REFERENCED' as foreign_key_name

    ,fkc.referenced_object_id AS object_id

    ,STUFF((SELECT ', ' + QUOTENAME(c.name)

            FROM sys.foreign_key_columns ifkc

                INNER JOIN sys.columns c ON ifkc.referenced_object_id = c.object_id AND ifkc.referenced_column_id = c.column_id

            WHERE fk.object_id = ifkc.constraint_object_id

            ORDER BY ifkc.constraint_column_id

            FOR XML PATH('')), 1, 2, '') AS fk_columns

    ,(SELECT QUOTENAME(ifkc.referenced_column_id,'(')

            FROM sys.foreign_key_columns ifkc

            WHERE fk.object_id = ifkc.constraint_object_id

            ORDER BY ifkc.constraint_column_id

            FOR XML PATH('')) AS fk_columns_compare

FROM sys.foreign_keys fk

    INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id

WHERE fkc.constraint_column_id = 1

AND (fkc.referenced_object_id = @ObjectID OR @ObjectID IS NULL)

 

UPDATE ibl

SET duplicate_indexes = STUFF((SELECT ', ' + index_name AS [data()]

            FROM #IndexBaseLine iibl

            WHERE ibl.object_id = iibl.object_id

            AND ibl.index_id <> iibl.index_id

            AND ibl.indexed_columns_compare = iibl.indexed_columns_compare

            AND ibl.included_columns_compare = iibl.included_columns_compare

            FOR XML PATH('')), 1, 2, '')

    ,overlapping_indexes = STUFF((SELECT ', ' + index_name AS [data()]

            FROM #IndexBaseLine iibl

            WHERE ibl.object_id = iibl.object_id

            AND ibl.index_id <> iibl.index_id

            AND (ibl.indexed_columns_compare LIKE iibl.indexed_columns_compare + '%' 

                OR iibl.indexed_columns_compare LIKE ibl.indexed_columns_compare + '%')

            AND ibl.indexed_columns_compare <> iibl.indexed_columns_compare 

            FOR XML PATH('')), 1, 2, '')

    ,related_foreign_keys = STUFF((SELECT ', ' + foreign_key_name AS [data()]

            FROM #ForeignKeys ifk

            WHERE ifk.object_id = ibl.object_id

            AND ibl.indexed_columns_compare LIKE ifk.fk_columns_compare + '%'

            FOR XML PATH('')), 1, 2, '')

    ,related_foreign_keys_xml = CAST((SELECT foreign_key_name

            FROM #ForeignKeys ForeignKeys

            WHERE ForeignKeys.object_id = ibl.object_id

            AND ibl.indexed_columns_compare LIKE ForeignKeys.fk_columns_compare + '%'

            FOR XML AUTO) as xml) 

FROM #IndexBaseLine ibl

 

INSERT INTO #IndexBaseLine

    (schema_id, schema_name, object_id, table_name, index_name, type_desc, existing_ranking, indexed_columns)

SELECT s.schema_id

    ,s.name AS schema_name

    ,t.object_id

    ,t.name AS table_name

    ,fk.foreign_key_name AS index_name

    ,'--MISSING FOREIGN KEY--' as type_desc

    ,9999

    ,fk.fk_columns

FROM sys.tables t

    INNER JOIN sys.schemas s ON t.schema_id = s.schema_id

    INNER JOIN #ForeignKeys fk ON t.object_id = fk.object_id

    LEFT OUTER JOIN #IndexBaseLine ia ON fk.object_id = ia.object_id AND ia.indexed_columns_compare LIKE fk.fk_columns_compare + '%'

WHERE ia.index_name IS NULL

 

;WITH Aggregation

AS (

    SELECT row_id

        ,CAST(100. * (user_seeks + user_scans + user_lookups)

            /(NULLIF(SUM(user_seeks + user_scans + user_lookups) 

            OVER(PARTITION BY schema_name, table_name), 0) * 1.) as decimal(12,2)) AS estimated_user_total_pct

        ,SUM(buffer_mb) OVER(PARTITION BY schema_name, table_name) as table_buffer_mb

    FROM #IndexBaseLine 

)

UPDATE ibl

SET estimated_user_total_pct = COALESCE(a.estimated_user_total_pct, 0)

    ,table_buffer_mb = a.table_buffer_mb

FROM #IndexBaseLine ibl

    INNER JOIN Aggregation a ON ibl.row_id = a.row_id

 

;WITH IndexAction

AS (

    SELECT row_id

        ,CASE WHEN user_lookups > user_seeks AND type_desc IN ('CLUSTERED', 'HEAP', 'UNIQUE CLUSTERED') THEN 'REALIGN'

            WHEN type_desc = '--MISSING FOREIGN KEY--' THEN 'CREATE'

            WHEN type_desc = 'XML' THEN '---'

            WHEN is_unique = 1 THEN '---'

            WHEN type_desc = '--NONCLUSTERED--' AND ROW_NUMBER() OVER (PARTITION BY table_name ORDER BY user_total desc) <= 10 AND estimated_user_total_pct > 1 THEN 'CREATE'

            WHEN type_desc = '--NONCLUSTERED--' THEN 'BLEND'

            WHEN ROW_NUMBER() OVER (PARTITION BY table_name ORDER BY user_total desc, existing_ranking) > 10 THEN 'DROP' 

            WHEN user_total = 0 THEN 'DROP' 

            ELSE '---' END AS index_action

    FROM #IndexBaseLine

)

UPDATE ibl

SET index_action = ia.index_action

FROM #IndexBaseLine ibl INNER JOIN IndexAction ia

ON ibl.row_id = ia.row_id

 

UPDATE ibl

SET has_unique = 1

FROM #IndexBaseLine ibl

    INNER JOIN (SELECT DISTINCT object_id FROM sys.indexes i WHERE i.is_unique = 1) x ON ibl.object_id = x.object_id

 

SELECT 

    index_action

    ,schema_name + '.' + table_name as object_name

    ,index_name

    ,is_unique

    ,has_unique

    ,type_desc

    ,size_in_mb

    ,buffer_mb

    ,pct_in_buffer

    ,table_buffer_mb

    ,row_count

    ,user_total

    ,user_total_pct

    ,estimated_user_total_pct

    ,user_seeks

    ,user_scans

    ,user_lookups

    ,row_lock_count

    ,row_lock_wait_count

    ,row_lock_wait_in_ms

    ,row_block_pct

    ,avg_row_lock_waits_ms

    ,indexed_columns

    ,included_columns

    ,duplicate_indexes

    ,overlapping_indexes

    ,related_foreign_keys

    ,related_foreign_keys_xml 

FROM #IndexBaseLine

ORDER BY table_buffer_mb DESC, object_id, user_total DESC

NOTE: Populating the table name in the function for the @ObjectID variable is optional.  If the value of @ObjectID is NULL then information on all tables is returned.

This version of the query populates the following columns:

  • row_id: Row identifier used for populating the table
  • index_action: Analysis recommendation on action to take on the index
    • CREATE: Recommend adding the index to the table.
    • DROP: Recommend dropping the index from the table
    • BLEND: Review the missing index details to see if the missing index details can be added to an existing index.
    • REALIGN: Bookmark lookups on the index exceed the number of seeks on the table.  Recommend investigating whether to move the clustered index to another index or add included columns to the indexes that are part of the bookmark lookups.
  • schema_id: Schema ID
  • schema_name: Name of the schema.
  • object_id: Object ID
  • table_name: Name of the table name
  • index_id: Index ID
  • index_name: Name of the index.
  • is_unique: Flag indicating whether an index has a unique index.
  • has_unique: Flag indicating whether the table has a unique index.
  • type_desc: Type of index; either clustered or non-clustered.
  • partition_number: Partition number.
  • reserved_page_count: Total number of pages reserved for the index.
  • size_in_mb: The amount of space in MB the index utilizes on disk.
  • buffered_page_count: Total number of pages in the buffer for the index.
  • buffer_mb: The amount of space in MB in the buffer for the index.
  • pct_in_buffer: The percentage of an index that is current in the SQL Server buffer.
  • table_buffer_mb: The amount of space in MB in the SQL Server buffer that is being utilized by the table.
  • row_count: Number of rows in the index.
  • impact: Calculation of impact of a potential index.  This is based on the seeks and scans that the index could have utilized multiplied by average improvement the index would have provided.  This is included only for missing indexes.
  • existing_ranking: Ranking of the existing indexes ordered by user_total descending across the indexes for the table.
  • user_total: Total number of seek, scan, and lookup operations for the index.
  • user_total_pct: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing indexes for the table.
  • estimated_user_total_pct: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing and potential indexes for the table.  This number is naturally skewed because a seek for potential Index A resulted in another operation on an existing index and both of these operations would be counted.
  • user_seeks: Number of seek operations on the index.
  • user_scans: Number of scan operations on the index.
  • user_lookups: Number of lookup operations on the index.
  • row_lock_count: Cumulative number of row locks requested.
  • row_lock_wait_count: Cumulative number of times the Database Engine waited on a row lock.
  • row_lock_wait_in_ms: Total number of milliseconds the Database Engine waited on a row lock.
  • row_block_pct: Percentage of row locks that encounter waits on a row lock.
  • avg_row_lock_waits_ms: Average number of milliseconds the Database Engine waited on a row lock.
  • indexed_columns: Columns that are part of the index, missing index or foreign key.
  • included_columns: Columns that are included in the index or missing index.
  • indexed_columns_compare: Column IDs that are part of the index, missing index or foreign key
  • included_columns_compare: Column IDs that are included in the index or missing index.
  • duplicate_indexes: List of Indexes that exist on the table that are identical to the index on this row.
  • overlapping_indexes: List of Indexes that exist on the table that overlap the index on this row.
  • related_foreign_keys: List of foreign keys that are related to the index either as an exact match or covering index.
  • related_foreign_keys_xml: XML document listing foreign keys that are related to the index either as an exact match or covering index.

As I mentioned in a previous post, when looking at the index_action column don’t take a literal translation of the recommendation.  These are put together from my personal experience and used as a pointer towards a direction versus a specific directive.

Hopefully breaking apart the index analysis query into multiple posts has helped people grasp why each piece was added and learn a little bit more about the underlying DMVs and catalog views used to put this together.

In future posts some examples will be put together to look at how to take snapshots of indexes with this query.  And also to take the recommendations and implement them to improve the indexing on databases.

EDIT May 18, 2009: Fixed error in aggregating type 2 sys.partition data.

Analyze This – Analyze Your Indexes – Part 5

This is the fifth post in which I am walking through a query that I use to analyze indexes.  Thus far I have added to the following to the query that I use:

  1. Combine existing index statistics and missing indexes into a single output (blog)
  2. Find existing duplicate and overlapping indexes (blog)
  3. Find the relationship between indexes and foreign keys (blog)
  4. Added size and current memory utilization (blog)

While testing the current version of the index analysis query I found an issue with the foreign key portion of the filtering that has been fixed.  Where the code was:

WHERE fkc.constraint_column_id = 1

AND (fkc.parent_object_id = @ObjectID OR @ObjectID IS NULL)

It should have been:

WHERE fkc.constraint_column_id = 1

AND (fkc.referenced_object_id = @ObjectID OR @ObjectID IS NULL)

For the next version of the index analysis query I wanted to build in some guidance on how to handle indexes on the table.  There is a couple things I wanted to be able to tell at a glance without much thought.

  1. When could clustered indexes be realigned on a table to improve performance on a table.  This idea came about after reading this article on MSSQLTips.  The logic used evaluates user_lookups against user_seeks.
  2. When the index count exceed 10 indexes on the table flag the indexes to be dropped.  At some point, I read an index best practices the provided guidance around 5-10 indexes per table in OLTP environments.  If anyone has a link to an article on this please let me know, the one I read I lost track of.
  3. Any indexes that could be built for foreign keys should be flagged to be created.

This information is found in a new columns called index_action.  The new version of the query is as follows:

DECLARE @ObjectID int

SELECT @ObjectID = OBJECT_ID('')

 

;WITH IndexSize

AS(

    SELECT object_id

        ,index_id

        ,CAST((SUM(reserved_page_count) * CAST(8 as float))/1024 as decimal(12,2)) as size_in_mb

    FROM sys.dm_db_partition_stats

    GROUP BY object_id, index_id

)

,preIndexAnalysis

AS (

    SELECT 

        OBJECT_SCHEMA_NAME(t.object_id) as schema_name

        ,t.name as table_name

        ,COALESCE(i.name, 'N/A') as index_name

        ,i.is_unique

        ,CASE WHEN i.is_unique = 1 THEN 'UNIQUE ' ELSE '' END + i.type_desc as type_desc

        ,iz.size_in_mb

        ,NULL as impact

        ,ROW_NUMBER() 

            OVER (PARTITION BY i.object_id ORDER BY i.is_primary_key desc, ius.user_seeks + ius.user_scans + ius.user_lookups desc) as ranking

        ,ius.user_seeks + ius.user_scans + ius.user_lookups as user_total

        ,COALESCE(CAST(100 * (ius.user_seeks + ius.user_scans + ius.user_lookups)

            /(NULLIF(SUM(ius.user_seeks + ius.user_scans + ius.user_lookups) 

            OVER(PARTITION BY i.object_id), 0) * 1.) as decimal(6,2)),0) as user_total_pct

        ,ius.user_seeks

        ,ius.user_scans

        ,ius.user_lookups

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.index_columns ic

                    INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 0

                ORDER BY key_ordinal ASC

                FOR XML PATH('')), 1, 2, '') AS indexed_columns

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.index_columns ic

                    INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 1

                ORDER BY key_ordinal ASC

                FOR XML PATH('')), 1, 2, '') AS included_columns

        ,i.object_id

        ,i.index_id

        ,(SELECT QUOTENAME(ic.column_id,'(')

                FROM sys.index_columns ic

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 0

                ORDER BY key_ordinal ASC

                FOR XML PATH('')) AS indexed_columns_compare

        ,COALESCE((SELECT QUOTENAME(ic.column_id, '(')

                FROM sys.index_columns ic

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 1

                ORDER BY key_ordinal ASC

                FOR XML PATH('')), SPACE(0)) AS included_columns_compare

    FROM sys.tables t

        INNER JOIN sys.indexes i ON t.object_id = i.object_id

        LEFT OUTER JOIN sys.dm_db_index_usage_stats ius ON i.object_id = ius.object_id AND i.index_id = ius.index_id AND ius.database_id = db_id()

        INNER JOIN IndexSize iz ON i.object_id = iz.object_id AND i.index_id = iz.index_id

    WHERE t.object_id = @ObjectID OR @ObjectID IS NULL

    UNION ALL

    SELECT 

        OBJECT_SCHEMA_NAME(mid.object_id) as schema_name

        ,OBJECT_NAME(mid.object_id) as table_name

        ,'--MISSING--'

        ,NULL

        ,'--NONCLUSTERED--'

        ,NULL

        ,(migs.user_seeks + migs.user_scans) * migs.avg_user_impact as impact

        ,0 as ranking

        ,migs.user_seeks + migs.user_scans as user_total

        ,NULL as user_total_pct

        ,migs.user_seeks 

        ,migs.user_scans

        ,0 as user_lookups

        ,COALESCE(equality_columns + ', ', SPACE(0)) + COALESCE(inequality_columns, SPACE(0)) as indexed_columns

        ,included_columns

        ,mid.object_id

        ,NULL

        ,NULL

        ,NULL

    FROM sys.dm_db_missing_index_details mid

        INNER JOIN sys.dm_db_missing_index_groups mig ON mid.index_handle = mig.index_handle

        INNER JOIN sys.dm_db_missing_index_group_stats migs ON mig.index_group_handle = migs.group_handle

    WHERE database_id = db_id()

    AND (mid.object_id = @ObjectID OR @ObjectID IS NULL)

)

,ForeignKeys

AS (

    SELECT fk.name + '|PARENT' AS name

        ,fkc.parent_object_id AS object_id

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.foreign_key_columns ifkc

                    INNER JOIN sys.columns c ON ifkc.parent_object_id = c.object_id AND ifkc.parent_column_id = c.column_id

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')), 1, 2, '') AS fk_columns

        ,(SELECT QUOTENAME(ifkc.parent_column_id,'(')

                FROM sys.foreign_key_columns ifkc

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')) AS fk_columns_compare

    FROM sys.foreign_keys fk

        INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id

    WHERE fkc.constraint_column_id = 1

    AND (fkc.parent_object_id = @ObjectID OR @ObjectID IS NULL)

    UNION ALL

    SELECT fk.name + '|REFERENCED' as name

        ,fkc.referenced_object_id AS object_id

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.foreign_key_columns ifkc

                    INNER JOIN sys.columns c ON ifkc.referenced_object_id = c.object_id AND ifkc.referenced_column_id = c.column_id

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')), 1, 2, '') AS fk_columns

        ,(SELECT QUOTENAME(ifkc.referenced_column_id,'(')

                FROM sys.foreign_key_columns ifkc

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')) AS fk_columns_compare

    FROM sys.foreign_keys fk

        INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id

    WHERE fkc.constraint_column_id = 1

    AND (fkc.referenced_object_id = @ObjectID OR @ObjectID IS NULL)

)

,MemoryBuffer

AS (

    SELECT 

        obj.object_id

        ,obj.index_id

        ,COUNT(*)AS Buffered_Page_Count

        ,CONVERT(decimal(12,2), CAST(COUNT(*) as bigint)*CAST(8 as float)/1024) as buffer_mb

    FROM sys.dm_os_buffer_descriptors AS bd 

        INNER JOIN 

        (

            SELECT object_name(object_id) AS name 

                ,index_id ,allocation_unit_id, object_id

            FROM sys.allocation_units AS au

                INNER JOIN sys.partitions AS p 

                    ON au.container_id = p.hobt_id 

                        AND (au.type = 1 OR au.type = 3)

            UNION ALL

            SELECT object_name(object_id) AS name   

                ,index_id, allocation_unit_id, object_id

            FROM sys.allocation_units AS au

                INNER JOIN sys.partitions AS p 

                    ON au.container_id = p.hobt_id 

                        AND au.type = 2

        ) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id

    WHERE database_id = db_id()

    GROUP BY obj.object_id ,obj.index_id

)

, IndexAnalysis

AS (

    SELECT ia.object_id

        ,ia.index_id

        ,schema_name

        ,table_name

        ,index_name

        ,is_unique

        ,type_desc

        ,size_in_mb

        ,impact

        ,ranking

        ,user_total

        ,user_total_pct

        ,CAST(100 * (user_seeks + user_scans + user_lookups)

            /(NULLIF(SUM(user_seeks + user_scans + user_lookups) 

            OVER(PARTITION BY schema_name, table_name), 0) * 1.) as decimal(6,2)) as estimated_percent

        ,user_seeks

        ,user_scans

        ,user_lookups

        ,indexed_columns

        ,included_columns

        ,STUFF((SELECT ', ' + index_name AS [data()]

            FROM preIndexAnalysis iia

            WHERE ia.object_id = iia.object_id

            AND ia.index_id <> iia.index_id

            AND ia.indexed_columns_compare = iia.indexed_columns_compare

            AND ia.included_columns_compare = iia.included_columns_compare

            FOR XML PATH('')), 1, 2, '') AS duplicate_indexes

        ,STUFF((SELECT ', ' + index_name AS [data()]

            FROM preIndexAnalysis iia

            WHERE ia.object_id = iia.object_id

            AND ia.index_id <> iia.index_id

            AND (ia.indexed_columns_compare LIKE iia.indexed_columns_compare + '%' 

                OR iia.indexed_columns_compare LIKE ia.indexed_columns_compare + '%')

            AND ia.indexed_columns_compare <> iia.indexed_columns_compare 

            FOR XML PATH('')), 1, 2, '') AS overlapping_indexes

        ,STUFF((SELECT ', ' + name AS [data()]

            FROM ForeignKeys ifk

            WHERE ifk.object_id = ia.object_id

            AND ia.indexed_columns_compare LIKE ifk.fk_columns_compare + '%'

            FOR XML PATH('')), 1, 2, '') AS related_foreign_keys

        ,CAST((SELECT name

            FROM ForeignKeys

            WHERE ForeignKeys.object_id = ia.object_id

            AND ia.indexed_columns_compare LIKE ForeignKeys.fk_columns_compare + '%'

            FOR XML AUTO) as xml) AS related_foreign_keys_xml

    FROM preIndexAnalysis ia

    UNION ALL

    SELECT fk.object_id

        ,NULL

        ,OBJECT_SCHEMA_NAME(fk.object_id) AS schema_name

        ,OBJECT_NAME(fk.object_id) AS table_name

        ,fk.name AS index_name

        ,NULL

        ,'--MISSING FOREIGN KEY--'

        ,NULL

        ,NULL 

        ,9999

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL

        ,fk.fk_columns

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL

    FROM ForeignKeys fk

        LEFT OUTER JOIN preIndexAnalysis ia

    ON fk.object_id = ia.object_id

    AND ia.indexed_columns_compare LIKE fk.fk_columns_compare + '%'

    WHERE ia.index_name IS NULL

)

SELECT schema_name

    ,table_name

    ,index_name

    ,type_desc

    ,CASE WHEN user_lookups > user_seeks AND type_desc IN ('CLUSTERED', 'HEAP', 'UNIQUE CLUSTERED') THEN 'REALIGN'

        WHEN type_desc = '--MISSING FOREIGN KEY--' THEN 'CREATE'

        WHEN type_desc = 'XML' THEN '---'

        WHEN is_unique = 1 THEN '---'

        WHEN type_desc = '--NONCLUSTERED--' AND ROW_NUMBER() OVER (PARTITION BY table_name ORDER BY user_total desc, ranking) > 10 THEN 'N/A'

        WHEN ROW_NUMBER() OVER (PARTITION BY table_name ORDER BY user_total desc, ranking) > 10 THEN 'DROP' 

        WHEN user_total = 0 THEN 'DROP' 

        ELSE '---' END as index_action

    ,size_in_mb

    ,mb.buffer_mb

    ,CAST(100*mb.Buffer_MB/NULLIF(size_in_mb,0) AS decimal(6,2)) AS buffer_pct

    ,impact

    ,user_total

    ,user_total_pct

    ,estimated_percent

    ,user_seeks

    ,user_scans

    ,user_lookups

    ,indexed_columns

    ,included_columns

    ,duplicate_indexes

    ,overlapping_indexes

    ,related_foreign_keys

    ,related_foreign_keys_xml

FROM IndexAnalysis ia

    LEFT OUTER JOIN MemoryBuffer mb ON ia.object_id = mb.object_id AND ia.index_id = mb.index_id

ORDER BY SUM(Buffer_MB) OVER (PARTITION BY ia.object_id) desc, table_name, user_total desc, buffer_mb

NOTE: Populating the table name in the function for the @ObjectID variable is optional.  If the value of @ObjectID is NULL then information on all tables is returned.

This version of the DMV query returns the following columns:

  • schema_name: Schema name
  • table_name: Table name
  • index_name: Index Name
  • type_desc: Type of index; either clustered or non-clustered.
  • index_action:  Guidance on how to manage index.  Includes recommendation on whether to CREATE, DROP, or REALIGN the index. 
  • size_in_mb: Then number of MB the index utilizes on disk.
  • buffer_mb: The number of MB in the SQL Server buffer being utilized by the index.
  • buffer_pct: The percentage of an index that is current in the SQL Server buffer.
  • impact: Calculation of impact of a potential index.  This is based on the seeks and scans that the index could have utilized multiplied by average improvement the index would have provided.  This is included only for missing indexes.
  • user_total: Total number of seek, scan, and lookup operations for the index
  • user_total_pct: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing indexes for the table.
  • estimated_percent: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing and potential indexes for the table.  This number is naturally skewed because a seek for potential Index A resulted in another operation on an existing index and both of these operations would be counted.
  • user_seeks: Number of seek operations on the index.
  • user_scans: Number of scan operations on the index.
  • user_lookups: Number of lookup operations on the index.
  • indexed_columns: Columns that are part of the index, missing index or foreign key.
  • included_columns: Columns that are included in the index or missing index.
  • duplicate_indexes: List of Indexes that exist on the table that are identical to the index on this row.
  • overlapping_indexes: List of Indexes that exist on the table that overlap the index on this row.
  • related_foreign_keys: List of foreign keys that are related to the index either as an exact match or covering index.
  • related_foreign_keys_xml: XML document listing foreign keys that are related to the index either as an exact match or covering index.

Keep in mind that the index_action column is only guidance.  It isn’t a 100% or 80% rule.  It is there to provide visual guidance on the interaction of the indexes on the table.

In some cases, I’ve used the REALIGN recommendation to change the clustered index on the table.  But doing this changes the order the table is sorted in and in a lot of cases this can lead to serious fragmentation of the table if the column that appears to be a better clustered index candidate is not ordered similarly to the current clustered index.  In other cases, the REALIGN recommendation has helped to indicate tables where indexes benefitted from adding INCLUDED columns to some of the non-clustered indexes on the table.

I did want to include a comment that I caught on twitter the other day, I’m glad other people are finding this index analysis query that I came up with on the flight to last year’s PASS Summit.

CropperCapture[16]

Speaking of the PASS Summit… if you haven’t started planning for it you definitely should be giving it a good consideration.  Especially when you consider that you can find DBAs and Developers sharing lots of jager there.

Find Query Plans That May Utilize Parallelism


Burnt Server

When ever I go to a new client to assist with performance issues, I inevitably download the Troubleshooting Performance Problems in SLQ Server 2005 white paper.  In my opinion and many others, it is one of those documents that should be at the click of a button to open and I keep it handy on my desktop.

One of the most common wait types that I observe at clients is CXPACKET, known on the street as parallelism.  And since most of these are OLTP database systems, there is a good need for looking into the causes of said wait type.

When I’ve identified that parallelism is the reason for the server running hot and hopefully not catching on fire I start to look at what is running or has run on the server that is utilizing parallelism.  When I first starting to look for execution plans that might use parallelism, I used the query in the white paper (mentioned above) that would return back a list of all of the plans that are currently in the procedure cache that would utilize parallelism if they were selected.

That query is:

SELECT 

    p.*, 

    q.*,

    cp.plan_handle

FROM sys.dm_exec_cached_plans cp

    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) p

    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) q

WHERE cp.cacheobjtype = 'Compiled Plan' 

AND p.query_plan.value('declare namespace p="http://schemas.microsoft.com/sqlserver/2004/07/showplan";max(//p:RelOp/@Parallel)', 'float') > 0

But recently I got to thinking; and this can be a dangerous game. Do I care about a plan that is only used once? Also, is it better to first look at plans that utilize more CPU than other plans? How do I rank one plan against another plan?

To accomplish this, I turned to the DMV sys.dm_exec_query_stats. This DMV aggregates performance statistics for cached query plans. The trouble with this DMV is that it is the query level and not the procedure level so conditional statements and looping will affect how often some SQL statements are within a procedure. I’ve put together a couple queries that combine the query above with sys.dm_exec_query_stats.

First, I put together the top 50 statements by average CPU utilization that is part of an execution plan that utilizes parallelism. What I like about this query is that I get an exact SQL statement to look at. Unfortunately, this isn’t an exact science because the statement with the high average CPU may not be the statement that is utilizing parallelism. On the other hand, who is going to complain if a high CPU statement without parallelism is tuned?

SELECT TOP 50

    OBJECT_NAME(p.objectid, p.dbid) as [object_name]

    ,qs.execution_count

    ,qs.total_worker_time

    ,qs.total_logical_reads

    ,qs.total_elapsed_time

    ,CASE statement_end_offset WHEN -1 THEN q.text

        ELSE SUBSTRING(q.text, statement_start_offset/2, (statement_end_offset-statement_start_offset)/2) END as sql_statement

    ,p.query_plan

    ,q.text

    ,cp.plan_handle

FROM sys.dm_exec_query_stats qs

    INNER JOIN sys.dm_exec_cached_plans cp ON qs.plan_handle = cp.plan_handle

    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) p

    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) as q

WHERE cp.cacheobjtype = 'Compiled Plan' 

AND p.query_plan.value('declare namespace p="http://schemas.microsoft.com/sqlserver/2004/07/showplan";max(//p:RelOp/@Parallel)', 'float') > 0

ORDER BY qs.total_worker_time/qs.execution_count DESC

The second query I have here ignores the statement level detail of sys.dm_exec_query_stats and summarizes the views at the plan_handle level. This presents some problems though since execution counts amongst all query statements isn’t always the same within a plan and which number would be best to use? Minimum executions? Maximum executions? Average executions? It all depends which side you want to error on.  For the script, though, I went with maximum executions.

WITH cQueryStats

AS (

    SELECT qs.plan_handle

        ,MAX(qs.execution_count) as execution_count

        ,SUM(qs.total_worker_time) as total_worker_time

        ,SUM(qs.total_logical_reads) as total_logical_reads

        ,SUM(qs.total_elapsed_time) as total_elapsed_time

    FROM sys.dm_exec_query_stats qs

    GROUP BY qs.plan_handle

)

SELECT TOP 50

    OBJECT_NAME(p.objectid, p.dbid) as [object_name] ,qs.execution_count

    ,qs.total_worker_time

    ,qs.total_logical_reads

    ,qs.total_elapsed_time

    ,p.query_plan

    ,q.text

    ,cp.plan_handle

FROM cQueryStats qs

    INNER JOIN sys.dm_exec_cached_plans cp ON qs.plan_handle = cp.plan_handle

    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) p

    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) as q

WHERE cp.cacheobjtype = 'Compiled Plan' 

AND p.query_plan.value('declare namespace p="http://schemas.microsoft.com/sqlserver/2004/07/showplan";max(//p:RelOp/@Parallel)', 'float') > 0

ORDER BY qs.total_worker_time/qs.execution_count DESC

If you are already on SQL Server 2008, you do have the benefit of having the DMV sys.dm_exec_procedure_stats at your disposal.  Using this DMV will solve the problem of not having accurate execution information at the procedure level.  For those blesses with SQL Server 2008, you can use this query to identify execution statistics for stored plans that utilize parallelism.

SELECT TOP 50

    OBJECT_NAME(p.objectid, p.dbid) as [object_name] 

    ,ps.total_worker_time/ps.execution_count as avg_worker_time

    ,ps.execution_count

    ,ps.total_worker_time

    ,ps.total_logical_reads

    ,ps.total_elapsed_time

    ,p.query_plan

    ,q.text

    ,cp.plan_handle

FROM sys.dm_exec_procedure_stats ps

    INNER JOIN sys.dm_exec_cached_plans cp ON ps.plan_handle = cp.plan_handle

    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) p

    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) as q

WHERE cp.cacheobjtype = 'Compiled Plan' 

AND p.query_plan.value('declare namespace p="http://schemas.microsoft.com/sqlserver/2004/07/showplan";max(//p:RelOp/@Parallel)', 'float') > 0

ORDER BY ps.total_worker_time/ps.execution_count DESC

If anyone has any queries that they use to address and research parallelism in their environments, I would be more than happy to see them.

Analyze This – Analyze Your Indexes – Part 4

This is the fourth post in which I am walking through a query that I use to analyze indexes.  Thus far I have added to the following to the query that I use:

  1. Combine existing index statistics and missing indexes into a single output (blog)
  2. Find existing duplicate and overlapping indexes (blog)
  3. Find the relationship between indexes and foreign keys (blog)man with blinders

A couple issues that were previously brought up have been updated:

  • Foreign keys with multiple columns were causing some duplication.  Yeah, I knew about that but my horse blinders allowed me to forget about it when I shouldn’t have.
  • For the duplicate and overlapping index logic, index_column_id was used to order the columns and it should have been the key_ordinal column.

In this next post I want to add more information about how the indexes are currently being used.  Specifically, what is the approximate size of the index and how much of the index is currently in memory.

Ce3k-mashedThis means something…

Why would I be interested in the size of my indexes and how much is in memory?  There are a few reasons for doing this

To start with, indexes take up space.  That should be a no-brainer.  But knowing how much space an index is utilizing can provide enlightenment into how the space is distributed across all of your indexes.  Space is relatively cheap these days but throwing a few gigs of disk at a logging table might not be a good use of disk.  Especially if it is over indexed or has some poor data type choices.

Now just knowing how large and index is doesn’t always provide the full picture.  Because data that is at rest only consume disk resources.  Determining the amount of the table that is the buffer provides insight into how important that data in the table/index is to the rest of the database.

I get into conversations with client’s often about what data is most important in their database systems and depending on who I talk to this view can be quite different.  In my mind though, the tables/indexes that are in memory are the most important.  They are the most important because the applications and services using the database felt they were important (because of queries and donuts) to load them from disk to memory.

But what about when these aren’t the important tables/indexes in the database?  That means there is a lot of work to do to add appropriate indexes, tune up some of the queries, or archive off some inappropriate data.

I mention the last one because of a conversation I ran into a while back.  A friend had shared with me a situation where the largest tables in memory across one of his server’s were the msdb backup tables.  A few years of backup entries for every 15 minutes across a number of databases wasn’t really needed.  But the backup jobs forced them up there every 15 when the backups occurred.  Loading all of that data into memory caused other data to be pushed out and caused some of the memory pressure that had been occurring.

Onto the next version…

Today’s version of the will include both the current size of the index and the portion of the index that is currently in memory.  To determine the size of the index, I used the following post.  But for the amount of the index I had to look around a bit and found an post on SQLTeam that aggregated the information that I was looking for.

The version below combines the new components and the fixes from above:

DECLARE @ObjectID int

SELECT @ObjectID = OBJECT_ID('Person.Contact')

 

;WITH IndexSize

AS(

    SELECT object_id

        ,index_id

        ,CAST((SUM(reserved_page_count) * CAST(8 as float))/1024 as decimal(12,2)) as size_in_mb

    FROM sys.dm_db_partition_stats

    GROUP BY object_id, index_id

)

,preIndexAnalysis

AS (

    SELECT 

        OBJECT_SCHEMA_NAME(t.object_id) as schema_name

        ,t.name as table_name

        ,COALESCE(i.name, 'N/A') as index_name

        ,CASE WHEN i.is_unique = 1 THEN 'UNIQUE ' ELSE '' END + i.type_desc as type_desc

        ,iz.size_in_mb

        ,NULL as impact

        ,ROW_NUMBER() 

            OVER (PARTITION BY i.object_id ORDER BY i.is_primary_key desc, ius.user_seeks + ius.user_scans + ius.user_lookups desc) as ranking

        ,ius.user_seeks + ius.user_scans + ius.user_lookups as user_total

        ,COALESCE(CAST(100 * (ius.user_seeks + ius.user_scans + ius.user_lookups)

            /(NULLIF(SUM(ius.user_seeks + ius.user_scans + ius.user_lookups) 

            OVER(PARTITION BY i.object_id), 0) * 1.) as decimal(6,2)),0) as user_total_pct

        ,ius.user_seeks

        ,ius.user_scans

        ,ius.user_lookups

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.index_columns ic

                    INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 0

                ORDER BY key_ordinal ASC

                FOR XML PATH('')), 1, 2, '') AS indexed_columns

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.index_columns ic

                    INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 1

                ORDER BY key_ordinal ASC

                FOR XML PATH('')), 1, 2, '') AS included_columns

        ,i.object_id

        ,i.index_id

        ,(SELECT QUOTENAME(ic.column_id,'(')

                FROM sys.index_columns ic

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 0

                ORDER BY key_ordinal ASC

                FOR XML PATH('')) AS indexed_columns_compare

        ,COALESCE((SELECT QUOTENAME(ic.column_id, '(')

                FROM sys.index_columns ic

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 1

                ORDER BY key_ordinal ASC

                FOR XML PATH('')), SPACE(0)) AS included_columns_compare

    FROM sys.tables t

        INNER JOIN sys.indexes i ON t.object_id = i.object_id

        LEFT OUTER JOIN sys.dm_db_index_usage_stats ius ON i.object_id = ius.object_id AND i.index_id = ius.index_id AND ius.database_id = db_id()

        INNER JOIN IndexSize iz ON i.object_id = iz.object_id AND i.index_id = iz.index_id

    WHERE t.object_id = @ObjectID OR @ObjectID IS NULL

    UNION ALL

    SELECT 

        OBJECT_SCHEMA_NAME(mid.object_id) as schema_name

        ,OBJECT_NAME(mid.object_id) as table_name

        ,'--MISSING--'

        ,'--NONCLUSTERED--'

        ,NULL

        ,(migs.user_seeks + migs.user_scans) * migs.avg_user_impact as impact

        ,0 as ranking

        ,migs.user_seeks + migs.user_scans as user_total

        ,NULL as user_total_pct

        ,migs.user_seeks 

        ,migs.user_scans

        ,0 as user_lookups

        ,COALESCE(equality_columns + ', ', SPACE(0)) + COALESCE(inequality_columns, SPACE(0)) as indexed_columns

        ,included_columns

        ,mid.object_id

        ,NULL

        ,NULL

        ,NULL

    FROM sys.dm_db_missing_index_details mid

        INNER JOIN sys.dm_db_missing_index_groups mig ON mid.index_handle = mig.index_handle

        INNER JOIN sys.dm_db_missing_index_group_stats migs ON mig.index_group_handle = migs.group_handle

    WHERE database_id = db_id()

    AND (mid.object_id = @ObjectID OR @ObjectID IS NULL)

)

,ForeignKeys

AS (

    SELECT fk.name + '|PARENT' AS name

        ,fkc.parent_object_id AS object_id

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)