Analyze This – Analyze Your Indexes – Part 5

Posted by & filed under , , , , , .

This is the fifth post in which I am walking through a query that I use to analyze indexes.  Thus far I have added to the following to the query that I use:

  1. Combine existing index statistics and missing indexes into a single output
  2. Find existing duplicate and overlapping indexes
  3. Find the relationship between indexes and foreign keys
  4. Added size and current me

While testing the current version of the index analysis query I found an issue with the foreign key portion of the filtering that has been fixed.  Where the code was:


WHERE fkc.constraint_column_id = 1
AND (fkc.parent_object_id = @ObjectID OR @ObjectID IS NULL)

It should have been:


WHERE fkc.constraint_column_id = 1
AND (fkc.referenced_object_id = @ObjectID OR @ObjectID IS NULL)

For the next version of the index analysis query I wanted to build in some guidance on how to handle indexes on the table.  There is a couple things I wanted to be able to tell at a glance without much thought.

  1. When could clustered indexes be realigned on a table to improve performance on a table.  This idea came about after reading this article on MSSQLTips.  The logic used evaluates user_lookups against user_seeks.
  2. When the index count exceed 10 indexes on the table flag the indexes to be dropped.  At some point, I read an index best practices the provided guidance around 5-10 indexes per table in OLTP environments.  If anyone has a link to an article on this please let me know, the one I read I lost track of.
  3. Any indexes that could be built for foreign keys should be flagged to be created.

This information is found in a new columns called index_action.  The new version of the query is as follows:


DECLARE @ObjectID int
SELECT @ObjectID = OBJECT_ID('')

;WITH IndexSize
AS(
SELECT object_id
,index_id
,CAST((SUM(reserved_page_count) * CAST(8 as float))/1024 as decimal(12,2)) as size_in_mb
FROM sys.dm_db_partition_stats
GROUP BY object_id, index_id
)
,preIndexAnalysis
AS (
SELECT
OBJECT_SCHEMA_NAME(t.object_id) as schema_name
,t.name as table_name
,COALESCE(i.name, 'N/A') as index_name
,i.is_unique
,CASE WHEN i.is_unique = 1 THEN 'UNIQUE ' ELSE '' END + i.type_desc as type_desc
,iz.size_in_mb
,NULL as impact
,ROW_NUMBER()
OVER (PARTITION BY i.object_id ORDER BY i.is_primary_key desc, ius.user_seeks + ius.user_scans + ius.user_lookups desc) as ranking
,ius.user_seeks + ius.user_scans + ius.user_lookups as user_total
,COALESCE(CAST(100 * (ius.user_seeks + ius.user_scans + ius.user_lookups)
/(NULLIF(SUM(ius.user_seeks + ius.user_scans + ius.user_lookups)
OVER(PARTITION BY i.object_id), 0) * 1.) as decimal(6,2)),0) as user_total_pct
,ius.user_seeks
,ius.user_scans
,ius.user_lookups
,STUFF((SELECT ', ' + QUOTENAME(c.name)
FROM sys.index_columns ic
INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
WHERE i.object_id = ic.object_id
AND i.index_id = ic.index_id
AND is_included_column = 0
ORDER BY key_ordinal ASC
FOR XML PATH('')), 1, 2, '') AS indexed_columns
,STUFF((SELECT ', ' + QUOTENAME(c.name)
FROM sys.index_columns ic
INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
WHERE i.object_id = ic.object_id
AND i.index_id = ic.index_id
AND is_included_column = 1
ORDER BY key_ordinal ASC
FOR XML PATH('')), 1, 2, '') AS included_columns
,i.object_id
,i.index_id
,(SELECT QUOTENAME(ic.column_id,'(')
FROM sys.index_columns ic
WHERE i.object_id = ic.object_id
AND i.index_id = ic.index_id
AND is_included_column = 0
ORDER BY key_ordinal ASC
FOR XML PATH('')) AS indexed_columns_compare
,COALESCE((SELECT QUOTENAME(ic.column_id, '(')
FROM sys.index_columns ic
WHERE i.object_id = ic.object_id
AND i.index_id = ic.index_id
AND is_included_column = 1
ORDER BY key_ordinal ASC
FOR XML PATH('')), SPACE(0)) AS included_columns_compare
FROM sys.tables t
INNER JOIN sys.indexes i ON t.object_id = i.object_id
LEFT OUTER JOIN sys.dm_db_index_usage_stats ius ON i.object_id = ius.object_id AND i.index_id = ius.index_id AND ius.database_id = db_id()
INNER JOIN IndexSize iz ON i.object_id = iz.object_id AND i.index_id = iz.index_id
WHERE t.object_id = @ObjectID OR @ObjectID IS NULL
UNION ALL
SELECT
OBJECT_SCHEMA_NAME(mid.object_id) as schema_name
,OBJECT_NAME(mid.object_id) as table_name
,'--MISSING--'
,NULL
,'--NONCLUSTERED--'
,NULL
,(migs.user_seeks + migs.user_scans) * migs.avg_user_impact as impact
,0 as ranking
,migs.user_seeks + migs.user_scans as user_total
,NULL as user_total_pct
,migs.user_seeks
,migs.user_scans
,0 as user_lookups
,COALESCE(equality_columns + ', ', SPACE(0)) + COALESCE(inequality_columns, SPACE(0)) as indexed_columns
,included_columns
,mid.object_id
,NULL
,NULL
,NULL
FROM sys.dm_db_missing_index_details mid
INNER JOIN sys.dm_db_missing_index_groups mig ON mid.index_handle = mig.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats migs ON mig.index_group_handle = migs.group_handle
WHERE database_id = db_id()
AND (mid.object_id = @ObjectID OR @ObjectID IS NULL)
)
,ForeignKeys
AS (
SELECT fk.name + '|PARENT' AS name
,fkc.parent_object_id AS object_id
,STUFF((SELECT ', ' + QUOTENAME(c.name)
FROM sys.foreign_key_columns ifkc
INNER JOIN sys.columns c ON ifkc.parent_object_id = c.object_id AND ifkc.parent_column_id = c.column_id
WHERE fk.object_id = ifkc.constraint_object_id
ORDER BY ifkc.constraint_column_id
FOR XML PATH('')), 1, 2, '') AS fk_columns
,(SELECT QUOTENAME(ifkc.parent_column_id,'(')
FROM sys.foreign_key_columns ifkc
WHERE fk.object_id = ifkc.constraint_object_id
ORDER BY ifkc.constraint_column_id
FOR XML PATH('')) AS fk_columns_compare
FROM sys.foreign_keys fk
INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id
WHERE fkc.constraint_column_id = 1
AND (fkc.parent_object_id = @ObjectID OR @ObjectID IS NULL)
UNION ALL
SELECT fk.name + '|REFERENCED' as name
,fkc.referenced_object_id AS object_id
,STUFF((SELECT ', ' + QUOTENAME(c.name)
FROM sys.foreign_key_columns ifkc
INNER JOIN sys.columns c ON ifkc.referenced_object_id = c.object_id AND ifkc.referenced_column_id = c.column_id
WHERE fk.object_id = ifkc.constraint_object_id
ORDER BY ifkc.constraint_column_id
FOR XML PATH('')), 1, 2, '') AS fk_columns
,(SELECT QUOTENAME(ifkc.referenced_column_id,'(')
FROM sys.foreign_key_columns ifkc
WHERE fk.object_id = ifkc.constraint_object_id
ORDER BY ifkc.constraint_column_id
FOR XML PATH('')) AS fk_columns_compare
FROM sys.foreign_keys fk
INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id
WHERE fkc.constraint_column_id = 1
AND (fkc.referenced_object_id = @ObjectID OR @ObjectID IS NULL)
)
,MemoryBuffer
AS (
SELECT
obj.object_id
,obj.index_id
,COUNT(*)AS Buffered_Page_Count
,CONVERT(decimal(12,2), CAST(COUNT(*) as bigint)*CAST(8 as float)/1024) as buffer_mb
FROM sys.dm_os_buffer_descriptors AS bd
INNER JOIN
(
SELECT object_name(object_id) AS name
,index_id ,allocation_unit_id, object_id
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
ON au.container_id = p.hobt_id
AND (au.type = 1 OR au.type = 3)
UNION ALL
SELECT object_name(object_id) AS name
,index_id, allocation_unit_id, object_id
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
ON au.container_id = p.hobt_id
AND au.type = 2
) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id
WHERE database_id = db_id()
GROUP BY obj.object_id ,obj.index_id
)
, IndexAnalysis
AS (
SELECT ia.object_id
,ia.index_id
,schema_name
,table_name
,index_name
,is_unique
,type_desc
,size_in_mb
,impact
,ranking
,user_total
,user_total_pct
,CAST(100 * (user_seeks + user_scans + user_lookups)
/(NULLIF(SUM(user_seeks + user_scans + user_lookups)
OVER(PARTITION BY schema_name, table_name), 0) * 1.) as decimal(6,2)) as estimated_percent
,user_seeks
,user_scans
,user_lookups
,indexed_columns
,included_columns
,STUFF((SELECT ', ' + index_name AS [data()]
FROM preIndexAnalysis iia
WHERE ia.object_id = iia.object_id
AND ia.index_id <> iia.index_id
AND ia.indexed_columns_compare = iia.indexed_columns_compare
AND ia.included_columns_compare = iia.included_columns_compare
FOR XML PATH('')), 1, 2, '') AS duplicate_indexes
,STUFF((SELECT ', ' + index_name AS [data()]
FROM preIndexAnalysis iia
WHERE ia.object_id = iia.object_id
AND ia.index_id <> iia.index_id
AND (ia.indexed_columns_compare LIKE iia.indexed_columns_compare + '%'
OR iia.indexed_columns_compare LIKE ia.indexed_columns_compare + '%')
AND ia.indexed_columns_compare <> iia.indexed_columns_compare
FOR XML PATH('')), 1, 2, '') AS overlapping_indexes
,STUFF((SELECT ', ' + name AS [data()]
FROM ForeignKeys ifk
WHERE ifk.object_id = ia.object_id
AND ia.indexed_columns_compare LIKE ifk.fk_columns_compare + '%'
FOR XML PATH('')), 1, 2, '') AS related_foreign_keys
,CAST((SELECT name
FROM ForeignKeys
WHERE ForeignKeys.object_id = ia.object_id
AND ia.indexed_columns_compare LIKE ForeignKeys.fk_columns_compare + '%'
FOR XML AUTO) as xml) AS related_foreign_keys_xml
FROM preIndexAnalysis ia
UNION ALL
SELECT fk.object_id
,NULL
,OBJECT_SCHEMA_NAME(fk.object_id) AS schema_name
,OBJECT_NAME(fk.object_id) AS table_name
,fk.name AS index_name
,NULL
,'--MISSING FOREIGN KEY--'
,NULL
,NULL
,9999
,NULL
,NULL
,NULL
,NULL
,NULL
,NULL
,fk.fk_columns
,NULL
,NULL
,NULL
,NULL
,NULL
FROM ForeignKeys fk
LEFT OUTER JOIN preIndexAnalysis ia
ON fk.object_id = ia.object_id
AND ia.indexed_columns_compare LIKE fk.fk_columns_compare + '%'
WHERE ia.index_name IS NULL
)
SELECT schema_name
,table_name
,index_name
,type_desc
,CASE WHEN user_lookups > user_seeks AND type_desc IN ('CLUSTERED', 'HEAP', 'UNIQUE CLUSTERED') THEN 'REALIGN'
WHEN type_desc = '--MISSING FOREIGN KEY--' THEN 'CREATE'
WHEN type_desc = 'XML' THEN '---'
WHEN is_unique = 1 THEN '---'
WHEN type_desc = '--NONCLUSTERED--' AND ROW_NUMBER() OVER (PARTITION BY table_name ORDER BY user_total desc, ranking) > 10 THEN 'N/A'
WHEN ROW_NUMBER() OVER (PARTITION BY table_name ORDER BY user_total desc, ranking) > 10 THEN 'DROP'
WHEN user_total = 0 THEN 'DROP'
ELSE '---' END as index_action
,size_in_mb
,mb.buffer_mb
,CAST(100*mb.Buffer_MB/NULLIF(size_in_mb,0) AS decimal(6,2)) AS buffer_pct
,impact
,user_total
,user_total_pct
,estimated_percent
,user_seeks
,user_scans
,user_lookups
,indexed_columns
,included_columns
,duplicate_indexes
,overlapping_indexes
,related_foreign_keys
,related_foreign_keys_xml
FROM IndexAnalysis ia
LEFT OUTER JOIN MemoryBuffer mb ON ia.object_id = mb.object_id AND ia.index_id = mb.index_id
ORDER BY SUM(Buffer_MB) OVER (PARTITION BY ia.object_id) desc, table_name, user_total desc, buffer_mb

NOTE: Populating the table name in the function for the @ObjectID variable is optional.  If the value of @ObjectID is NULL then information on all tables is returned.

This version of the DMV query returns the following columns:

  • schema_name: Schema name
  • table_name: Table name
  • index_name: Index Name
  • type_desc: Type of index; either clustered or non-clustered.
  • index_action: Guidance on how to manage index.  Includes recommendation on whether to CREATE, DROP, or REALIGN the index.
  • size_in_mb: Then number of MB the index utilizes on disk.
  • buffer_mb: The number of MB in the SQL Server buffer being utilized by the index.
  • buffer_pct: The percentage of an index that is current in the SQL Server buffer.
  • impact: Calculation of impact of a potential index.  This is based on the seeks and scans that the index could have utilized multiplied by average improvement the index would have provided.  This is included only for missing indexes.
  • user_total: Total number of seek, scan, and lookup operations for the index
  • user_total_pct: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing indexes for the table.
  • estimated_percent: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing and potential indexes for the table.  This number is naturally skewed because a seek for potential Index A resulted in another operation on an existing index and both of these operations would be counted.
  • user_seeks: Number of seek operations on the index.
  • user_scans: Number of scan operations on the index.
  • user_lookups: Number of lookup operations on the index.
  • indexed_columns: Columns that are part of the index, missing index or foreign key.
  • included_columns: Columns that are included in the index or missing index.
  • duplicate_indexes: List of Indexes that exist on the table that are identical to the index on this row.
  • overlapping_indexes: List of Indexes that exist on the table that overlap the index on this row.
  • related_foreign_keys: List of foreign keys that are related to the index either as an exact match or covering index.
  • related_foreign_keys_xml: XML document listing foreign keys that are related to the index either as an exact match or covering index.

Keep in mind that the index_action column is only guidance.  It isn’t a 100% or 80% rule.  It is there to provide visual guidance on the interaction of the indexes on the table.

In some cases, I’ve used the REALIGN recommendation to change the clustered index on the table.  But doing this changes the order the table is sorted in and in a lot of cases this can lead to serious fragmentation of the table if the column that appears to be a better clustered index candidate is not ordered similarly to the current clustered index.  In other cases, the REALIGN recommendation has helped to indicate tables where indexes benefitted from adding INCLUDED columns to some of the non-clustered indexes on the table.

I did want to include a comment that I caught on twitter the other day, I’m glad other people are finding this index analysis query that I came up with on the flight to last year’s PASS Summit.

Twitter Compliment

Speaking of the PASS Summit… if you haven’t started planning for it you definitely should be giving it a good consideration.  Especially when you consider that you can find DBAs and Developers sharing lots of jager there.