Articles from April 2009



Analyze This – Analyze Your Indexes – Part 6

This is the sixth and last for now post in which I’m going through a query that can be used to analyze indexes.  Thus far the following has been added to the query:

  1. Combine existing index statistics and missing indexes into a single output
  2. Find existing duplicate and overlapping indexes
  3. Find the relationship between indexes and foreign keys
  4. Added size and current memory utilization
  5. Add in index analysis guidelines

If you’ve used the last few versions of this query you may have noticed that performance against databases with a large number of indexes wasn’t all that great.  Also, the query didn’t account for partitions which I had planned to resolve at a later date.  Oh, look – it’s a later date.

In the first few versions of the index analysis query, new features to the query were added through the use of CTEs.  The trouble with the way the query was put together was that it caused it to choke when performing as the volume of indexes being reviewed increased.  In a few sample executions, the performance has improved by an order of magnitude.  In one case the execution went from almost 3 minutes to 20 seconds.

This version of the index analysis query also now works with SQL Server 2005 RTM.  There was a few functions being used that were added in SQL Server 2005 SP 2.  Those have been removed to allow for compatibility back to SQL Server 2005 RTM.

A new column that flags whether of not a table has a unique index on it was added.  This was added based on a request in a comment a couple posts ago.

Besides fixing the items listed above, there was a need to review indexes to check for blocking.  To do this the DMV sys.dm_db_index_operational_stats should be used.  This DMV provides I/O, locking, latching, and access method activity for each partition of a table or index.  Only blocking statistics have been with the index analysis at this time.

The last thing that could have been added to the query was fragmentation information from sys.dm_db_index_physical_stats.  This was not included because the execution of this DMV is invasive even when using SIMPLE mode.

The new version of the query is as follows:

DECLARE @ObjectID int
,@DB_ID int

SELECT @ObjectID = OBJECT_ID('')
,@DB_ID = db_id()

IF OBJECT_ID('tempdb..#IndexBaseLine') IS NOT NULL
DROP TABLE #IndexBaseLine

CREATE TABLE #IndexBaseLine
(
row_id int IDENTITY(1,1)
,index_action varchar(10)
,schema_id int
,schema_name sysname
,object_id int
,table_name sysname
,index_id int
,index_name nvarchar(128)
,is_unique bit DEFAULT(0)
,has_unique bit DEFAULT(0)
,type_desc nvarchar(67)
,partition_number int
,reserved_page_count bigint
,size_in_mb decimal(12, 2)
,buffered_page_count int
,buffer_mb decimal(12, 2)
,pct_in_buffer decimal(12, 2)
,table_buffer_mb decimal(12, 2)
,row_count bigint
,impact int
,existing_ranking bigint
,user_total bigint
,user_total_pct decimal(6, 2)
,estimated_user_total_pct decimal(6, 2)
,user_seeks bigint
,user_scans bigint
,user_lookups bigint
,row_lock_count bigint
,row_lock_wait_count bigint
,row_lock_wait_in_ms bigint
,row_block_pct decimal(6, 2)
,avg_row_lock_waits_ms bigint
,indexed_columns nvarchar(max)
,included_columns nvarchar(max)
,indexed_columns_compare nvarchar(max)
,included_columns_compare nvarchar(max)
,duplicate_indexes nvarchar(max)
,overlapping_indexes nvarchar(max)
,related_foreign_keys nvarchar(max)
,related_foreign_keys_xml xml
)

IF OBJECT_ID('tempdb..#ForeignKeys') IS NOT NULL
DROP TABLE #ForeignKeys

CREATE TABLE #ForeignKeys
(
foreign_key_name sysname
,object_id int
,fk_columns nvarchar(max)
,fk_columns_compare nvarchar(max)
)

;WITH AllocationUnits
AS (
SELECT p.object_id
,p.index_id
,p.partition_number
,au.allocation_unit_id
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND (au.type = 1 OR au.type = 3)
UNION ALL
SELECT p.object_id
,p.index_id
,p.partition_number
,au.allocation_unit_id
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p ON au.container_id = p.partition_id AND au.type = 2
),MemoryBuffer
AS (
SELECT au.object_id
,au.index_id
,au.partition_number
,COUNT(*)AS buffered_page_count
,CONVERT(decimal(12,2), CAST(COUNT(*) as bigint)*CAST(8 as float)/1024) as buffer_mb
FROM sys.dm_os_buffer_descriptors AS bd
INNER JOIN AllocationUnits au ON bd.allocation_unit_id = au.allocation_unit_id
WHERE bd.database_id = db_id()
GROUP BY au.object_id, au.index_id, au.partition_number
)
INSERT INTO #IndexBaseLine
(schema_id, schema_name, object_id, table_name, index_id, index_name, is_unique, type_desc, partition_number, reserved_page_count, size_in_mb, buffered_page_count, buffer_mb, pct_in_buffer, row_count, existing_ranking, user_total, user_total_pct, user_seeks, user_scans, user_lookups, row_lock_count, row_lock_wait_count, row_lock_wait_in_ms, row_block_pct, avg_row_lock_waits_ms, indexed_columns, included_columns, indexed_columns_compare, included_columns_compare)
SELECT s.schema_id
,s.name as schema_name
,t.object_id
,t.name as table_name
,i.index_id
,COALESCE(i.name, 'N/A') as index_name
,i.is_unique
,CASE WHEN i.is_unique = 1 THEN 'UNIQUE ' ELSE '' END + i.type_desc as type_desc
,ps.partition_number
,ps.reserved_page_count
,CAST(reserved_page_count * CAST(8 as float) / 1024 as decimal(12,2)) as size_in_mb
,mb.buffered_page_count
,mb.buffer_mb
,CAST(100*buffer_mb/NULLIF(CAST(reserved_page_count * CAST(8 as float) / 1024 as decimal(12,2)),0) AS decimal(12,2)) as pct_in_buffer
,row_count
,ROW_NUMBER()
OVER (PARTITION BY i.object_id ORDER BY i.is_primary_key desc, ius.user_seeks + ius.user_scans + ius.user_lookups desc) as existing_ranking
,ius.user_seeks + ius.user_scans + ius.user_lookups as user_total
,COALESCE(CAST(100 * (ius.user_seeks + ius.user_scans + ius.user_lookups)
/(NULLIF(SUM(ius.user_seeks + ius.user_scans + ius.user_lookups)
OVER(PARTITION BY i.object_id), 0) * 1.) as decimal(6,2)),0) as user_total_pct
,ius.user_seeks
,ius.user_scans
,ius.user_lookups
,ios.row_lock_count
,ios.row_lock_wait_count
,ios.row_lock_wait_in_ms
,CAST(100.0 * ios.row_lock_wait_count/NULLIF(ios.row_lock_count, 0) AS decimal(12,2)) AS row_block_pct
,CAST(1. * ios.row_lock_wait_in_ms /NULLIF(ios.row_lock_wait_count, 0) AS decimal(12,2)) AS avg_row_lock_waits_ms
,STUFF((SELECT ', ' + QUOTENAME(c.name)
FROM sys.index_columns ic
INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
WHERE i.object_id = ic.object_id
AND i.index_id = ic.index_id
AND is_included_column = 0
ORDER BY key_ordinal ASC
FOR XML PATH('')), 1, 2, '') AS indexed_columns
,STUFF((SELECT ', ' + QUOTENAME(c.name)
FROM sys.index_columns ic
INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
WHERE i.object_id = ic.object_id
AND i.index_id = ic.index_id
AND is_included_column = 1
ORDER BY key_ordinal ASC
FOR XML PATH('')), 1, 2, '') AS included_columns
,(SELECT QUOTENAME(ic.column_id,'(')
FROM sys.index_columns ic
WHERE i.object_id = ic.object_id
AND i.index_id = ic.index_id
AND is_included_column = 0
ORDER BY key_ordinal ASC
FOR XML PATH('')) AS indexed_columns_compare
,COALESCE((SELECT QUOTENAME(ic.column_id, '(')
FROM sys.index_columns ic
WHERE i.object_id = ic.object_id
AND i.index_id = ic.index_id
AND is_included_column = 1
ORDER BY key_ordinal ASC
FOR XML PATH('')), SPACE(0)) AS included_columns_compare
FROM sys.tables t
INNER JOIN sys.schemas s ON t.schema_id = s.schema_id
INNER JOIN sys.indexes i ON t.object_id = i.object_id
INNER JOIN sys.dm_db_partition_stats ps ON i.object_id = ps.object_id AND i.index_id = ps.index_id
LEFT OUTER JOIN sys.dm_db_index_usage_stats ius ON i.object_id = ius.object_id AND i.index_id = ius.index_id AND ius.database_id = db_id()
LEFT OUTER JOIN sys.dm_db_index_operational_stats(@DB_ID, NULL, NULL, NULL) ios ON ps.object_id = ios.object_id AND ps.index_id = ios.index_id AND ps.partition_number = ios.partition_number
LEFT OUTER JOIN MemoryBuffer mb ON ps.object_id = mb.object_id AND ps.index_id = mb.index_id AND ps.partition_number = mb.partition_number
WHERE t.object_id = @ObjectID OR @ObjectID IS NULL

INSERT INTO #IndexBaseLine
(schema_id, schema_name, object_id, table_name, index_name, type_desc, impact, existing_ranking, user_total, user_seeks, user_scans, user_lookups, indexed_columns, included_columns)
SELECT s.schema_id
,s.name AS schema_name
,t.object_id
,t.name AS table_name
,'--MISSING--' AS index_name
,'--NONCLUSTERED--' AS type_desc
,(migs.user_seeks + migs.user_scans) * migs.avg_user_impact as impact
,0 AS existing_ranking
,migs.user_seeks + migs.user_scans as user_total
,migs.user_seeks
,migs.user_scans
,0 as user_lookups
,COALESCE(equality_columns + ', ', SPACE(0)) + COALESCE(inequality_columns, SPACE(0)) as indexed_columns
,included_columns
FROM sys.tables t
INNER JOIN sys.schemas s ON t.schema_id = s.schema_id
INNER JOIN sys.dm_db_missing_index_details mid ON t.object_id = mid.object_id
INNER JOIN sys.dm_db_missing_index_groups mig ON mid.index_handle = mig.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats migs ON mig.index_group_handle = migs.group_handle
WHERE mid.database_id = db_id()
AND (mid.object_id = @ObjectID OR @ObjectID IS NULL)

INSERT INTO #ForeignKeys
(foreign_key_name, object_id, fk_columns, fk_columns_compare)
SELECT fk.name + '|PARENT' AS foreign_key_name
,fkc.parent_object_id AS object_id
,STUFF((SELECT ', ' + QUOTENAME(c.name)
FROM sys.foreign_key_columns ifkc
INNER JOIN sys.columns c ON ifkc.parent_object_id = c.object_id AND ifkc.parent_column_id = c.column_id
WHERE fk.object_id = ifkc.constraint_object_id
ORDER BY ifkc.constraint_column_id
FOR XML PATH('')), 1, 2, '') AS fk_columns
,(SELECT QUOTENAME(ifkc.parent_column_id,'(')
FROM sys.foreign_key_columns ifkc
WHERE fk.object_id = ifkc.constraint_object_id
ORDER BY ifkc.constraint_column_id
FOR XML PATH('')) AS fk_columns_compare
FROM sys.foreign_keys fk
INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id
WHERE fkc.constraint_column_id = 1
AND (fkc.parent_object_id = @ObjectID OR @ObjectID IS NULL)
UNION ALL
SELECT fk.name + '|REFERENCED' as foreign_key_name
,fkc.referenced_object_id AS object_id
,STUFF((SELECT ', ' + QUOTENAME(c.name)
FROM sys.foreign_key_columns ifkc
INNER JOIN sys.columns c ON ifkc.referenced_object_id = c.object_id AND ifkc.referenced_column_id = c.column_id
WHERE fk.object_id = ifkc.constraint_object_id
ORDER BY ifkc.constraint_column_id
FOR XML PATH('')), 1, 2, '') AS fk_columns
,(SELECT QUOTENAME(ifkc.referenced_column_id,'(')
FROM sys.foreign_key_columns ifkc
WHERE fk.object_id = ifkc.constraint_object_id
ORDER BY ifkc.constraint_column_id
FOR XML PATH('')) AS fk_columns_compare
FROM sys.foreign_keys fk
INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id
WHERE fkc.constraint_column_id = 1
AND (fkc.referenced_object_id = @ObjectID OR @ObjectID IS NULL)

UPDATE ibl
SET duplicate_indexes = STUFF((SELECT ', ' + index_name AS [data()]
FROM #IndexBaseLine iibl
WHERE ibl.object_id = iibl.object_id
AND ibl.index_id <> iibl.index_id
AND ibl.indexed_columns_compare = iibl.indexed_columns_compare
AND ibl.included_columns_compare = iibl.included_columns_compare
FOR XML PATH('')), 1, 2, '')
,overlapping_indexes = STUFF((SELECT ', ' + index_name AS [data()]
FROM #IndexBaseLine iibl
WHERE ibl.object_id = iibl.object_id
AND ibl.index_id <> iibl.index_id
AND (ibl.indexed_columns_compare LIKE iibl.indexed_columns_compare + '%'
OR iibl.indexed_columns_compare LIKE ibl.indexed_columns_compare + '%')
AND ibl.indexed_columns_compare <> iibl.indexed_columns_compare
FOR XML PATH('')), 1, 2, '')
,related_foreign_keys = STUFF((SELECT ', ' + foreign_key_name AS [data()]
FROM #ForeignKeys ifk
WHERE ifk.object_id = ibl.object_id
AND ibl.indexed_columns_compare LIKE ifk.fk_columns_compare + '%'
FOR XML PATH('')), 1, 2, '')
,related_foreign_keys_xml = CAST((SELECT foreign_key_name
FROM #ForeignKeys ForeignKeys
WHERE ForeignKeys.object_id = ibl.object_id
AND ibl.indexed_columns_compare LIKE ForeignKeys.fk_columns_compare + '%'
FOR XML AUTO) as xml)
FROM #IndexBaseLine ibl

INSERT INTO #IndexBaseLine
(schema_id, schema_name, object_id, table_name, index_name, type_desc, existing_ranking, indexed_columns)
SELECT s.schema_id
,s.name AS schema_name
,t.object_id
,t.name AS table_name
,fk.foreign_key_name AS index_name
,'--MISSING FOREIGN KEY--' as type_desc
,9999
,fk.fk_columns
FROM sys.tables t
INNER JOIN sys.schemas s ON t.schema_id = s.schema_id
INNER JOIN #ForeignKeys fk ON t.object_id = fk.object_id
LEFT OUTER JOIN #IndexBaseLine ia ON fk.object_id = ia.object_id AND ia.indexed_columns_compare LIKE fk.fk_columns_compare + '%'
WHERE ia.index_name IS NULL

;WITH Aggregation
AS (
SELECT row_id
,CAST(100. * (user_seeks + user_scans + user_lookups)
/(NULLIF(SUM(user_seeks + user_scans + user_lookups)
OVER(PARTITION BY schema_name, table_name), 0) * 1.) as decimal(12,2)) AS estimated_user_total_pct
,SUM(buffer_mb) OVER(PARTITION BY schema_name, table_name) as table_buffer_mb
FROM #IndexBaseLine
)
UPDATE ibl
SET estimated_user_total_pct = COALESCE(a.estimated_user_total_pct, 0)
,table_buffer_mb = a.table_buffer_mb
FROM #IndexBaseLine ibl
INNER JOIN Aggregation a ON ibl.row_id = a.row_id

;WITH IndexAction
AS (
SELECT row_id
,CASE WHEN user_lookups > user_seeks AND type_desc IN ('CLUSTERED', 'HEAP', 'UNIQUE CLUSTERED') THEN 'REALIGN'
WHEN type_desc = '--MISSING FOREIGN KEY--' THEN 'CREATE'
WHEN type_desc = 'XML' THEN '---'
WHEN is_unique = 1 THEN '---'
WHEN type_desc = '--NONCLUSTERED--' AND ROW_NUMBER() OVER (PARTITION BY table_name ORDER BY user_total desc) <= 10 AND estimated_user_total_pct > 1 THEN 'CREATE'
WHEN type_desc = '--NONCLUSTERED--' THEN 'BLEND'
WHEN ROW_NUMBER() OVER (PARTITION BY table_name ORDER BY user_total desc, existing_ranking) > 10 THEN 'DROP'
WHEN user_total = 0 THEN 'DROP'
ELSE '---' END AS index_action
FROM #IndexBaseLine
)
UPDATE ibl
SET index_action = ia.index_action
FROM #IndexBaseLine ibl INNER JOIN IndexAction ia
ON ibl.row_id = ia.row_id

UPDATE ibl
SET has_unique = 1
FROM #IndexBaseLine ibl
INNER JOIN (SELECT DISTINCT object_id FROM sys.indexes i WHERE i.is_unique = 1) x ON ibl.object_id = x.object_id

SELECT
index_action
,schema_name + '.' + table_name as object_name
,index_name
,is_unique
,has_unique
,type_desc
,size_in_mb
,buffer_mb
,pct_in_buffer
,table_buffer_mb
,row_count
,user_total
,user_total_pct
,estimated_user_total_pct
,user_seeks
,user_scans
,user_lookups
,row_lock_count
,row_lock_wait_count
,row_lock_wait_in_ms
,row_block_pct
,avg_row_lock_waits_ms
,indexed_columns
,included_columns
,duplicate_indexes
,overlapping_indexes
,related_foreign_keys
,related_foreign_keys_xml
FROM #IndexBaseLine
ORDER BY table_buffer_mb DESC, object_id, user_total DESC

NOTE: Populating the table name in the function for the @ObjectID variable is optional.  If the value of @ObjectID is NULL then information on all tables is returned.

This version of the query populates the following columns:

  • row_id: Row identifier used for populating the table
  • index_action: Analysis recommendation on action to take on the index
    • CREATE: Recommend adding the index to the table.
    • DROP: Recommend dropping the index from the table
    • BLEND: Review the missing index details to see if the missing index details can be added to an existing index.
    • REALIGN: Bookmark lookups on the index exceed the number of seeks on the table.  Recommend investigating whether to move the clustered index to another index or add included columns to the indexes that are part of the bookmark lookups.
  • schema_id: Schema ID
  • schema_name: Name of the schema.
  • object_id: Object ID
  • table_name: Name of the table name
  • index_id: Index ID
  • index_name: Name of the index.
  • is_unique: Flag indicating whether an index has a unique index.
  • has_unique: Flag indicating whether the table has a unique index.
  • type_desc: Type of index; either clustered or non-clustered.
  • partition_number: Partition number.
  • reserved_page_count: Total number of pages reserved for the index.
  • size_in_mb: The amount of space in MB the index utilizes on disk.
  • buffered_page_count: Total number of pages in the buffer for the index.
  • buffer_mb: The amount of space in MB in the buffer for the index.
  • pct_in_buffer: The percentage of an index that is current in the SQL Server buffer.
  • table_buffer_mb: The amount of space in MB in the SQL Server buffer that is being utilized by the table.
  • row_count: Number of rows in the index.
  • impact: Calculation of impact of a potential index.  This is based on the seeks and scans that the index could have utilized multiplied by average improvement the index would have provided.  This is included only for missing indexes.
  • existing_ranking: Ranking of the existing indexes ordered by user_total descending across the indexes for the table.
  • user_total: Total number of seek, scan, and lookup operations for the index.
  • user_total_pct: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing indexes for the table.
  • estimated_user_total_pct: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing and potential indexes for the table.  This number is naturally skewed because a seek for potential Index A resulted in another operation on an existing index and both of these operations would be counted.
  • user_seeks: Number of seek operations on the index.
  • user_scans: Number of scan operations on the index.
  • user_lookups: Number of lookup operations on the index.
  • row_lock_count: Cumulative number of row locks requested.
  • row_lock_wait_count: Cumulative number of times the Database Engine waited on a row lock.
  • row_lock_wait_in_ms: Total number of milliseconds the Database Engine waited on a row lock.
  • row_block_pct: Percentage of row locks that encounter waits on a row lock.
  • avg_row_lock_waits_ms: Average number of milliseconds the Database Engine waited on a row lock.
  • indexed_columns: Columns that are part of the index, missing index or foreign key.
  • included_columns: Columns that are included in the index or missing index.
  • indexed_columns_compare: Column IDs that are part of the index, missing index or foreign key
  • included_columns_compare: Column IDs that are included in the index or missing index.
  • duplicate_indexes: List of Indexes that exist on the table that are identical to the index on this row.
  • overlapping_indexes: List of Indexes that exist on the table that overlap the index on this row.
  • related_foreign_keys: List of foreign keys that are related to the index either as an exact match or covering index.
  • related_foreign_keys_xml: XML document listing foreign keys that are related to the index either as an exact match or covering index.

As I mentioned in a previous post, when looking at the index_action column don’t take a literal translation of the recommendation.  These are put together from my personal experience and used as a pointer towards a direction versus a specific directive.

Hopefully breaking apart the index analysis query into multiple posts has helped people grasp why each piece was added and learn a little bit more about the underlying DMVs and catalog views used to put this together.

In future posts some examples will be put together to look at how to take snapshots of indexes with this query.  And also to take the recommendations and implement them to improve the indexing on databases.

EDIT May 18, 2009: Fixed error in aggregating type 2 sys.partition data.

Beta Exam 71-680: TS: Windows 7, Configuring

Microsoft has announced that the Beta exam for 71-680: TS: Windows 7, Configuring has been opened up.  Beta exams are free to take are if you are planning to take the exam at a later date, now is as good a time as any… well a little studying first might help.

It doesn’t appear that the outline for the exam is up on Microsoft Learning yet, but you can get a lot of information from the Windows 7 – Learning Portal.

The beta exam is open through May 18th and I really recommend taking it to help vet out the exam and… well, it’s free.  Be sure to be as wordy as possible in the exam comment section the people that work on the exam put a lot of time into reading comments and adjusting the exams around those comments.

Analyze This – Analyze Your Indexes – Part 5

This is the fifth post in which I am walking through a query that I use to analyze indexes.  Thus far I have added to the following to the query that I use:

  1. Combine existing index statistics and missing indexes into a single output (blog)
  2. Find existing duplicate and overlapping indexes (blog)
  3. Find the relationship between indexes and foreign keys (blog)
  4. Added size and current memory utilization (blog)

While testing the current version of the index analysis query I found an issue with the foreign key portion of the filtering that has been fixed.  Where the code was:

WHERE fkc.constraint_column_id = 1

AND (fkc.parent_object_id = @ObjectID OR @ObjectID IS NULL)

It should have been:

WHERE fkc.constraint_column_id = 1

AND (fkc.referenced_object_id = @ObjectID OR @ObjectID IS NULL)

For the next version of the index analysis query I wanted to build in some guidance on how to handle indexes on the table.  There is a couple things I wanted to be able to tell at a glance without much thought.

  1. When could clustered indexes be realigned on a table to improve performance on a table.  This idea came about after reading this article on MSSQLTips.  The logic used evaluates user_lookups against user_seeks.
  2. When the index count exceed 10 indexes on the table flag the indexes to be dropped.  At some point, I read an index best practices the provided guidance around 5-10 indexes per table in OLTP environments.  If anyone has a link to an article on this please let me know, the one I read I lost track of.
  3. Any indexes that could be built for foreign keys should be flagged to be created.

This information is found in a new columns called index_action.  The new version of the query is as follows:

DECLARE @ObjectID int

SELECT @ObjectID = OBJECT_ID('')

 

;WITH IndexSize

AS(

    SELECT object_id

        ,index_id

        ,CAST((SUM(reserved_page_count) * CAST(8 as float))/1024 as decimal(12,2)) as size_in_mb

    FROM sys.dm_db_partition_stats

    GROUP BY object_id, index_id

)

,preIndexAnalysis

AS (

    SELECT 

        OBJECT_SCHEMA_NAME(t.object_id) as schema_name

        ,t.name as table_name

        ,COALESCE(i.name, 'N/A') as index_name

        ,i.is_unique

        ,CASE WHEN i.is_unique = 1 THEN 'UNIQUE ' ELSE '' END + i.type_desc as type_desc

        ,iz.size_in_mb

        ,NULL as impact

        ,ROW_NUMBER() 

            OVER (PARTITION BY i.object_id ORDER BY i.is_primary_key desc, ius.user_seeks + ius.user_scans + ius.user_lookups desc) as ranking

        ,ius.user_seeks + ius.user_scans + ius.user_lookups as user_total

        ,COALESCE(CAST(100 * (ius.user_seeks + ius.user_scans + ius.user_lookups)

            /(NULLIF(SUM(ius.user_seeks + ius.user_scans + ius.user_lookups) 

            OVER(PARTITION BY i.object_id), 0) * 1.) as decimal(6,2)),0) as user_total_pct

        ,ius.user_seeks

        ,ius.user_scans

        ,ius.user_lookups

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.index_columns ic

                    INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 0

                ORDER BY key_ordinal ASC

                FOR XML PATH('')), 1, 2, '') AS indexed_columns

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.index_columns ic

                    INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 1

                ORDER BY key_ordinal ASC

                FOR XML PATH('')), 1, 2, '') AS included_columns

        ,i.object_id

        ,i.index_id

        ,(SELECT QUOTENAME(ic.column_id,'(')

                FROM sys.index_columns ic

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 0

                ORDER BY key_ordinal ASC

                FOR XML PATH('')) AS indexed_columns_compare

        ,COALESCE((SELECT QUOTENAME(ic.column_id, '(')

                FROM sys.index_columns ic

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 1

                ORDER BY key_ordinal ASC

                FOR XML PATH('')), SPACE(0)) AS included_columns_compare

    FROM sys.tables t

        INNER JOIN sys.indexes i ON t.object_id = i.object_id

        LEFT OUTER JOIN sys.dm_db_index_usage_stats ius ON i.object_id = ius.object_id AND i.index_id = ius.index_id AND ius.database_id = db_id()

        INNER JOIN IndexSize iz ON i.object_id = iz.object_id AND i.index_id = iz.index_id

    WHERE t.object_id = @ObjectID OR @ObjectID IS NULL

    UNION ALL

    SELECT 

        OBJECT_SCHEMA_NAME(mid.object_id) as schema_name

        ,OBJECT_NAME(mid.object_id) as table_name

        ,'--MISSING--'

        ,NULL

        ,'--NONCLUSTERED--'

        ,NULL

        ,(migs.user_seeks + migs.user_scans) * migs.avg_user_impact as impact

        ,0 as ranking

        ,migs.user_seeks + migs.user_scans as user_total

        ,NULL as user_total_pct

        ,migs.user_seeks 

        ,migs.user_scans

        ,0 as user_lookups

        ,COALESCE(equality_columns + ', ', SPACE(0)) + COALESCE(inequality_columns, SPACE(0)) as indexed_columns

        ,included_columns

        ,mid.object_id

        ,NULL

        ,NULL

        ,NULL

    FROM sys.dm_db_missing_index_details mid

        INNER JOIN sys.dm_db_missing_index_groups mig ON mid.index_handle = mig.index_handle

        INNER JOIN sys.dm_db_missing_index_group_stats migs ON mig.index_group_handle = migs.group_handle

    WHERE database_id = db_id()

    AND (mid.object_id = @ObjectID OR @ObjectID IS NULL)

)

,ForeignKeys

AS (

    SELECT fk.name + '|PARENT' AS name

        ,fkc.parent_object_id AS object_id

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.foreign_key_columns ifkc

                    INNER JOIN sys.columns c ON ifkc.parent_object_id = c.object_id AND ifkc.parent_column_id = c.column_id

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')), 1, 2, '') AS fk_columns

        ,(SELECT QUOTENAME(ifkc.parent_column_id,'(')

                FROM sys.foreign_key_columns ifkc

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')) AS fk_columns_compare

    FROM sys.foreign_keys fk

        INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id

    WHERE fkc.constraint_column_id = 1

    AND (fkc.parent_object_id = @ObjectID OR @ObjectID IS NULL)

    UNION ALL

    SELECT fk.name + '|REFERENCED' as name

        ,fkc.referenced_object_id AS object_id

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.foreign_key_columns ifkc

                    INNER JOIN sys.columns c ON ifkc.referenced_object_id = c.object_id AND ifkc.referenced_column_id = c.column_id

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')), 1, 2, '') AS fk_columns

        ,(SELECT QUOTENAME(ifkc.referenced_column_id,'(')

                FROM sys.foreign_key_columns ifkc

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')) AS fk_columns_compare

    FROM sys.foreign_keys fk

        INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id

    WHERE fkc.constraint_column_id = 1

    AND (fkc.referenced_object_id = @ObjectID OR @ObjectID IS NULL)

)

,MemoryBuffer

AS (

    SELECT 

        obj.object_id

        ,obj.index_id

        ,COUNT(*)AS Buffered_Page_Count

        ,CONVERT(decimal(12,2), CAST(COUNT(*) as bigint)*CAST(8 as float)/1024) as buffer_mb

    FROM sys.dm_os_buffer_descriptors AS bd 

        INNER JOIN 

        (

            SELECT object_name(object_id) AS name 

                ,index_id ,allocation_unit_id, object_id

            FROM sys.allocation_units AS au

                INNER JOIN sys.partitions AS p 

                    ON au.container_id = p.hobt_id 

                        AND (au.type = 1 OR au.type = 3)

            UNION ALL

            SELECT object_name(object_id) AS name   

                ,index_id, allocation_unit_id, object_id

            FROM sys.allocation_units AS au

                INNER JOIN sys.partitions AS p 

                    ON au.container_id = p.hobt_id 

                        AND au.type = 2

        ) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id

    WHERE database_id = db_id()

    GROUP BY obj.object_id ,obj.index_id

)

, IndexAnalysis

AS (

    SELECT ia.object_id

        ,ia.index_id

        ,schema_name

        ,table_name

        ,index_name

        ,is_unique

        ,type_desc

        ,size_in_mb

        ,impact

        ,ranking

        ,user_total

        ,user_total_pct

        ,CAST(100 * (user_seeks + user_scans + user_lookups)

            /(NULLIF(SUM(user_seeks + user_scans + user_lookups) 

            OVER(PARTITION BY schema_name, table_name), 0) * 1.) as decimal(6,2)) as estimated_percent

        ,user_seeks

        ,user_scans

        ,user_lookups

        ,indexed_columns

        ,included_columns

        ,STUFF((SELECT ', ' + index_name AS [data()]

            FROM preIndexAnalysis iia

            WHERE ia.object_id = iia.object_id

            AND ia.index_id <> iia.index_id

            AND ia.indexed_columns_compare = iia.indexed_columns_compare

            AND ia.included_columns_compare = iia.included_columns_compare

            FOR XML PATH('')), 1, 2, '') AS duplicate_indexes

        ,STUFF((SELECT ', ' + index_name AS [data()]

            FROM preIndexAnalysis iia

            WHERE ia.object_id = iia.object_id

            AND ia.index_id <> iia.index_id

            AND (ia.indexed_columns_compare LIKE iia.indexed_columns_compare + '%' 

                OR iia.indexed_columns_compare LIKE ia.indexed_columns_compare + '%')

            AND ia.indexed_columns_compare <> iia.indexed_columns_compare 

            FOR XML PATH('')), 1, 2, '') AS overlapping_indexes

        ,STUFF((SELECT ', ' + name AS [data()]

            FROM ForeignKeys ifk

            WHERE ifk.object_id = ia.object_id

            AND ia.indexed_columns_compare LIKE ifk.fk_columns_compare + '%'

            FOR XML PATH('')), 1, 2, '') AS related_foreign_keys

        ,CAST((SELECT name

            FROM ForeignKeys

            WHERE ForeignKeys.object_id = ia.object_id

            AND ia.indexed_columns_compare LIKE ForeignKeys.fk_columns_compare + '%'

            FOR XML AUTO) as xml) AS related_foreign_keys_xml

    FROM preIndexAnalysis ia

    UNION ALL

    SELECT fk.object_id

        ,NULL

        ,OBJECT_SCHEMA_NAME(fk.object_id) AS schema_name

        ,OBJECT_NAME(fk.object_id) AS table_name

        ,fk.name AS index_name

        ,NULL

        ,'--MISSING FOREIGN KEY--'

        ,NULL

        ,NULL 

        ,9999

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL

        ,fk.fk_columns

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL

    FROM ForeignKeys fk

        LEFT OUTER JOIN preIndexAnalysis ia

    ON fk.object_id = ia.object_id

    AND ia.indexed_columns_compare LIKE fk.fk_columns_compare + '%'

    WHERE ia.index_name IS NULL

)

SELECT schema_name

    ,table_name

    ,index_name

    ,type_desc

    ,CASE WHEN user_lookups > user_seeks AND type_desc IN ('CLUSTERED', 'HEAP', 'UNIQUE CLUSTERED') THEN 'REALIGN'

        WHEN type_desc = '--MISSING FOREIGN KEY--' THEN 'CREATE'

        WHEN type_desc = 'XML' THEN '---'

        WHEN is_unique = 1 THEN '---'

        WHEN type_desc = '--NONCLUSTERED--' AND ROW_NUMBER() OVER (PARTITION BY table_name ORDER BY user_total desc, ranking) > 10 THEN 'N/A'

        WHEN ROW_NUMBER() OVER (PARTITION BY table_name ORDER BY user_total desc, ranking) > 10 THEN 'DROP' 

        WHEN user_total = 0 THEN 'DROP' 

        ELSE '---' END as index_action

    ,size_in_mb

    ,mb.buffer_mb

    ,CAST(100*mb.Buffer_MB/NULLIF(size_in_mb,0) AS decimal(6,2)) AS buffer_pct

    ,impact

    ,user_total

    ,user_total_pct

    ,estimated_percent

    ,user_seeks

    ,user_scans

    ,user_lookups

    ,indexed_columns

    ,included_columns

    ,duplicate_indexes

    ,overlapping_indexes

    ,related_foreign_keys

    ,related_foreign_keys_xml

FROM IndexAnalysis ia

    LEFT OUTER JOIN MemoryBuffer mb ON ia.object_id = mb.object_id AND ia.index_id = mb.index_id

ORDER BY SUM(Buffer_MB) OVER (PARTITION BY ia.object_id) desc, table_name, user_total desc, buffer_mb

NOTE: Populating the table name in the function for the @ObjectID variable is optional.  If the value of @ObjectID is NULL then information on all tables is returned.

This version of the DMV query returns the following columns:

  • schema_name: Schema name
  • table_name: Table name
  • index_name: Index Name
  • type_desc: Type of index; either clustered or non-clustered.
  • index_action:  Guidance on how to manage index.  Includes recommendation on whether to CREATE, DROP, or REALIGN the index. 
  • size_in_mb: Then number of MB the index utilizes on disk.
  • buffer_mb: The number of MB in the SQL Server buffer being utilized by the index.
  • buffer_pct: The percentage of an index that is current in the SQL Server buffer.
  • impact: Calculation of impact of a potential index.  This is based on the seeks and scans that the index could have utilized multiplied by average improvement the index would have provided.  This is included only for missing indexes.
  • user_total: Total number of seek, scan, and lookup operations for the index
  • user_total_pct: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing indexes for the table.
  • estimated_percent: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing and potential indexes for the table.  This number is naturally skewed because a seek for potential Index A resulted in another operation on an existing index and both of these operations would be counted.
  • user_seeks: Number of seek operations on the index.
  • user_scans: Number of scan operations on the index.
  • user_lookups: Number of lookup operations on the index.
  • indexed_columns: Columns that are part of the index, missing index or foreign key.
  • included_columns: Columns that are included in the index or missing index.
  • duplicate_indexes: List of Indexes that exist on the table that are identical to the index on this row.
  • overlapping_indexes: List of Indexes that exist on the table that overlap the index on this row.
  • related_foreign_keys: List of foreign keys that are related to the index either as an exact match or covering index.
  • related_foreign_keys_xml: XML document listing foreign keys that are related to the index either as an exact match or covering index.

Keep in mind that the index_action column is only guidance.  It isn’t a 100% or 80% rule.  It is there to provide visual guidance on the interaction of the indexes on the table.

In some cases, I’ve used the REALIGN recommendation to change the clustered index on the table.  But doing this changes the order the table is sorted in and in a lot of cases this can lead to serious fragmentation of the table if the column that appears to be a better clustered index candidate is not ordered similarly to the current clustered index.  In other cases, the REALIGN recommendation has helped to indicate tables where indexes benefitted from adding INCLUDED columns to some of the non-clustered indexes on the table.

I did want to include a comment that I caught on twitter the other day, I’m glad other people are finding this index analysis query that I came up with on the flight to last year’s PASS Summit.

CropperCapture[16]

Speaking of the PASS Summit… if you haven’t started planning for it you definitely should be giving it a good consideration.  Especially when you consider that you can find DBAs and Developers sharing lots of jager there.

Find Query Plans That May Utilize Parallelism


Burnt Server

When ever I go to a new client to assist with performance issues, I inevitably download the Troubleshooting Performance Problems in SLQ Server 2005 white paper.  In my opinion and many others, it is one of those documents that should be at the click of a button to open and I keep it handy on my desktop.

One of the most common wait types that I observe at clients is CXPACKET, known on the street as parallelism.  And since most of these are OLTP database systems, there is a good need for looking into the causes of said wait type.

When I’ve identified that parallelism is the reason for the server running hot and hopefully not catching on fire I start to look at what is running or has run on the server that is utilizing parallelism.  When I first starting to look for execution plans that might use parallelism, I used the query in the white paper (mentioned above) that would return back a list of all of the plans that are currently in the procedure cache that would utilize parallelism if they were selected.

That query is:

SELECT 

    p.*, 

    q.*,

    cp.plan_handle

FROM sys.dm_exec_cached_plans cp

    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) p

    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) q

WHERE cp.cacheobjtype = 'Compiled Plan' 

AND p.query_plan.value('declare namespace p="http://schemas.microsoft.com/sqlserver/2004/07/showplan";max(//p:RelOp/@Parallel)', 'float') > 0

But recently I got to thinking; and this can be a dangerous game. Do I care about a plan that is only used once? Also, is it better to first look at plans that utilize more CPU than other plans? How do I rank one plan against another plan?

To accomplish this, I turned to the DMV sys.dm_exec_query_stats. This DMV aggregates performance statistics for cached query plans. The trouble with this DMV is that it is the query level and not the procedure level so conditional statements and looping will affect how often some SQL statements are within a procedure. I’ve put together a couple queries that combine the query above with sys.dm_exec_query_stats.

First, I put together the top 50 statements by average CPU utilization that is part of an execution plan that utilizes parallelism. What I like about this query is that I get an exact SQL statement to look at. Unfortunately, this isn’t an exact science because the statement with the high average CPU may not be the statement that is utilizing parallelism. On the other hand, who is going to complain if a high CPU statement without parallelism is tuned?

SELECT TOP 50

    OBJECT_NAME(p.objectid, p.dbid) as [object_name]

    ,qs.execution_count

    ,qs.total_worker_time

    ,qs.total_logical_reads

    ,qs.total_elapsed_time

    ,CASE statement_end_offset WHEN -1 THEN q.text

        ELSE SUBSTRING(q.text, statement_start_offset/2, (statement_end_offset-statement_start_offset)/2) END as sql_statement

    ,p.query_plan

    ,q.text

    ,cp.plan_handle

FROM sys.dm_exec_query_stats qs

    INNER JOIN sys.dm_exec_cached_plans cp ON qs.plan_handle = cp.plan_handle

    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) p

    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) as q

WHERE cp.cacheobjtype = 'Compiled Plan' 

AND p.query_plan.value('declare namespace p="http://schemas.microsoft.com/sqlserver/2004/07/showplan";max(//p:RelOp/@Parallel)', 'float') > 0

ORDER BY qs.total_worker_time/qs.execution_count DESC

The second query I have here ignores the statement level detail of sys.dm_exec_query_stats and summarizes the views at the plan_handle level. This presents some problems though since execution counts amongst all query statements isn’t always the same within a plan and which number would be best to use? Minimum executions? Maximum executions? Average executions? It all depends which side you want to error on.  For the script, though, I went with maximum executions.

WITH cQueryStats

AS (

    SELECT qs.plan_handle

        ,MAX(qs.execution_count) as execution_count

        ,SUM(qs.total_worker_time) as total_worker_time

        ,SUM(qs.total_logical_reads) as total_logical_reads

        ,SUM(qs.total_elapsed_time) as total_elapsed_time

    FROM sys.dm_exec_query_stats qs

    GROUP BY qs.plan_handle

)

SELECT TOP 50

    OBJECT_NAME(p.objectid, p.dbid) as [object_name] ,qs.execution_count

    ,qs.total_worker_time

    ,qs.total_logical_reads

    ,qs.total_elapsed_time

    ,p.query_plan

    ,q.text

    ,cp.plan_handle

FROM cQueryStats qs

    INNER JOIN sys.dm_exec_cached_plans cp ON qs.plan_handle = cp.plan_handle

    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) p

    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) as q

WHERE cp.cacheobjtype = 'Compiled Plan' 

AND p.query_plan.value('declare namespace p="http://schemas.microsoft.com/sqlserver/2004/07/showplan";max(//p:RelOp/@Parallel)', 'float') > 0

ORDER BY qs.total_worker_time/qs.execution_count DESC

If you are already on SQL Server 2008, you do have the benefit of having the DMV sys.dm_exec_procedure_stats at your disposal.  Using this DMV will solve the problem of not having accurate execution information at the procedure level.  For those blesses with SQL Server 2008, you can use this query to identify execution statistics for stored plans that utilize parallelism.

SELECT TOP 50

    OBJECT_NAME(p.objectid, p.dbid) as [object_name] 

    ,ps.total_worker_time/ps.execution_count as avg_worker_time

    ,ps.execution_count

    ,ps.total_worker_time

    ,ps.total_logical_reads

    ,ps.total_elapsed_time

    ,p.query_plan

    ,q.text

    ,cp.plan_handle

FROM sys.dm_exec_procedure_stats ps

    INNER JOIN sys.dm_exec_cached_plans cp ON ps.plan_handle = cp.plan_handle

    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) p

    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) as q

WHERE cp.cacheobjtype = 'Compiled Plan' 

AND p.query_plan.value('declare namespace p="http://schemas.microsoft.com/sqlserver/2004/07/showplan";max(//p:RelOp/@Parallel)', 'float') > 0

ORDER BY ps.total_worker_time/ps.execution_count DESC

If anyone has any queries that they use to address and research parallelism in their environments, I would be more than happy to see them.

PASS Summit Submissions Complete

I’m as good at procrastination as the next person.   And because of that I waited until last night to finish my PASS Summit session submissions.  I’d actually written them all up a couple months ago but hadn’t put together the goals for each session.

I am hoping that I get a session selected and will admit that I might have submitted too many.  I submitted about half of the presentations that I am interested in currently writing.  When a good idea passes across my desk, laptop, fingers… it goes into a presentation idea document.

Since others have shared their submissions, I thought I’d share a few of mine.  I probably submitted too many presentations but there wasn’t any imposed restrictions and I am sure the people on the selection committee will love reading them all.

So here goes:

Improving Daily Imports with Partitioned Tables

At some point, everyone struggles with loading data to OLTP systems. The business needs the data loaded so that users can work through the data. But the users can’t afford the downtime to get the data into the production tables. In this session we’ll review this problem and how to use partitioning to alleviate this issue.

Prerequisites

  • Basic understanding for partitioned table concepts
  • Familiarity with OLTP systems

Goals:

  1. Identify situations where partitioned tables can improve database performance
  2. Demonstrate partitioned table solution for daily imports
  3. Provide understanding of the pro’s and con’s to selecting partitioned tables as a import solution for an OLTP system

Getting To Know Your Indexes

Without proper indexing SQL Server can be hard pressed to create efficient and performant execution plans. Dynamic Management Views (DMV) and system views provide a slew of information about indexes that can be used to analyze indexes within SQL Server. In this session we’ll go under the hood of SQL Server to look at DMVs and system views to know what indexes you have, should have, and how they feel about the way applications are treating them.

Prerequisites

  • Understanding of performance tuning needs
  • Experience with Database design

Goals:

  1. Identify methods to analyze current and potential indexes
  2. Learn how to alleviate stress found on indexes
  3. Demonstrate methods for tuning indexes

Necessarily Evils, Building Optimized CRUD Procedures

Every developer loves them and a lot of DBAs hate them. But there are many and valid reasons for creating generic SELECT, INSERT, UPDATE, and DELETE procedures. In this session, we’ll go through designing CRUD procedures that utilize new and existing SQL features to create CRUD procedures that are optimized for performance.

Prerequisites

  • Understand application data access requirements
  • Understanding of basic T-SQL coding skills

Goals:

  1. Identify the uses for CRUD procedures in databases
  2. Identify the common problems associated with CRUD procedures
  3. Demonstrate methods for writing performant CRUD procedures

Last but not least, have you registered for the PASS Summit yet?  It’s less than 7 months away.  I’m going… though I need to get travel worked out still.

Analyze This – Analyze Your Indexes – Part 4

This is the fourth post in which I am walking through a query that I use to analyze indexes.  Thus far I have added to the following to the query that I use:

  1. Combine existing index statistics and missing indexes into a single output (blog)
  2. Find existing duplicate and overlapping indexes (blog)
  3. Find the relationship between indexes and foreign keys (blog)man with blinders

A couple issues that were previously brought up have been updated:

  • Foreign keys with multiple columns were causing some duplication.  Yeah, I knew about that but my horse blinders allowed me to forget about it when I shouldn’t have.
  • For the duplicate and overlapping index logic, index_column_id was used to order the columns and it should have been the key_ordinal column.

In this next post I want to add more information about how the indexes are currently being used.  Specifically, what is the approximate size of the index and how much of the index is currently in memory.

Ce3k-mashedThis means something…

Why would I be interested in the size of my indexes and how much is in memory?  There are a few reasons for doing this

To start with, indexes take up space.  That should be a no-brainer.  But knowing how much space an index is utilizing can provide enlightenment into how the space is distributed across all of your indexes.  Space is relatively cheap these days but throwing a few gigs of disk at a logging table might not be a good use of disk.  Especially if it is over indexed or has some poor data type choices.

Now just knowing how large and index is doesn’t always provide the full picture.  Because data that is at rest only consume disk resources.  Determining the amount of the table that is the buffer provides insight into how important that data in the table/index is to the rest of the database.

I get into conversations with client’s often about what data is most important in their database systems and depending on who I talk to this view can be quite different.  In my mind though, the tables/indexes that are in memory are the most important.  They are the most important because the applications and services using the database felt they were important (because of queries and donuts) to load them from disk to memory.

But what about when these aren’t the important tables/indexes in the database?  That means there is a lot of work to do to add appropriate indexes, tune up some of the queries, or archive off some inappropriate data.

I mention the last one because of a conversation I ran into a while back.  A friend had shared with me a situation where the largest tables in memory across one of his server’s were the msdb backup tables.  A few years of backup entries for every 15 minutes across a number of databases wasn’t really needed.  But the backup jobs forced them up there every 15 when the backups occurred.  Loading all of that data into memory caused other data to be pushed out and caused some of the memory pressure that had been occurring.

Onto the next version…

Today’s version of the will include both the current size of the index and the portion of the index that is currently in memory.  To determine the size of the index, I used the following post.  But for the amount of the index I had to look around a bit and found an post on SQLTeam that aggregated the information that I was looking for.

The version below combines the new components and the fixes from above:

DECLARE @ObjectID int

SELECT @ObjectID = OBJECT_ID('Person.Contact')

 

;WITH IndexSize

AS(

    SELECT object_id

        ,index_id

        ,CAST((SUM(reserved_page_count) * CAST(8 as float))/1024 as decimal(12,2)) as size_in_mb

    FROM sys.dm_db_partition_stats

    GROUP BY object_id, index_id

)

,preIndexAnalysis

AS (

    SELECT 

        OBJECT_SCHEMA_NAME(t.object_id) as schema_name

        ,t.name as table_name

        ,COALESCE(i.name, 'N/A') as index_name

        ,CASE WHEN i.is_unique = 1 THEN 'UNIQUE ' ELSE '' END + i.type_desc as type_desc

        ,iz.size_in_mb

        ,NULL as impact

        ,ROW_NUMBER() 

            OVER (PARTITION BY i.object_id ORDER BY i.is_primary_key desc, ius.user_seeks + ius.user_scans + ius.user_lookups desc) as ranking

        ,ius.user_seeks + ius.user_scans + ius.user_lookups as user_total

        ,COALESCE(CAST(100 * (ius.user_seeks + ius.user_scans + ius.user_lookups)

            /(NULLIF(SUM(ius.user_seeks + ius.user_scans + ius.user_lookups) 

            OVER(PARTITION BY i.object_id), 0) * 1.) as decimal(6,2)),0) as user_total_pct

        ,ius.user_seeks

        ,ius.user_scans

        ,ius.user_lookups

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.index_columns ic

                    INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 0

                ORDER BY key_ordinal ASC

                FOR XML PATH('')), 1, 2, '') AS indexed_columns

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.index_columns ic

                    INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 1

                ORDER BY key_ordinal ASC

                FOR XML PATH('')), 1, 2, '') AS included_columns

        ,i.object_id

        ,i.index_id

        ,(SELECT QUOTENAME(ic.column_id,'(')

                FROM sys.index_columns ic

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 0

                ORDER BY key_ordinal ASC

                FOR XML PATH('')) AS indexed_columns_compare

        ,COALESCE((SELECT QUOTENAME(ic.column_id, '(')

                FROM sys.index_columns ic

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 1

                ORDER BY key_ordinal ASC

                FOR XML PATH('')), SPACE(0)) AS included_columns_compare

    FROM sys.tables t

        INNER JOIN sys.indexes i ON t.object_id = i.object_id

        LEFT OUTER JOIN sys.dm_db_index_usage_stats ius ON i.object_id = ius.object_id AND i.index_id = ius.index_id AND ius.database_id = db_id()

        INNER JOIN IndexSize iz ON i.object_id = iz.object_id AND i.index_id = iz.index_id

    WHERE t.object_id = @ObjectID OR @ObjectID IS NULL

    UNION ALL

    SELECT 

        OBJECT_SCHEMA_NAME(mid.object_id) as schema_name

        ,OBJECT_NAME(mid.object_id) as table_name

        ,'--MISSING--'

        ,'--NONCLUSTERED--'

        ,NULL

        ,(migs.user_seeks + migs.user_scans) * migs.avg_user_impact as impact

        ,0 as ranking

        ,migs.user_seeks + migs.user_scans as user_total

        ,NULL as user_total_pct

        ,migs.user_seeks 

        ,migs.user_scans

        ,0 as user_lookups

        ,COALESCE(equality_columns + ', ', SPACE(0)) + COALESCE(inequality_columns, SPACE(0)) as indexed_columns

        ,included_columns

        ,mid.object_id

        ,NULL

        ,NULL

        ,NULL

    FROM sys.dm_db_missing_index_details mid

        INNER JOIN sys.dm_db_missing_index_groups mig ON mid.index_handle = mig.index_handle

        INNER JOIN sys.dm_db_missing_index_group_stats migs ON mig.index_group_handle = migs.group_handle

    WHERE database_id = db_id()

    AND (mid.object_id = @ObjectID OR @ObjectID IS NULL)

)

,ForeignKeys

AS (

    SELECT fk.name + '|PARENT' AS name

        ,fkc.parent_object_id AS object_id

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.foreign_key_columns ifkc

                    INNER JOIN sys.columns c ON ifkc.parent_object_id = c.object_id AND ifkc.parent_column_id = c.column_id

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')), 1, 2, '') AS fk_columns

        ,(SELECT QUOTENAME(ifkc.parent_column_id,'(')

                FROM sys.foreign_key_columns ifkc

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')) AS fk_columns_compare

    FROM sys.foreign_keys fk

        INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id

    WHERE fkc.constraint_column_id = 1

    AND (fkc.parent_object_id = @ObjectID OR @ObjectID IS NULL)

    UNION ALL

    SELECT fk.name + '|REFERENCED' as name

        ,fkc.referenced_object_id AS object_id

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.foreign_key_columns ifkc

                    INNER JOIN sys.columns c ON ifkc.referenced_object_id = c.object_id AND ifkc.referenced_column_id = c.column_id

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')), 1, 2, '') AS fk_columns

        ,(SELECT QUOTENAME(ifkc.referenced_column_id,'(')

                FROM sys.foreign_key_columns ifkc

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')) AS fk_columns_compare

    FROM sys.foreign_keys fk

        INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id

    WHERE fkc.constraint_column_id = 1

    AND (fkc.parent_object_id = @ObjectID OR @ObjectID IS NULL)

)

,MemoryBuffer

AS (

    SELECT 

        obj.object_id

        ,obj.index_id

        ,COUNT(*)AS Buffered_Page_Count

        ,CAST(COUNT(*) as bigint)*CAST(8 as float)/1024 as Buffer_MB

    FROM sys.dm_os_buffer_descriptors AS bd 

        INNER JOIN 

        (

            SELECT object_name(object_id) AS name 

                ,index_id ,allocation_unit_id, object_id

            FROM sys.allocation_units AS au

                INNER JOIN sys.partitions AS p 

                    ON au.container_id = p.hobt_id 

                        AND (au.type = 1 OR au.type = 3)

            UNION ALL

            SELECT object_name(object_id) AS name   

                ,index_id, allocation_unit_id, object_id

            FROM sys.allocation_units AS au

                INNER JOIN sys.partitions AS p 

                    ON au.container_id = p.hobt_id 

                        AND au.type = 2

        ) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id

    WHERE database_id = db_id()

    GROUP BY obj.object_id ,obj.index_id

)

, IndexAnalysis

AS (

    SELECT ia.object_id

        ,ia.index_id

        ,schema_name

        ,table_name

        ,index_name

        ,type_desc

        ,size_in_mb

        ,impact

        ,ranking

        ,user_total

        ,user_total_pct

        ,CAST(100 * (user_seeks + user_scans + user_lookups)

            /(NULLIF(SUM(user_seeks + user_scans + user_lookups) 

            OVER(PARTITION BY schema_name, table_name), 0) * 1.) as decimal(6,2)) as estimated_percent

        ,user_seeks

        ,user_scans

        ,user_lookups

        ,indexed_columns

        ,included_columns

        ,STUFF((SELECT ', ' + index_name AS [data()]

            FROM preIndexAnalysis iia

            WHERE ia.object_id = iia.object_id

            AND ia.index_id <> iia.index_id

            AND ia.indexed_columns_compare = iia.indexed_columns_compare

            AND ia.included_columns_compare = iia.included_columns_compare

            FOR XML PATH('')), 1, 2, '') AS duplicate_indexes

        ,STUFF((SELECT ', ' + index_name AS [data()]

            FROM preIndexAnalysis iia

            WHERE ia.object_id = iia.object_id

            AND ia.index_id <> iia.index_id

            AND (ia.indexed_columns_compare LIKE iia.indexed_columns_compare + '%' 

                OR iia.indexed_columns_compare LIKE ia.indexed_columns_compare + '%')

            AND ia.indexed_columns_compare <> iia.indexed_columns_compare 

            FOR XML PATH('')), 1, 2, '') AS overlapping_indexes

        ,STUFF((SELECT ', ' + name AS [data()]

            FROM ForeignKeys ifk

            WHERE ifk.object_id = ia.object_id

            AND ia.indexed_columns_compare LIKE ifk.fk_columns_compare + '%'

            FOR XML PATH('')), 1, 2, '') AS related_foreign_keys

        ,CAST((SELECT name

            FROM ForeignKeys

            WHERE ForeignKeys.object_id = ia.object_id

            AND ia.indexed_columns_compare LIKE ForeignKeys.fk_columns_compare + '%'

            FOR XML AUTO) as xml) AS related_foreign_keys_xml

    FROM preIndexAnalysis ia

    UNION ALL

    SELECT fk.object_id

        ,NULL

        ,OBJECT_SCHEMA_NAME(fk.object_id) AS schema_name

        ,OBJECT_NAME(fk.object_id) AS table_name

        ,fk.name AS index_name

        ,'--MISSING FOREIGN KEY--'

        ,NULL

        ,NULL 

        ,NULL

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL

        ,fk.fk_columns

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL

    FROM ForeignKeys fk

        LEFT OUTER JOIN preIndexAnalysis ia

    ON fk.object_id = ia.object_id

    AND ia.indexed_columns_compare LIKE fk.fk_columns_compare + '%'

    WHERE ia.index_name IS NULL

)

SELECT schema_name

    ,table_name

    ,index_name

    ,type_desc

    ,size_in_mb

    ,mb.Buffer_MB

    ,CAST(100*mb.Buffer_MB/NULLIF(size_in_mb,0) AS decimal(6,2)) AS buffer_pct

    ,impact

    ,user_total

    ,user_total_pct

    ,estimated_percent

    ,user_seeks

    ,user_scans

    ,user_lookups

    ,indexed_columns

    ,included_columns

    ,duplicate_indexes

    ,overlapping_indexes

    ,related_foreign_keys

FROM IndexAnalysis ia

    LEFT OUTER JOIN MemoryBuffer mb ON ia.object_id = mb.object_id AND ia.index_id = mb.index_id

ORDER BY SUM(Buffer_MB ) OVER (PARTITION BY ia.object_id) desc, table_name, user_total desc, buffer_mb

NOTE: Populating the table name in the function for the @ObjectID variable is optional.  If the value of @ObjectID is NULL then information on all tables is returned.

This version of the DMV query returns the following columns:

  • schema_name: Schema name
  • table_name: Table name
  • index_name: Index Name
  • type_desc: Type of index; either clustered or non-clustered.
  • size_in_mb: Then number of MB the index utilizes on disk.
  • buffer_mb: The number of MB in the SQL Server buffer being utilized by the index.
  • buffer_pct: The percentage of an index that is current in the SQL Server buffer.
  • impact: Calculation of impact of a potential index.  This is based on the seeks and scans that the index could have utilized multiplied by average improvement the index would have provided.  This is included only for missing indexes.
  • user_total: Total number of seek, scan, and lookup operations for the index
  • user_total_pct: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing indexes for the table.
  • estimated_percent: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing and potential indexes for the table.  This number is naturally skewed because a seek for potential Index A resulted in another operation on an existing index and both of these operations would be counted.
  • user_seeks: Number of seek operations on the index.
  • user_scans: Number of scan operations on the index.
  • user_lookups: Number of lookup operations on the index.
  • indexed_columns: Columns that are part of the index, missing index or foreign key.
  • included_columns: Columns that are included in the index or missing index.
  • duplicate_indexes: List of Indexes that exist on the table that are identical to the index on this row.
  • overlapping_indexes: List of Indexes that exist on the table that overlap the index on this row.
  • related_foreign_keys: List of foreign keys that are related to the index either as an exact match or covering index.
  • related_foreign_keys_xml: XML document listing foreign keys that are related to the index either as an exact match or covering index.

To review the results of the new components added, I ran a query that caused a table scan against Person.Contact.  As in the previous post, I’ve removed the columns from the image that do not pertain to this output.

CropperCapture[13]

The output shows that nearly the entire Person.Contact.PK_Contact_ContactID index has been moved into memory.  The other two indexes are not utilizing that much space in memory.

One last thing, the ORDER BY statement has been changed to order all of the tables the total amount of space all of the indexes for the table are currently utilizing in memory.  Through this the tables with the current highest buffer usage will bubble to the top.

Ok, one more “one last thing”… buffer utilizing of the index is highly dependent on the most recent usage of the database.  Out of character queries can change the data in the buffer and skew results.  A single execution of this index analysis should not be used to state definitively whether a table or index is moved into the buffer that often.

What is the size of your index?

I got the following question a couple days ago:

“I have one problem with my database. The size of mdf file is huge and i feel it is not the data that is causing the increase in size, i feel indexes might be occupying space.  Is there any way to find that indexes are occupying space or data is occupying space. Thanks in advance”

341850-421894I jumped out to my blog because I’d already written something up on that, only to find that it was still in my draft blog folder and wasn’t really written.  So time to finish it…

Indexes can and do take up a significant amount of space in a database.  And in a lot of the databases, that I’ve worked on, the the non-clustered indexes take up more space than the clustered index or heap does.

Size information for indexes can found in the dynamic management view sys.dm_db_index_usage_stats.  This view provides partition level detail on row and page counts for indexes.  The page counts return reserved, data, and used counts for each partition.

I use the following query for a breakdown of size per partition/index/table:

SELECT 

    OBJECT_SCHEMA_NAME(i.object_id) as schema_name

    ,OBJECT_NAME(i.object_id) as table_name

    ,COALESCE(i.name, space(0)) as index_name

    ,ps.partition_number

    ,ps.row_count

    ,CAST(((ps.in_row_data_page_count) * CAST(8 as float))/1024. as decimal(12,2)) as data_size_in_mb

    ,CAST(((ps.in_row_used_page_count + ps.lob_used_page_count + ps.row_overflow_used_page_count) * CAST(8 as float))/1024. as decimal(12,2)) as used_size_in_mb

    ,CAST((ps.reserved_page_count * CAST(8 as float))/1024. as decimal(12,2)) as total_size_in_mb

    ,i.type_desc

FROM sys.indexes i 

    INNER JOIN sys.dm_db_partition_stats ps ON i.object_id = ps.object_id AND i.index_id = ps.index_id 

ORDER BY object_name(i.object_id), total_size_in_mb desc

The output of which looks a like this:

CropperCapture[12]

As is shown in the first row, the index PK_Address_AddressID on Person.Address has 19,614 rows and takes up 2.20 MB of disk space.  Pretty simple.

PASSMN – Drive Your Way to the DMV


Chicago005
The Minnesota SQL Server User Group (http://www.mnssug.org) met this past Tuesday.  It was a good crowd for a beautiful Tuesday afternoon – there were 57 people at the meeting at the mid-point.

I spoke in the first time slot and think people enjoyed what they heard.  There were a couple comments on the session that I’d like to point out since I agree with them a bit.

The first comment was “too much information in too little time”.  Very good point, there are 130 DMVs and I covered way too many for 45 minutes.

The second comment was “there wasn’t an overall theme”.  I agree with this as well.  Instead of presenting a business case for each use, I went through the DMVs and pointed of different features and functionality.  I should have tied this all together a bit better.

Both of those comments may inspire some future posts where I’ll provide some meaningful depth to the DMVs.  So be warned… I may start posting useful information.

For those that want copies of the scripts and the slide deck, here you go…

Drive Your Way to the DMV – PASSMN.zip

Analyze This – Analyze Your Indexes – Part 3

This is the third post in which I am walking through a query that I use to analyze indexes.  Thus far I have added to the following to the query that I use:

  1. Combine existing index statistics and missing indexes into a single output (blog)
  2. Find existing duplicate and overlapping indexes (blog)

An error in the WHERE clause for the missing index portion of the query was pointed out in the last post and is fixed in the current version.  The SQL code read:

WHERE database_id = db_id()

AND mid.object_id = @ObjectID OR @ObjectID IS NULL

The correct code should be:

WHERE database_id = db_id()

AND (mid.object_id = @ObjectID OR @ObjectID IS NULL)

Today I am looking at foreign keys and adding in functionality to determine indexes that are related to foreign keys and possibly missing from the indexing strategy.  A while back I read Greg Lowe’s post on whether foreign keys should be indexed automatically.  I’m not a fan of that approach even though in a lot of situations that I’ve been in working on performance tuning this would have been a good idea.

Instead, I’d like to know which foreign keys are related to existing indexes through either exact or covering scenarios.  The new version of this query has two new CTEs added to it.  The first new CTE is ForeignKeys which summarizes meta-data on the existing foreign keys.  The second new CTE is IndexAnalysis which  UNIONS the previous result set with a list of foreign keys that are not indexed.

Version 3 of the query is as follows:

DECLARE @ObjectID int

SELECT @ObjectID = OBJECT_ID('Production.BillOfMaterials')

 

;WITH preIndexAnalysis

AS (

    SELECT 

        OBJECT_SCHEMA_NAME(t.object_id) as schema_name

        ,t.name as table_name

        ,COALESCE(i.name, 'N/A') as index_name

        ,CASE WHEN i.is_unique = 1 THEN 'UNIQUE ' ELSE '' END + i.type_desc as type_desc

        ,NULL as impact

        ,ROW_NUMBER() 

            OVER (PARTITION BY i.object_id ORDER BY i.is_primary_key desc, ius.user_seeks + ius.user_scans + ius.user_lookups desc) as ranking

        ,ius.user_seeks + ius.user_scans + ius.user_lookups as user_total

        ,COALESCE(CAST(100 * (ius.user_seeks + ius.user_scans + ius.user_lookups)

            /(NULLIF(SUM(ius.user_seeks + ius.user_scans + ius.user_lookups) 

            OVER(PARTITION BY i.object_id), 0) * 1.) as decimal(6,2)),0) as user_total_pct

        ,ius.user_seeks

        ,ius.user_scans

        ,ius.user_lookups

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.index_columns ic

                    INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 0

                ORDER BY index_column_id ASC

                FOR XML PATH('')), 1, 2, '') AS indexed_columns

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.index_columns ic

                    INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 1

                ORDER BY index_column_id ASC

                FOR XML PATH('')), 1, 2, '') AS included_columns

        ,i.object_id

        ,i.index_id

        ,(SELECT QUOTENAME(ic.column_id,'(')

                FROM sys.index_columns ic

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 0

                ORDER BY index_column_id ASC

                FOR XML PATH('')) AS indexed_columns_compare

        ,COALESCE((SELECT QUOTENAME(ic.column_id, '(')

                FROM sys.index_columns ic

                WHERE i.object_id = ic.object_id

                AND i.index_id = ic.index_id

                AND is_included_column = 1

                ORDER BY index_column_id ASC

                FOR XML PATH('')), SPACE(0)) AS included_columns_compare

    FROM sys.tables t

        INNER JOIN sys.indexes i ON t.object_id = i.object_id

        LEFT OUTER JOIN sys.dm_db_index_usage_stats ius ON i.object_id = ius.object_id AND i.index_id = ius.index_id AND ius.database_id = db_id()

    WHERE t.object_id = @ObjectID OR @ObjectID IS NULL

    UNION ALL

    SELECT 

        OBJECT_SCHEMA_NAME(mid.object_id) as schema_name

        ,OBJECT_NAME(mid.object_id) as table_name

        ,'--MISSING--'

        ,'--NONCLUSTERED--'

        ,(migs.user_seeks + migs.user_scans) * migs.avg_user_impact as impact

        ,0 as ranking

        ,migs.user_seeks + migs.user_scans as user_total

        ,NULL as user_total_pct

        ,migs.user_seeks 

        ,migs.user_scans

        ,0 as user_lookups

        ,COALESCE(equality_columns + ', ', SPACE(0)) + COALESCE(inequality_columns, SPACE(0)) as indexed_columns

        ,included_columns

        ,mid.object_id

        ,NULL

        ,NULL

        ,NULL

    FROM sys.dm_db_missing_index_details mid

        INNER JOIN sys.dm_db_missing_index_groups mig ON mid.index_handle = mig.index_handle

        INNER JOIN sys.dm_db_missing_index_group_stats migs ON mig.index_group_handle = migs.group_handle

    WHERE database_id = db_id()

    AND (mid.object_id = @ObjectID OR @ObjectID IS NULL)

)

,ForeignKeys

AS (

    SELECT fk.name + '|PARENT' AS name

        ,fkc.parent_object_id AS object_id

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.foreign_key_columns ifkc

                    INNER JOIN sys.columns c ON ifkc.parent_object_id = c.object_id AND ifkc.parent_column_id = c.column_id

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')), 1, 2, '') AS fk_columns

        ,(SELECT QUOTENAME(ifkc.parent_column_id,'(')

                FROM sys.foreign_key_columns ifkc

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')) AS fk_columns_compare

    FROM sys.foreign_keys fk

        INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id

    WHERE fkc.parent_object_id = @ObjectID OR @ObjectID IS NULL

    UNION ALL

    SELECT fk.name + '|REFERENCED' as name

        ,fkc.referenced_object_id AS object_id

        ,STUFF((SELECT ', ' + QUOTENAME(c.name)

                FROM sys.foreign_key_columns ifkc

                    INNER JOIN sys.columns c ON ifkc.referenced_object_id = c.object_id AND ifkc.referenced_column_id = c.column_id

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')), 1, 2, '') AS fk_columns

        ,(SELECT QUOTENAME(ifkc.referenced_column_id,'(')

                FROM sys.foreign_key_columns ifkc

                WHERE fk.object_id = ifkc.constraint_object_id

                ORDER BY ifkc.constraint_column_id

                FOR XML PATH('')) AS fk_columns_compare

    FROM sys.foreign_keys fk

        INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id

    WHERE fkc.referenced_object_id = @ObjectID OR @ObjectID IS NULL

), IndexAnalysis

AS (

    SELECT schema_name

        ,table_name

        ,index_name

        ,type_desc

        ,impact

        ,ranking

        ,user_total

        ,user_total_pct

        ,CAST(100 * (user_seeks + user_scans + user_lookups)

            /(NULLIF(SUM(user_seeks + user_scans + user_lookups) 

            OVER(PARTITION BY schema_name, table_name), 0) * 1.) as decimal(6,2)) as estimated_percent

        ,user_seeks

        ,user_scans

        ,user_lookups

        ,indexed_columns

        ,included_columns

        ,STUFF((SELECT ', ' + index_name AS [data()]

            FROM preIndexAnalysis iia

            WHERE ia.object_id = iia.object_id

            AND ia.index_id <> iia.index_id

            AND ia.indexed_columns_compare = iia.indexed_columns_compare

            AND ia.included_columns_compare = iia.included_columns_compare

            FOR XML PATH('')), 1, 2, '') AS duplicate_indexes

        ,STUFF((SELECT ', ' + index_name AS [data()]

            FROM preIndexAnalysis iia

            WHERE ia.object_id = iia.object_id

            AND ia.index_id <> iia.index_id

            AND (ia.indexed_columns_compare LIKE iia.indexed_columns_compare + '%' 

                OR iia.indexed_columns_compare LIKE ia.indexed_columns_compare + '%')

            AND ia.indexed_columns_compare <> iia.indexed_columns_compare 

            FOR XML PATH('')), 1, 2, '') AS overlapping_indexes

        ,STUFF((SELECT ', ' + name AS [data()]

            FROM ForeignKeys ifk

            WHERE ifk.object_id = ia.object_id

            AND ia.indexed_columns_compare LIKE ifk.fk_columns_compare + '%'

            FOR XML PATH('')), 1, 2, '') AS related_foreign_keys

        ,CAST((SELECT name

            FROM ForeignKeys

            WHERE ForeignKeys.object_id = ia.object_id

            AND ia.indexed_columns_compare LIKE ForeignKeys.fk_columns_compare + '%'

            FOR XML AUTO) as xml) AS related_foreign_keys_xml

    FROM preIndexAnalysis ia

    UNION ALL

    SELECT OBJECT_SCHEMA_NAME(fk.object_id) AS schema_name

        ,OBJECT_NAME(fk.object_id) AS table_name

        ,fk.name AS index_name

        ,'--MISSING FOREIGN KEY--'

        ,NULL

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL

        ,fk.fk_columns

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL 

        ,NULL

    FROM ForeignKeys fk

        LEFT OUTER JOIN preIndexAnalysis ia

    ON fk.object_id = ia.object_id

    AND ia.indexed_columns_compare LIKE fk.fk_columns_compare + '%'

    WHERE ia.index_name IS NULL

)

SELECT schema_name

    ,table_name

    ,index_name

    ,type_desc

    ,impact

    ,user_total

    ,user_total_pct

    ,estimated_percent

    ,user_seeks

    ,user_scans

    ,user_lookups

    ,indexed_columns

    ,included_columns

    ,duplicate_indexes

    ,overlapping_indexes

    ,related_foreign_keys

FROM IndexAnalysis

ORDER BY 1, 2, 3

NOTE: Populating the table name in the function for the @ObjectID variable is optional.  If the value of @ObjectID is NULL then information on all tables is returned.  And it actually works properly now.

This version of the DMV query returns the following columns:

  • schema_name: Schema name
  • table_name: Table name
  • index_name: Index Name
  • type_desc: Type of index; either clustered or non-clustered.
  • impact: Calculation of impact of a potential index.  This is based on the seeks and scans that the index could have utilized multiplied by average improvement the index would have provided.  This is included only for missing indexes.
  • user_total: Total number of seek, scan, and lookup operations for the index
  • user_total_pct: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing indexes for the table.
  • estimated_percent: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing and potential indexes for the table.  This number is naturally skewed because a seek for potential Index A resulted in another operation on an existing index and both of these operations would be counted.
  • user_seeks: Number of seek operations on the index.
  • user_scans: Number of scan operations on the index.
  • user_lookups: Number of lookup operations on the index.
  • indexed_columns: Columns that are part of the index, missing index or foreign key.
  • included_columns: Columns that are included in the index or missing index.
  • duplicate_indexes: List of Indexes that exist on the table that are identical to the index on this row.
  • overlapping_indexes: List of Indexes that exist on the table that overlap the index on this row.
  • related_foreign_keys: List of foreign keys that are related to the index either as an exact match or covering index.
  • related_foreign_keys_xml: XML document listing foreign keys that are related to the index either as an exact match or covering index.

The last column was added to the query to allow the results to be clicked and opened in a new window.  I’ve found that it is easier to do this with list information from time to time.

With this version of the query there isn’t any need to run and SQL statements to get results, the query is set to look at Production.BillOfMaterials and that table provides a useful example.   As with previous version’s example, the image below of the query results will only include a limited list of the columns.  Only the following columns are included; schema_name, table_name, index_name, type_desc indexed_columns, and related_foreign_keys.

CropperCapture[10]

Reviewing these results, the following can be determined:

  1. The foreign key FK_BillOfMaterials_Product_ProductAssemblyID is able to utilize the index AK_BillOfMaterials_ProductAssemblyID_ComponentID_StartDate.
  2. The foreign key FK_BillOfMaterials_UnitMeasure_UnitMeasureCode|PARENT is able to utilize the index IX_BillOfMaterials_UnitMeasureCode.
  3. The foreign key FK_BillOfMaterials_Product_ComponentID is not able to utilize any index on the table and could benefit from an index on the column [ComponentID].

Now whether you should or should not index all foreign keys is a matter for another post.  It really depends on the table, the foreign key, and how the database is being used.  In some cases there will a performance gain on the indexing and in others the opposite.  This version of the query provides a method to analyze indexes in relationship to foreign keys.

CAUTION: I should note that I haven’t extensively tested the foreign key portion of this query.  I had planned to add it in a couple months ago and didn’t get it in until this post.  There are other components that I plan to add to this query that have been tested, but haven’t been added yet.  It just made sense to add foreign key analysis in at this point.

Analyze This – Analyze Your Indexes – Part 2

This is the second post in which I am walking through a query that I use to analyze indexes.  For some reason the grammar on the title of the first post was a bit jacked so I’ve corrected it for this post and the all following ones.

Here’s a summary of what I’ve added to this query thus far:

  1. Combine existing index statistics and missing indexes into a single output (blog)

The next thing the index analysis needed was a method to determine whether or not an existing index was either a duplicate of or overlapped another index on the table.  For an explanation of the logic to find duplicate and overlapping index please read Paul Nielsen’s blog, which happens to be the source of the logic that the query uses.

The query I added to the index analysis is basically the same except the SQL Server 2005/2008 catalog view (sys.indexed_columns) is are being used instead of the SQL Server 2000 and prior system view (sys.sysindexkeys).

Version 2 of the query is as follows:

DECLARE @ObjectID int

SELECT @ObjectID = OBJECT_ID('Production.BillOfMaterials')

 

;WITH preIndexAnalysis

AS (

    SELECT 

        OBJECT_SCHEMA_NAME(t.object_id) as schema_name

        ,t.name as table_name

        ,COALESCE(i.name, 'N/A') as index_name

        ,CASE WHEN i.is_unique = 1 THEN 'UNIQUE ' ELSE '' END + i.type_desc as type_desc

        ,NULL as impact

        ,ROW_NUMBER() 

            OVER (PARTITION BY i.object_id ORDER BY i.is_primary_key desc, ius.user_seeks + ius.user_scans + ius.user_lookups desc) as ranking

        ,ius.user_seeks + ius.user_scans + ius.user_lookups as user_total

        ,COALESCE(CAST(100 * (ius.user_seeks + ius.user_scans + ius.user_lookups)

            /(NULLIF(SUM(ius.user_seeks + ius.user_scans + ius.user_lookups)