This is the fourth post in which I am walking through a query that I use to analyze indexes. Thus far I have added to the following to the query that I use:
- Combine existing index statistics and missing indexes into a single output
- Find existing duplicate and overlapping indexes
- Find the relationship between indexes and foreign keys
A couple issues that were previously brought up have been updated:
- Foreign keys with multiple columns were causing some duplication. Yeah, I knew about that but my horse blinders allowed me to forget about it when I shouldn’t have.
- For the duplicate and overlapping index logic, index_column_id was used to order the columns and it should have been the key_ordinal column.
In this next post I want to add more information about how the indexes are currently being used. Specifically, what is the approximate size of the index and how much of the index is currently in memory.
This means something…
Why would I be interested in the size of my indexes and how much is in memory? There are a few reasons for doing this
To start with, indexes take up space. That should be a no-brainer. But knowing how much space an index is utilizing can provide enlightenment into how the space is distributed across all of your indexes. Space is relatively cheap these days but throwing a few gigs of disk at a logging table might not be a good use of disk. Especially if it is over indexed or has some poor data type choices.
Now just knowing how large and index is doesn’t always provide the full picture. Because data that is at rest only consume disk resources. Determining the amount of the table that is the buffer provides insight into how important that data in the table/index is to the rest of the database.
I get into conversations with client’s often about what data is most important in their database systems and depending on who I talk to this view can be quite different. In my mind though, the tables/indexes that are in memory are the most important. They are the most important because the applications and services using the database felt they were important (because of queries and donuts) to load them from disk to memory.
But what about when these aren’t the important tables/indexes in the database? That means there is a lot of work to do to add appropriate indexes, tune up some of the queries, or archive off some inappropriate data.
I mention the last one because of a conversation I ran into a while back. A friend had shared with me a situation where the largest tables in memory across one of his server’s were the msdb backup tables. A few years of backup entries for every 15 minutes across a number of databases wasn’t really needed. But the backup jobs forced them up there every 15 when the backups occurred. Loading all of that data into memory caused other data to be pushed out and caused some of the memory pressure that had been occurring.
Onto the next version…
Today’s version of the will include both the current size of the index and the portion of the index that is currently in memory. To determine the size of the index, I used the following post. But for the amount of the index I had to look around a bit and found an post on SQLTeam that aggregated the information that I was looking for.
The version below combines the new components and the fixes from above:
DECLARE @ObjectID int SELECT @ObjectID = OBJECT_ID('Person.Contact') ;WITH IndexSize AS( SELECT object_id ,index_id ,CAST((SUM(reserved_page_count) * CAST(8 as float))/1024 as decimal(12,2)) as size_in_mb FROM sys.dm_db_partition_stats GROUP BY object_id, index_id ) ,preIndexAnalysis AS ( SELECT OBJECT_SCHEMA_NAME(t.object_id) as schema_name ,t.name as table_name ,COALESCE(i.name, 'N/A') as index_name ,CASE WHEN i.is_unique = 1 THEN 'UNIQUE ' ELSE '' END + i.type_desc as type_desc ,iz.size_in_mb ,NULL as impact ,ROW_NUMBER() OVER (PARTITION BY i.object_id ORDER BY i.is_primary_key desc, ius.user_seeks + ius.user_scans + ius.user_lookups desc) as ranking ,ius.user_seeks + ius.user_scans + ius.user_lookups as user_total ,COALESCE(CAST(100 * (ius.user_seeks + ius.user_scans + ius.user_lookups) /(NULLIF(SUM(ius.user_seeks + ius.user_scans + ius.user_lookups) OVER(PARTITION BY i.object_id), 0) * 1.) as decimal(6,2)),0) as user_total_pct ,ius.user_seeks ,ius.user_scans ,ius.user_lookups ,STUFF((SELECT ', ' + QUOTENAME(c.name) FROM sys.index_columns ic INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id WHERE i.object_id = ic.object_id AND i.index_id = ic.index_id AND is_included_column = 0 ORDER BY key_ordinal ASC FOR XML PATH('')), 1, 2, '') AS indexed_columns ,STUFF((SELECT ', ' + QUOTENAME(c.name) FROM sys.index_columns ic INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id WHERE i.object_id = ic.object_id AND i.index_id = ic.index_id AND is_included_column = 1 ORDER BY key_ordinal ASC FOR XML PATH('')), 1, 2, '') AS included_columns ,i.object_id ,i.index_id ,(SELECT QUOTENAME(ic.column_id,'(') FROM sys.index_columns ic WHERE i.object_id = ic.object_id AND i.index_id = ic.index_id AND is_included_column = 0 ORDER BY key_ordinal ASC FOR XML PATH('')) AS indexed_columns_compare ,COALESCE((SELECT QUOTENAME(ic.column_id, '(') FROM sys.index_columns ic WHERE i.object_id = ic.object_id AND i.index_id = ic.index_id AND is_included_column = 1 ORDER BY key_ordinal ASC FOR XML PATH('')), SPACE(0)) AS included_columns_compare FROM sys.tables t INNER JOIN sys.indexes i ON t.object_id = i.object_id LEFT OUTER JOIN sys.dm_db_index_usage_stats ius ON i.object_id = ius.object_id AND i.index_id = ius.index_id AND ius.database_id = db_id() INNER JOIN IndexSize iz ON i.object_id = iz.object_id AND i.index_id = iz.index_id WHERE t.object_id = @ObjectID OR @ObjectID IS NULL UNION ALL SELECT OBJECT_SCHEMA_NAME(mid.object_id) as schema_name ,OBJECT_NAME(mid.object_id) as table_name ,'--MISSING--' ,'--NONCLUSTERED--' ,NULL ,(migs.user_seeks + migs.user_scans) * migs.avg_user_impact as impact ,0 as ranking ,migs.user_seeks + migs.user_scans as user_total ,NULL as user_total_pct ,migs.user_seeks ,migs.user_scans ,0 as user_lookups ,COALESCE(equality_columns + ', ', SPACE(0)) + COALESCE(inequality_columns, SPACE(0)) as indexed_columns ,included_columns ,mid.object_id ,NULL ,NULL ,NULL FROM sys.dm_db_missing_index_details mid INNER JOIN sys.dm_db_missing_index_groups mig ON mid.index_handle = mig.index_handle INNER JOIN sys.dm_db_missing_index_group_stats migs ON mig.index_group_handle = migs.group_handle WHERE database_id = db_id() AND (mid.object_id = @ObjectID OR @ObjectID IS NULL) ) ,ForeignKeys AS ( SELECT fk.name + '|PARENT' AS name ,fkc.parent_object_id AS object_id ,STUFF((SELECT ', ' + QUOTENAME(c.name) FROM sys.foreign_key_columns ifkc INNER JOIN sys.columns c ON ifkc.parent_object_id = c.object_id AND ifkc.parent_column_id = c.column_id WHERE fk.object_id = ifkc.constraint_object_id ORDER BY ifkc.constraint_column_id FOR XML PATH('')), 1, 2, '') AS fk_columns ,(SELECT QUOTENAME(ifkc.parent_column_id,'(') FROM sys.foreign_key_columns ifkc WHERE fk.object_id = ifkc.constraint_object_id ORDER BY ifkc.constraint_column_id FOR XML PATH('')) AS fk_columns_compare FROM sys.foreign_keys fk INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id WHERE fkc.constraint_column_id = 1 AND (fkc.parent_object_id = @ObjectID OR @ObjectID IS NULL) UNION ALL SELECT fk.name + '|REFERENCED' as name ,fkc.referenced_object_id AS object_id ,STUFF((SELECT ', ' + QUOTENAME(c.name) FROM sys.foreign_key_columns ifkc INNER JOIN sys.columns c ON ifkc.referenced_object_id = c.object_id AND ifkc.referenced_column_id = c.column_id WHERE fk.object_id = ifkc.constraint_object_id ORDER BY ifkc.constraint_column_id FOR XML PATH('')), 1, 2, '') AS fk_columns ,(SELECT QUOTENAME(ifkc.referenced_column_id,'(') FROM sys.foreign_key_columns ifkc WHERE fk.object_id = ifkc.constraint_object_id ORDER BY ifkc.constraint_column_id FOR XML PATH('')) AS fk_columns_compare FROM sys.foreign_keys fk INNER JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id WHERE fkc.constraint_column_id = 1 AND (fkc.parent_object_id = @ObjectID OR @ObjectID IS NULL) ) ,MemoryBuffer AS ( SELECT obj.object_id ,obj.index_id ,COUNT(*)AS Buffered_Page_Count ,CAST(COUNT(*) as bigint)*CAST(8 as float)/1024 as Buffer_MB FROM sys.dm_os_buffer_descriptors AS bd INNER JOIN ( SELECT object_name(object_id) AS name ,index_id ,allocation_unit_id, object_id FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND (au.type = 1 OR au.type = 3) UNION ALL SELECT object_name(object_id) AS name ,index_id, allocation_unit_id, object_id FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND au.type = 2 ) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id WHERE database_id = db_id() GROUP BY obj.object_id ,obj.index_id ) , IndexAnalysis AS ( SELECT ia.object_id ,ia.index_id ,schema_name ,table_name ,index_name ,type_desc ,size_in_mb ,impact ,ranking ,user_total ,user_total_pct ,CAST(100 * (user_seeks + user_scans + user_lookups) /(NULLIF(SUM(user_seeks + user_scans + user_lookups) OVER(PARTITION BY schema_name, table_name), 0) * 1.) as decimal(6,2)) as estimated_percent ,user_seeks ,user_scans ,user_lookups ,indexed_columns ,included_columns ,STUFF((SELECT ', ' + index_name AS [data()] FROM preIndexAnalysis iia WHERE ia.object_id = iia.object_id AND ia.index_id iia.index_id AND ia.indexed_columns_compare = iia.indexed_columns_compare AND ia.included_columns_compare = iia.included_columns_compare FOR XML PATH('')), 1, 2, '') AS duplicate_indexes ,STUFF((SELECT ', ' + index_name AS [data()] FROM preIndexAnalysis iia WHERE ia.object_id = iia.object_id AND ia.index_id iia.index_id AND (ia.indexed_columns_compare LIKE iia.indexed_columns_compare + '%' OR iia.indexed_columns_compare LIKE ia.indexed_columns_compare + '%') AND ia.indexed_columns_compare iia.indexed_columns_compare FOR XML PATH('')), 1, 2, '') AS overlapping_indexes ,STUFF((SELECT ', ' + name AS [data()] FROM ForeignKeys ifk WHERE ifk.object_id = ia.object_id AND ia.indexed_columns_compare LIKE ifk.fk_columns_compare + '%' FOR XML PATH('')), 1, 2, '') AS related_foreign_keys ,CAST((SELECT name FROM ForeignKeys WHERE ForeignKeys.object_id = ia.object_id AND ia.indexed_columns_compare LIKE ForeignKeys.fk_columns_compare + '%' FOR XML AUTO) as xml) AS related_foreign_keys_xml FROM preIndexAnalysis ia UNION ALL SELECT fk.object_id ,NULL ,OBJECT_SCHEMA_NAME(fk.object_id) AS schema_name ,OBJECT_NAME(fk.object_id) AS table_name ,fk.name AS index_name ,'--MISSING FOREIGN KEY--' ,NULL ,NULL ,NULL ,NULL ,NULL ,NULL ,NULL ,NULL ,NULL ,fk.fk_columns ,NULL ,NULL ,NULL ,NULL ,NULL FROM ForeignKeys fk LEFT OUTER JOIN preIndexAnalysis ia ON fk.object_id = ia.object_id AND ia.indexed_columns_compare LIKE fk.fk_columns_compare + '%' WHERE ia.index_name IS NULL ) SELECT schema_name ,table_name ,index_name ,type_desc ,size_in_mb ,mb.Buffer_MB ,CAST(100*mb.Buffer_MB/NULLIF(size_in_mb,0) AS decimal(6,2)) AS buffer_pct ,impact ,user_total ,user_total_pct ,estimated_percent ,user_seeks ,user_scans ,user_lookups ,indexed_columns ,included_columns ,duplicate_indexes ,overlapping_indexes ,related_foreign_keys FROM IndexAnalysis ia LEFT OUTER JOIN MemoryBuffer mb ON ia.object_id = mb.object_id AND ia.index_id = mb.index_id ORDER BY SUM(Buffer_MB ) OVER (PARTITION BY ia.object_id) desc, table_name, user_total desc, buffer_mb
NOTE: Populating the table name in the function for the @ObjectID variable is optional. If the value of @ObjectID is NULL then information on all tables is returned.
This version of the DMV query returns the following columns:
- schema_name: Schema name
- table_name: Table name
- index_name: Index Name
- type_desc: Type of index; either clustered or non-clustered.
- size_in_mb: Then number of MB the index utilizes on disk.
- buffer_mb: The number of MB in the SQL Server buffer being utilized by the index.
- buffer_pct: The percentage of an index that is current in the SQL Server buffer.
- impact: Calculation of impact of a potential index. This is based on the seeks and scans that the index could have utilized multiplied by average improvement the index would have provided. This is included only for missing indexes.
- user_total: Total number of seek, scan, and lookup operations for the index
- user_total_pct: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing indexes for the table.
- estimated_percent: Percentage of total number of seek, scan, and lookup operations for this index compared to all seek, scan, and lookup operations for existing and potential indexes for the table. This number is naturally skewed because a seek for potential Index A resulted in another operation on an existing index and both of these operations would be counted.
- user_seeks: Number of seek operations on the index.
- user_scans: Number of scan operations on the index.
- user_lookups: Number of lookup operations on the index.
- indexed_columns: Columns that are part of the index, missing index or foreign key.
- included_columns: Columns that are included in the index or missing index.
- duplicate_indexes: List of Indexes that exist on the table that are identical to the index on this row.
- overlapping_indexes: List of Indexes that exist on the table that overlap the index on this row.
- related_foreign_keys: List of foreign keys that are related to the index either as an exact match or covering index.
- related_foreign_keys_xml: XML document listing foreign keys that are related to the index either as an exact match or covering index.
To review the results of the new components added, I ran a query that caused a table scan against Person.Contact. As in the previous post, I’ve removed the columns from the image that do not pertain to this output.
The output shows that nearly the entire Person.Contact.PK_Contact_ContactID index has been moved into memory. The other two indexes are not utilizing that much space in memory.
One last thing, the ORDER BY statement has been changed to order all of the tables the total amount of space all of the indexes for the table are currently utilizing in memory. Through this the tables with the current highest buffer usage will bubble to the top.
Ok, one more “one last thing”… buffer utilizing of the index is highly dependent on the most recent usage of the database. Out of character queries can change the data in the buffer and skew results. A single execution of this index analysis should not be used to state definitively whether a table or index is moved into the buffer that often.
8 thoughts on “Analyze This – Analyze Your Indexes – Part 4”
Comments are closed.