Index Black Ops Part 5 – Page Splits

Posted by & filed under , , , , , , .

In my TSQL2sDay index summary post, that I’d be writing a few posts on the information that is contained in sys.dm_db_index_operational_stats. The posts are  the following:

For the current post, we are going to look at page splits.  A page split occurs with a data page needs to be updated but the amount of data being placed back on the page exceeds the amount of room.  To make room for the increased amount of data, SQL Server will add a new page to the index and move a portion of the data from the existing page to the new page.

Unfortunately, when page splits occur they create writes and locks that can sometimes affect the concurrency of the database.  SQL Server has to write to new pages and lock the existing pages which can lead to performance issues in some systems.

Tracking Page Splits

For the current post, though, I’m not going to be discussing the how and why of page splits.  The goal is to show where page splits cause values in sys.dm_db_index_operational_stats to increase.

For this we are going to look at a couple columns that were included in the last post in this series.  Those columns are:

  • leaf_allocation_count (bigint) – Cumulative count of leaf-level page allocations in the index or heap.  For an index, a page allocation corresponds to a page split.
  • nonleaf_allocation_count (bigint) – Cumulative count of page allocations caused by page splits above the leaf level.

As the definition states, these columns accumulate the the lead and non-leaf allocations that have been allocated to an index.  When inserts and updates increase the size of the index the number of pages added will be tallied in these columns  And when rows are updated and the updates cause page splits the allocation count will increase.  The definition also states that these columns accumulate data that corresponds to page allocations.

Increment Allocation

Let’s investigate how this data is incremented.  To start with lets take a look at how some typical activity on a table can increment the columns leaf_allocation_count and non_leaf_allocation_count.  To do this execute the following script:


USE tempdb;
GO
IF OBJECT_ID('dbo.PageSplits') IS NOT NULL
DROP TABLE dbo.PageSplits;
CREATE TABLE dbo.PageSplits
(
ID int
,Value varchar(900)
,CreateDate datetime
,CONSTRAINT PK_IndexMaintenance PRIMARY KEY (ID)
);
CREATE INDEX IX_PageSplits_Value ON dbo.PageSplits (Value);
WITH
l0 AS (SELECT 0 AS C UNION ALL SELECT 0),
l1 AS (SELECT 0 AS C FROM L0 AS A CROSS JOIN L0 AS B),
l2 AS (SELECT 0 AS C FROM l1 AS A CROSS JOIN l1 AS B),
l3 AS (SELECT 0 AS C FROM l2 AS A CROSS JOIN l2 AS B),
l4 AS (SELECT 0 AS C FROM l3 AS A CROSS JOIN l3 AS B),
l5 AS (SELECT 0 AS C FROM l4 AS A CROSS JOIN l4 AS B),
nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS N FROM l5)
INSERT INTO dbo.PageSplits
SELECT TOP (10000)
n, REPLICATE('X',200), GETDATE()
FROM nums
ORDER BY 2;
SELECT
OBJECT_SCHEMA_NAME(ios.object_id) + '.' + OBJECT_NAME(ios.object_id) as table_name
,i.name as index_name
,leaf_allocation_count
,nonleaf_allocation_count
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.PageSplits'),NULL, NULL) ios
INNER JOIN sys.indexes i ON i.object_id = ios.object_id AND i.index_id = ios.index_id;
SELECT
OBJECT_SCHEMA_NAME(ips.object_id) + '.' + OBJECT_NAME(ips.object_id) as table_name
,ips.avg_fragmentation_in_percent
,ips.fragment_count
,page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.PageSplits'),NULL, NULL, 'LIMITED') ips

In this script a table was created and the 10,000 rows were added to the table that has clustered and non-clustered indexes.  The next query reports the data accumulated in sys.dm_db_index_operational_stats for the leaf_allocation_count and nonleaf_allocation_count columns.  With the last query reporting the fragmentation and the number of pages allocated to the table.

The results should look like this:

image

We can see that 286 leaf pages for the clustered index and 271 leaf pages for the non-clustered index have been allocated to the indexes.  From sys.dm_db_index_physical_stats we see that identical numbers of pages are associated with the indexes.  Also, there is some fragmentation listed but not an amount that is of importance.

More Rows, More of the Same

The section title is likely a give away, but lets show the effect of adding more rows to the table.  Execute the next script:


WITH
l0 AS (SELECT 0 AS C UNION ALL SELECT 0),
l1 AS (SELECT 0 AS C FROM L0 AS A CROSS JOIN L0 AS B),
l2 AS (SELECT 0 AS C FROM l1 AS A CROSS JOIN l1 AS B),
l3 AS (SELECT 0 AS C FROM l2 AS A CROSS JOIN l2 AS B),
l4 AS (SELECT 0 AS C FROM l3 AS A CROSS JOIN l3 AS B),
l5 AS (SELECT 0 AS C FROM l4 AS A CROSS JOIN l4 AS B),
nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS N FROM l5)
INSERT INTO dbo.PageSplits
SELECT TOP (10000)
n + 10000, REPLICATE('X',200), GETDATE()
FROM nums
ORDER BY 2;
SELECT
OBJECT_SCHEMA_NAME(ios.object_id) + '.' + OBJECT_NAME(ios.object_id) as table_name
,i.name as index_name
,leaf_allocation_count
,nonleaf_allocation_count
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.PageSplits'),NULL, NULL) ios
INNER JOIN sys.indexes i ON i.object_id = ios.object_id AND i.index_id = ios.index_id;
SELECT
OBJECT_SCHEMA_NAME(ips.object_id) + '.' + OBJECT_NAME(ips.object_id) as table_name
,ips.avg_fragmentation_in_percent
,ips.fragment_count
,page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.PageSplits'),NULL, NULL, 'LIMITED') ips

This script has a similar output as the first script which will look like the following:

image

And as you would expect, the number of leaf pages allocated for both indexes increases along with the page counts.  Since the new rows are after the first set of rows inserted there isn’t any additional fragmentation encountered.

Let’s Page Split

Now let’s get to the juice.  We want to see how the leaf allocation columns can be used to monitor page split activity.  To cause some page splits to occur, we’ll increase the data in value from 200 to 450 characters.

Run the next script to cause some page splits:


UPDATE dbo.PageSplits
SET Value =  REPLICATE('X',450)
WHERE ID%5 =1
SELECT
OBJECT_SCHEMA_NAME(ios.object_id) + '.' + OBJECT_NAME(ios.object_id) as table_name
,i.name as index_name
,leaf_allocation_count
,nonleaf_allocation_count
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.PageSplits'),NULL, NULL) ios
INNER JOIN sys.indexes i ON i.object_id = ios.object_id AND i.index_id = ios.index_id
SELECT
OBJECT_SCHEMA_NAME(ips.object_id) + '.' + OBJECT_NAME(ips.object_id) as table_name
,ips.avg_fragmentation_in_percent
,ips.fragment_count
,page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.PageSplits'),NULL, NULL, 'LIMITED') ips

As you can see in the results below, a number of pages were allocated to both indexes.  This allocation coincides with an increase in fragmentation.  This fragmentation was a result of the page splits that occurred on the indexes.  Thus as pages were needed by the index to accommodate for the page splits, the page allocations were tracked by sys.dm_db_index_operational_stats.

image

Fragmentation, Cha, Cha, Cha

Typically when there is a large amount of fragmentation on an index, we’ll be good DBAs and defragment the index to remove the fragmentation.  The fragmentation on the clustered index is over 99.9% so there is definitely a need for defragmentation.

Let’s do that now with the following script:


ALTER INDEX PK_IndexMaintenance ON dbo.PageSplits REORGANIZE;
ALTER INDEX IX_PageSplits_Value ON dbo.PageSplits REORGANIZE;
SELECT
OBJECT_SCHEMA_NAME(ios.object_id) + '.' + OBJECT_NAME(ios.object_id) as table_name
,i.name as index_name
,leaf_allocation_count
,nonleaf_allocation_count
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.PageSplits'),NULL, NULL) ios
INNER JOIN sys.indexes i ON i.object_id = ios.object_id AND i.index_id = ios.index_id
SELECT
OBJECT_SCHEMA_NAME(ips.object_id) + '.' + OBJECT_NAME(ips.object_id) as table_name
,ips.avg_fragmentation_in_percent
,ips.fragment_count
,page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.PageSplits'),NULL, NULL, 'LIMITED') ips

Viola!  The fragmentation is reduced in the index and number of pages for each index reported by the second query is reduced.  The same amount of data on fewer pages.

Looking at the first set of results, though, the number of pages allocated to the index has increased again.  Not only does the allocation rise when pages are allocated for page splits, but same as when pages are reorganized on the index during defragmentation.

image

Moar Page Splits and Fragmentation

Let’s now cause some more fragmentation for one last item to show about these columns.  Run the following script:


UPDATE dbo.PageSplits
SET Value =  REPLICATE('X',900)
WHERE ID%5 =1
SELECT
OBJECT_SCHEMA_NAME(ios.object_id) + '.' + OBJECT_NAME(ios.object_id) as table_name
,i.name as index_name
,leaf_allocation_count
,nonleaf_allocation_count
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.PageSplits'),NULL, NULL) ios
INNER JOIN sys.indexes i ON i.object_id = ios.object_id AND i.index_id = ios.index_id
SELECT
OBJECT_SCHEMA_NAME(ips.object_id) + '.' + OBJECT_NAME(ips.object_id) as table_name
,ips.avg_fragmentation_in_percent
,ips.fragment_count
,page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.PageSplits'),NULL, NULL, 'LIMITED') ips

As the results show below, there were some page splits as evidenced by the fragmentation and a large number of pages were allocated to the indexes.  Not really exciting because it was expected.

image

Need to Rebuild

In the previous script to remove fragmentation from the indexes, the REORGANIZE option was used.  It isn’t always best to reorganize an index but it might be better to rebuild the index at times.  Most maintenance scripts examine thresholds to determine when it’s best to REORGANIZE an index or when to REBUILD it.

For the final script we’ll rebuild the two indexes:


ALTER INDEX PK_IndexMaintenance ON dbo.PageSplits REBUILD;
ALTER INDEX IX_PageSplits_Value ON dbo.PageSplits REBUILD;
SELECT
OBJECT_SCHEMA_NAME(ios.object_id) + '.' + OBJECT_NAME(ios.object_id) as table_name
,i.name as index_name
,leaf_allocation_count
,nonleaf_allocation_count
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.PageSplits'),NULL, NULL) ios
INNER JOIN sys.indexes i ON i.object_id = ios.object_id AND i.index_id = ios.index_id
SELECT
OBJECT_SCHEMA_NAME(ips.object_id) + '.' + OBJECT_NAME(ips.object_id) as table_name
,ips.avg_fragmentation_in_percent
,ips.fragment_count
,page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.PageSplits'),NULL, NULL, 'LIMITED') ips

And the results will now show exactly what we probably weren’t expecting.  Instead of showing the number of pages that were allocated to the index before and after the rebuild, the number of leaf allocations has been reduce to 0.

image

For all intents and purposes the statistics for these columns were wiped out.  With the REBUILD operation, the index was recreated and replaced the existing index.  That statistics are not carried over from the first to the second physical structures.

Conclusion

Based on the activity we’ve seen, the leaf and non-leaf allocations in sys.dm_db_index_operational_stats incremented on more than just page split activities.  The description from Books On Line regarding page splits is a bit deceptive.

The fact that these columns account for more than just page splits is not a terrible thing.  When I consider these columns, I think of them as an accumulation of the how much the index is breathing.  This breathing is measured by the page splits (inhales) and the reorganization of pages (exhales).  The more it breathes, the more IO that the index requires from day-to-day to support the index.  A measurement of this activity is important when considering how to design and maintain a database.

The third point is that the numbers in this DMV have to be taken with a grain of salt.  Typical index maintenance will skew the results when comparing one index to another index.  Reviews of indexes needs to accommodate for this.  Taking snapshots of the index statistics from time to time will help provide meaningful insights into the data this DMV provides.

Last point – if you want a dedicated column to accumulate the page splits on an index then I’d encourage you to up-vote this connect item.

  • Scott R.

    Jason,

    I have enjoyed your series on Index Black OPS. Very informative. I especially like the breathing analogy, with page splits as inhaling and index reorgs as exhaling – clever!

    A minor point to consider: In this post #5 on Page Splits, there is a section just before the Conclusion near the end that reads:

    For all intents and purposes the statistics for these columns were wiped out. … That statistics are not carried over from the first to the second physical structures.

    And in the Conclusions section:

    Taking snapshots of the index statistics from time to time will help provide meaningful insights into the data this DMV provides.

    I had to read this part several times to get your meaning. By “statistics”, I believe you mean the DMV column values that track the index page splits, etc. that you demonstrated throughout the post. However, I (and possibly others) may take a first reading of “statistics” as DB statistics used by the query optimizer for the index or other columns of the table being “wiped out” and “not carried over”. In fact, query optimizer DB statistics are fully rebuilt by such an index rebuild, and as high quality full scan statistics (versus lower quality sampled statistics) to boot! I eventually understood your different meaning of “statistics” here, referring to the “DMV statistics” versus the “DB statistics” that are also indirectly related to this topic.

    You may want to consider a minor revision to this section to bring better clarity for future readers – possibly changing references of “statistics” or “index statistics” to “DMV index statistics” or “DMV statistics”, to avoid ambiguity with “query optimizer DB statistics”. Or it could be just me misinterpreting those terms.

    I don’t recall if your earlier series posts mentioned the impact of index rebuilds on DB statistics “freshness”. Perhaps that related topic is forthcoming in a future series post.

    I mention this not as a dig, but because I enjoyed the post series so much that I didn’t want others to be confused by this small part of the post. I really think your series will benefit many readers at different skill levels.

    Thank you for taking the time and effort to author this series – Bravo!

    Scott R.

    • http://feeds2.feedburner.com/StrateSql jstrate

      Thanks for the feedback. Some of the points made in later posts weren't made in earlier posts. I did plan to edit and update these over time if people found them as interesting as I did while writing them. It seems that people have found them useful so I wouldn't be surprised to see some of your suggestions make it into edits in these posts. Thanks again.

  • Pingback: Tweets that mention Index Black Ops Part 5 – Page Splits | Strate SQL -- Topsy.com

  • Pingback: Index Black Ops Part 6 – Fill Factor vs. Page Splits | Strate SQL

  • Pingback: Index Black Ops Part 4 – Index Overhead and Maintenance | Strate SQL

  • Pingback: SQL Server Central