One misconception that I see a lot deals with how data is stored in a clustered index. Specifically – is the data in a clustered index stored on the page in physical order? Most people will say “Yes”, and they get this from the definition (see BOL: Clustered and Nonclustered Indexes Described), which states:
Clustered indexes sort and store the data rows in the table or view based on their key values.
Does this mean that the data is stored in strict physical order, that is Row1 is always followed by Row2 which is always followed by Row3? Well, let us investigate this in a bit more depth.
How does a page work?
The first thing to understand is the layout of a data page. (A data page is what SQL Server uses to store the data in a heap (a table without a clustered index), and in the leaf level of a clustered index. If we look at BOL (Understanding Pages and Extents), it has a nice little diagram that shows the basic layout of a data page:
The key thing to point out in the data page is that the Row Offset Array (also known as the Slot Array) contains the offset on the page where that data row exists. The Slot Array will always be in the index key order, but depending on the data and how it changes, the actual row of data can move around on the page. (And yes, I do see that the sentence immediately prior to the picture that states “The entries in the row offset table are in reverse sequence from the sequence of the rows on the page.” But, as we will soon find out, this is only true when records are initially added to this page.)
Time to test!
So, let’s see what is happening. Let’s start off by creating a new database to play around in, and making a table in this database to use:
Next, insert 5 rows into the PageTest table. (Each row will use 500 of the 1000 characters available in the Col1 column.)
The next step is to get the page id for this table in the database:
Get the PagePID value for the row with a NOT NULL IAMFID, and examine the contents of that page using the DBCC PAGE command (Note that for my example, the PagePID has a value of 264 – if you get a different value, use your value in all of the subsesquent DBCC PAGE commands. Additionally, DBCC PAGE sends its results to the SQL Log. To get the results to return to the screen, turn trace flag 3604 on as shown below.)
(Note: DBCC PAGE is not documented in BOL, but you can find out how it works by clicking here.)
As you examine the contents of the page, pay particular attention to the slot array, and the page offset for where the row data starts at:
12345 Slot 0 Offset 0x60 Length 515Slot 1 Offset 0x263 Length 515Slot 2 Offset 0x466 Length 515Slot 3 Offset 0x669 Length 515Slot 4 Offset 0x86c Length 515
(Note that the array is zero-based, so the third row is in Slot 2.)
12345 Slot 0 Offset 0x60 Length 515Slot 1 Offset 0x263 Length 515Slot 2 Offset 0xa6f Length 1015Slot 3 Offset 0x669 Length 515Slot 4 Offset 0x86c Length 515
Filling the gap
This gap can be used by SQL if it attempts to add a new row into this page, and there is not enough room at the end of the page, but there is room on the page to fit this row. SQL will move all of the existing rows up to utilize the unused space, and add the new row at the end of the page where the consolidated unused space is now at. Let’s watch this in action by first adding several rows of data to the table (but just short of the number necessary to cause this consolidation) and examining the page again:
12345678910111213 Slot 0 Offset 0x60 Length 515Slot 1 Offset 0x263 Length 515Slot 2 Offset 0xa6f Length 1015Slot 3 Offset 0x669 Length 515Slot 4 Offset 0x86c Length 515Slot 5 Offset 0xe66 Length 515Slot 6 Offset 0x1069 Length 515Slot 7 Offset 0x126c Length 515Slot 8 Offset 0x146f Length 515Slot 9 Offset 0x1672 Length 515Slot 10 Offset 0x1875 Length 515Slot 11 Offset 0x1a78 Length 515Slot 12 Offset 0x1c7b Length 515
1234567891011121314 Slot 0 Offset 0x60 Length 515Slot 1 Offset 0x263 Length 515Slot 2 Offset 0x86c Length 1015Slot 3 Offset 0x466 Length 515Slot 4 Offset 0x669 Length 515Slot 5 Offset 0xc63 Length 515Slot 6 Offset 0xe66 Length 515Slot 7 Offset 0x1069 Length 515Slot 8 Offset 0x126c Length 515Slot 9 Offset 0x146f Length 515Slot 10 Offset 0x1672 Length 515Slot 11 Offset 0x1875 Length 515Slot 12 Offset 0x1a78 Length 515Slot 13 Offset 0x1c7b Length 515
What we have looked at so far is modifying the non-key data that causes the row location in the page to change. Based on what has been seen so far, we can see that the actual data was inserted at the unused space at the end of the page. The slot array is kept in the proper physical order. The key data hasn’t been changed, and the keys were automatically generated through an IDENTITY column, causing the slot array to be populated in the proper order. But things don’t always work this way.
Inserting non-sequential key data
Let’s look into what happens if we insert the data that is not in the order of the key – something that is likely to happen with a natural key that doesn’t have an ever-increasing value.
For this, let’s drop the table, and recreate it without the identity column. Insert some data with a gap in the key. Examine the slot array on the page, then insert the missing value and examine the slot array again.
1234 Slot 0 Offset 0x60 Length 515Slot 1 Offset 0x263 Length 515Slot 2 Offset 0x466 Length 515Slot 3 Offset 0x669 Length 515
12345 Slot 0 Offset 0x60 Length 515Slot 1 Offset 0x263 Length 515Slot 2 Offset 0x86c Length 515Slot 3 Offset 0x466 Length 515Slot 4 Offset 0x669 Length 515
The slot array is always maintained in key value order, and the rows on each page are those that should be on that page as reflected by the key value of the row. However, the physical storage of the row on that page is not necessarily maintained in key value order. Changes to data so that the data no longer fits into its original space, or inserts that are not in the key value order can cause the actual data to not be physically stored in the key order and will be stored at the beginning of the unused space at the end of the page. When the unused space is consolidated, existing rows are moved up the page, but they are not resorted.