While you may not need to worry about the physical location of a row very often, every so often the need comes up. In a few of my earlier posts (here), I have needed this information to prove a point. This post shows how to identify a row’s physical location. It will also show the various methods used to work with this value once returned.
Acquiring the physical location of a row
SQL Server 2008 introduced a new virtual system column: “%%physloc%%”. “%%physloc%%” returns the file_id, page_id and slot_id information for the current row, in a binary format. Thankfully, SQL Server also includes a couple of functions to split this binary data into a more useful format. Unfortunately, Microsoft has not documented either the column or the functions.
The first of these functions is “sys.fn_PhysLocFormatter”. This scalar function receives the %%physloc%% virtual column as an input. It splits the input apart into the file, page and slot information, and then returns the information in a column formatted in the form file_id:page_id:slot_id. This is the function that I used in the above posts.
The second function is “sys.fn_PhysLocCracker”. As a table-valued function, it is used either directly in the FROM clause or by using the cross apply operator. It also receives the %%physloc%% virtual column as an input, and returns the file_id, page_id and slot_id as columns in a table. In at least one of my previous posts, this would have been a better function to have used.
Example usage:
Let’s see some examples of using the virtual column and these functions. In the following script, we start off by creating a temporary table with 700 rows of data. The first query after that uses the sys.fn_PhysLocFormatter function, returning all of the physical location information about the row into one column. The subsequent query uses the sys.fn_PhysLocCracker function, returning the file_id, page_id and slot_id values from the physical location of the row. This allows these values to be used in subsequent processing – say maybe for automating DBCC Page?
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
IF OBJECT_ID('tempdb.dbo.#TestTable') IS NOT NULL DROP TABLE #TestTable; WITH Tens (N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), Hundreds(N) AS (SELECT 1 FROM Tens t1, Tens t2), Millions(N) AS (SELECT 1 FROM Hundreds t1 CROSS JOIN Hundreds t2 CROSS JOIN Hundreds t3), Tally (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM Millions) SELECT TOP (700) REPLICATE('Wayne''s World Rocks(' + RIGHT('000'+ CONVERT(VARCHAR(15), N), 3) + '!)', 48) AS Column1 INTO #TestTable FROM Tally; SELECT *, sys.fn_PhysLocFormatter(%%physloc%%) AS [File:Page:Slot] FROM #TestTable; SELECT * FROM #TestTable CROSS APPLY sys.fn_PhysLocCracker(%%physloc%%); |
This returns:
So there you go. %%physloc%% identifies the physical location of a row. Then the sys.fn_PhysLocFormatter function will decipher this and expose it in a composite form, while the sys.fn_PhysLocCracker function will decipher it and return the values as columns in a result set. You can use whichever one suits your needs.
Performance
When I ran these statements with the “Include Actual Execution Plans” option enabled, I saw:
The bottom statement has the infamous “Table Valued Function” operator in it. SQL Server uses this operator when dealing with a multi-statement table-valued function. If you happened to have read my recent post comparing functions, you will recall that these are the worst performing type of functions. So I decided to look at the code of these functions, to see if sys.fn_PhysLocCracker could be improved upon. Running sp_helptext on these functions returns the following code:
sys.fn_PhysLocFormatter
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
Text --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ------------------------------------------------------------------------------- -- Name: sys.fn_PhysLocFormatter -- -- Description: -- Formats the output of %%physloc%% virtual column -- -- Notes: ------------------------------------------------------------------------------- create function sys.fn_PhysLocFormatter (@physical_locator binary (8)) returns varchar (128) as begin declare @page_id binary (4) declare @file_id binary (2) declare @slot_id binary (2) -- Page ID is the first four bytes, then 2 bytes of page ID, then 2 bytes of slot -- select @page_id = convert (binary (4), reverse (substring (@physical_locator, 1, 4))) select @file_id = convert (binary (2), reverse (substring (@physical_locator, 5, 2))) select @slot_id = convert (binary (2), reverse (substring (@physical_locator, 7, 2))) return '(' + cast (cast (@file_id as int) as varchar) + ':' + cast (cast (@page_id as int) as varchar) + ':' + cast (cast (@slot_id as int) as varchar) + ')' end |
sys.fn_PhysLocCracker
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
Text --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ------------------------------------------------------------------------------- -- Name: sys.fn_PhysLocCracker -- -- Description: -- Cracks the output of %%physloc%% virtual column -- -- Notes: ------------------------------------------------------------------------------- create function sys.fn_PhysLocCracker (@physical_locator binary (8)) returns @dumploc_table table ( [file_id] int not null, [page_id] int not null, [slot_id] int not null ) as begin declare @page_id binary (4) declare @file_id binary (2) declare @slot_id binary (2) -- Page ID is the first four bytes, then 2 bytes of page ID, then 2 bytes of slot -- select @page_id = convert (binary (4), reverse (substring (@physical_locator, 1, 4))) select @file_id = convert (binary (2), reverse (substring (@physical_locator, 5, 2))) select @slot_id = convert (binary (2), reverse (substring (@physical_locator, 7, 2))) insert into @dumploc_table values (@file_id, @page_id, @slot_id) return end |
It seems obvious to me that the multi-statement function was created by cloning the scalar function. Furthermore, this code can be improved upon and converted into an inline table-valued function. An inline table-valued function for this functionality would look like this:
1 2 3 4 5 6 7 |
CREATE FUNCTION sys.fn_PhysLocCracker (@physical_locator binary (8)) RETURNS TABLE AS RETURN SELECT CONVERT(INTEGER, CONVERT (BINARY (4), REVERSE (SUBSTRING (@physical_locator, 1, 4)))) AS page_id, CONVERT(INTEGER, CONVERT (BINARY (2), REVERSE (SUBSTRING (@physical_locator, 5, 2)))) AS file_id, CONVERT(INTEGER, CONVERT (BINARY (2), REVERSE (SUBSTRING (@physical_locator, 7, 2)))) AS slot_id; |
Now, since this is an undocumented function, I don’t expect that this function will ever be upgraded. But if you ever need it, you could create it yourself. Alternatively, you could just code it into the cross apply operator yourself, like so:
1 2 3 4 5 |
SELECT * FROM #TestTable CROSS APPLY (SELECT CONVERT(INTEGER, CONVERT (BINARY (4), REVERSE (SUBSTRING (%%physloc%%, 1, 4)))) AS page_id, CONVERT(INTEGER, CONVERT (BINARY (2), REVERSE (SUBSTRING (%%physloc%%, 5, 2)))) AS file_id, CONVERT(INTEGER, CONVERT (BINARY (2), REVERSE (SUBSTRING (%%physloc%%, 7, 2)))) AS slot_id) ca; |
Which will return the same information, without the performance inhibiting multi-statement table-valued function being used.