A while back, I wrote a blog post showing the differences (including performance) between Inline Table-Valued Functions (ITVF) and Multi-Statement Table-Valued Functions (MSTVF). Since then, whenever I speak about ITVFs / MSTVFs, I’m frequently asked about how Scalar Functions (SF) perform. My stock answer has always been that they perform similarly to a MSTVF, and that you should just try to use ITVFs whenever possible.
In this post, I’m going to do a head-to-head-to-head comparison of an ITVF, MSTVF and SF, all performing the same task. To show just the performance difference, the code inside the three functions is the same – the only differences is that necessary for that specific type of function.
For this test, I’m going to create a table with a VARCHAR(30) column. This column will hold numeric and some non-numeric characters. The test will be to return just the numeric characters from this string, in the order that they appear in the string.
The Test Data
To do a true performance test, you can’t test against just a few rows… you need to test against a bunch. So, let’s create a million row test table. I’ll do this in tempdb. The following script will create a column with numeric and non-numeric characters in it.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
USE tempdb; GO IF OBJECT_ID('tempdb.dbo.#temp1', 'U') IS NOT NULL DROP TABLE #temp1; WITH cteSymbols AS ( SELECT CharSymbol FROM (VALUES ('('), ('('), ('-'), (' '), ('-('), (')-'), ('.'), (','), ('/'), ('@')) dt (CharSymbol) ) SELECT TOP (1000000) IDENTITY(INTEGER) AS RowID, CONVERT(VARCHAR(30), s1.CharSymbol + CONVERT(VARCHAR(8), ABS(so1.object_id) % 10000000,0) + s2.CharSymbol + CONVERT(VARCHAR(8), ABS(so2.object_id) % 10000000,0) + s3.CharSymbol ) AS StringOfNumbersWithNonNumbers INTO #temp1 FROM cteSymbols s1 CROSS JOIN cteSymbols s2 CROSS JOIN cteSymbols s3 CROSS JOIN sys.all_objects so1 CROSS JOIN sys.all_objects so2; -- let's look at a few rows to see what we have SELECT * FROM #temp1 WHERE RowID <= 10; GO |
A Sampling of the output to see what we’re working with:
RowID | StringOfNumbersWithNonNumbers |
---|---|
1 | (4925397(4925397( |
2 | (8054770(4925397( |
3 | (2054713(4925397( |
4 | (6054656(4925397( |
5 | (6925511(4925397( |
6 | (925454(4925397( |
7 | (925169(4925397( |
8 | (2317107(4925397( |
9 | (9578706(4925397( |
10 | (5578763(4925397( |
The Functions
The next step is to create the functions that the test will utilize:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 |
-- Inline table valued function IF OBJECT_ID('dbo.ITVF_TEST') IS NOT NULL DROP FUNCTION dbo.ITVF_TEST; GO CREATE FUNCTION dbo.ITVF_TEST (@Input VARCHAR(30)) RETURNS TABLE WITH SCHEMABINDING AS RETURN SELECT (SELECT SUBSTRING(@Input,N,1) FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30)) AS x(N) WHERE N<=LEN(@Input) AND SUBSTRING(@Input,N,1) LIKE ('[0-9]') ORDER BY N FOR XML PATH(''), TYPE).value('.','VARCHAR(30)') AS StringNumbersOnly; GO -- Multi-statement table valued function IF OBJECT_ID('dbo.MSTVF_TEST') IS NOT NULL DROP FUNCTION dbo.MSTVF_TEST; GO CREATE FUNCTION dbo.MSTVF_TEST (@Input VARCHAR(30)) RETURNS @Output TABLE (StringNumbersOnly VARCHAR(30)) WITH SCHEMABINDING AS BEGIN INSERT INTO @Output (StringNumbersOnly) SELECT (SELECT SUBSTRING(@Input,N,1) FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30)) AS x(N) WHERE N<=LEN(@Input) AND SUBSTRING(@Input,N,1) LIKE ('[0-9]') ORDER BY N FOR XML PATH(''), TYPE).value('.','VARCHAR(30)') AS StringNumbersOnly; RETURN; END; GO -- Scalar function IF OBJECT_ID('dbo.SF_TEST') IS NOT NULL DROP FUNCTION dbo.SF_TEST; GO CREATE FUNCTION dbo.SF_TEST (@Input VARCHAR(30)) RETURNS VARCHAR(30) WITH SCHEMABINDING AS BEGIN DECLARE @Output VARCHAR(30); SELECT @Output = (SELECT SUBSTRING(@Input,N,1) FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30)) AS x(N) WHERE N<=LEN(@Input) AND SUBSTRING(@Input,N,1) LIKE ('[0-9]') ORDER BY N FOR XML PATH(''), TYPE).value('.','VARCHAR(30)'); RETURN @Output; END; GO |
Now that we have the data and functions, let’s look at what the functions are returning, and ensure that they are returning the same values:
1 2 3 4 5 6 7 8 9 10 |
SELECT t.RowID, t.StringOfNumbersWithNonNumbers AS StringOfNumbers, dbo.SF_TEST(t.StringOfNumbersWithNonNumbers) AS ScalarFunction, MSTVF.StringNumbersOnly AS MultiStatementTVF, ITVF.StringNumbersOnly AS InlineTVF FROM #temp1 t CROSS APPLY dbo.MSTVF_TEST(t.StringOfNumbersWithNonNumbers) MSTVF CROSS APPLY dbo.ITVF_TEST(t.StringOfNumbersWithNonNumbers) ITVF WHERE t.RowID <= 10 ORDER BY t.RowID; |
StringOfNumbers | ScalarFunction | MultiStatementTVF | InlineTVF |
---|---|---|---|
(4925397(4925397( | 49253974925397 | 49253974925397 | 49253974925397 |
(8054770(4925397( | 80547704925397 | 80547704925397 | 80547704925397 |
(2054713(4925397( | 20547134925397 | 20547134925397 | 20547134925397 |
(6054656(4925397( | 60546564925397 | 60546564925397 | 60546564925397 |
(6925511(4925397( | 69255114925397 | 69255114925397 | 69255114925397 |
(925454(4925397( | 9254544925397 | 9254544925397 | 9254544925397 |
(925169(4925397( | 9251694925397 | 9251694925397 | 9251694925397 |
(2317107(4925397( | 23171074925397 | 23171074925397 | 23171074925397 |
(9578706(4925397( | 95787064925397 | 95787064925397 | 95787064925397 |
(5578763(4925397( | 55787634925397 | 55787634925397 | 55787634925397 |
The Performance Test
We can see that the data being returned by each of the functions is the same, so it’s time to compare the performance of the functions. This test will create a table to hold times for each function and then will run each function individually against all of the rows in the table (inserting into the table just before and immediately after running the query). The results of the query calling the function will be sent to a new temporary table. When the testing is finished, we’ll get the duration for each function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 |
-- PERFORMANCE TESTING TIME!!! -- table to store the times: IF OBJECT_ID('tempdb.dbo.#temp2') IS NOT NULL DROP TABLE #temp2; CREATE TABLE #temp2 ( RowID INTEGER IDENTITY, FunctionName sysname, ActionDateTime DATETIME2(7) NOT NULL DEFAULT(SYSDATETIME())); --------- INLINE TABLE-VALUED FUNCTION ----------- IF OBJECT_ID('tempdb.dbo.#ITVF_RESULTS') IS NOT NULL DROP TABLE #ITVF_RESULTS; INSERT INTO #temp2 (FunctionName) VALUES ('ITVF_TEST'); SELECT * INTO #ITVF_RESULTS FROM #temp1 CROSS APPLY dbo.ITVF_TEST(StringOfNumbersWithNonNumbers) INSERT INTO #temp2 (FunctionName) VALUES ('ITVF_TEST'); --------- MULTI_STATEMENT TABLE-VALUED FUNCTION ----------- IF OBJECT_ID('tempdb.dbo.#MSTVF_RESULTS') IS NOT NULL DROP TABLE #MSTVF_RESULTS; INSERT INTO #temp2 (FunctionName) VALUES ('MSTVF_TEST'); SELECT * INTO #MSTVF_RESULTS FROM #temp1 CROSS APPLY dbo.MSTVF_TEST(StringOfNumbersWithNonNumbers) INSERT INTO #temp2 (FunctionName) VALUES ('MSTVF_TEST'); --------- SCALAR FUNCTION ------------ IF OBJECT_ID('tempdb.dbo.#SF_Results') IS NOT NULL DROP TABLE #SF_Results; INSERT INTO #temp2 (FunctionName) VALUES ('SF_TEST'); SELECT *, dbo.SF_TEST(StringOfNumbersWithNonNumbers) AS StringNumbersOnly INTO #SF_Results FROM #temp1; INSERT INTO #temp2 (FunctionName) VALUES ('SF_TEST'); ---------- Show the testing results ----------- WITH cte AS ( SELECT t.FunctionName, DATEDIFF(MICROSECOND, MIN(t.ActionDateTime), MAX(t.ActionDateTime)) AS [Duration (microseconds)] FROM #temp2 t GROUP BY t.FunctionName ) SELECT FunctionName, [Duration (microseconds)], CONVERT(NUMERIC(5,2), (cte.[Duration (microseconds)] * 1.0 / SUM(cte.[Duration (microseconds)]) OVER () * 1.0) * 100.0) AS PercentOfBatch FROM cte ORDER BY [Duration (microseconds)]; |
The Results
FunctionName | Duration (microseconds) | PercentOfBatch |
---|---|---|
ITVF_TEST | 11,961,481 | 4.36 |
SF_TEST | 86,042,922 | 31.37 |
MSTVF_TEST | 176,248,081 | 64.26 |
Which one of these would you prefer to be using? The ITVF is clearly the best performing. However, I expected to see the SF and the MSTVF being closer – the SF is twice as fast as the MSTVF. Notice that the MSTVF is an entire order of magnitude slower than the ITVF!
But Wait, There’s More!
No, this isn’t a commercial. But before we draw a conclusion, let’s try another test. For this test, let’s just do some simple string concatenation. To start off, we’ll make a test table with three columns, each with 5 random characters. The functions will just concatenate these columns together, with a space between the first two columns and a comma/space between the last two columns (this can be related to building a city+state+zip address line):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 |
USE tempdb; GO IF OBJECT_ID('tempdb.dbo.#temp1') IS NOT NULL DROP TABLE #temp1; WITH cte (Col) AS ( -- get random characters SELECT CHAR(ABS(CHECKSUM(NEWID())) % 26 + 65) FROM sys.all_columns ) SELECT TOP (1000000) t1.Col + t2.Col + t3.Col + t4.col + t5.Col AS Col1, t5.Col + t3.col + t1.col + t2.col + t4.col AS Col2, t3.Col + t1.Col + t4.Col + t5.Col + t2.Col AS Col3 INTO #temp1 FROM cte t1 CROSS JOIN cte t2 CROSS JOIN cte t3 CROSS JOIN cte t4 CROSS JOIN cte t5; GO IF OBJECT_ID('dbo.ITVF_Test') IS NOT NULL DROP FUNCTION dbo.ITVF_Test; IF OBJECT_ID('dbo.MSTVF_Test') IS NOT NULL DROP FUNCTION dbo.MSTVF_Test; IF OBJECT_ID('dbo.SF_Test') IS NOT NULL DROP FUNCTION dbo.SF_Test; GO CREATE FUNCTION dbo.ITVF_Test (@Col1 VARCHAR(5), @Col2 VARCHAR(5), @Col3 VARCHAR(5)) RETURNS TABLE WITH SCHEMABINDING AS RETURN SELECT @Col1 + ' ' + @Col2 + ', ' + @Col3 AS ReturnString; GO CREATE FUNCTION dbo.MSTVF_Test (@Col1 VARCHAR(5), @Col2 VARCHAR(5), @Col3 VARCHAR(5)) RETURNS @Results TABLE (ReturnString VARCHAR(20)) WITH SCHEMABINDING AS BEGIN INSERT INTO @Results (ReturnString) SELECT @Col1 + ' ' + @Col2 + ', ' + @Col3 AS ReturnString; RETURN END GO CREATE FUNCTION dbo.SF_Test (@Col1 VARCHAR(5), @Col2 VARCHAR(5), @Col3 VARCHAR(5)) RETURNS VARCHAR(20) WITH SCHEMABINDING AS BEGIN DECLARE @ReturnString VARCHAR(20); SET @ReturnString = @Col1 + ' ' + @Col2 + ', ' + @Col3; RETURN @ReturnString; END GO |
For an explanation of how the randomization works, please see this article by SQL Server MVP Jeff Moden.
Once again, let’s compare the results of the functions:
1 2 3 4 5 6 7 8 |
-- Compare results SELECT TOP (10) ITVF.ReturnString AS ITVF, MSTVF.ReturnString AS MSTVF, dbo.SF_Test(Col1, Col2, Col3) AS SF FROM #temp1 CROSS APPLY dbo.ITVF_Test(Col1, Col2, Col3) ITVF CROSS APPLY dbo.MSTVF_Test(Col1, Col2, Col3) MSTVF; |
Which produces these results
ITVF | MSTVF | SF |
---|---|---|
OFOPV VOOFP, OOPVF | OFOPV VOOFP, OOPVF | OFOPV VOOFP, OOPVF |
VFAMP PAVFM, AVMPF | VFAMP PAVFM, AVMPF | VFAMP PAVFM, AVMPF |
MODCD DDMOC, DMCDO | MODCD DDMOC, DMCDO | MODCD DDMOC, DMCDO |
BQQCQ QQBQC, QBCQQ | BQQCQ QQBQC, QBCQQ | BQQCQ QQBQC, QBCQQ |
ZRBSB BBZRS, BZSBR | ZRBSB BBZRS, BZSBR | ZRBSB BBZRS, BZSBR |
SKIRF FISKR, ISRFK | SKIRF FISKR, ISRFK | SKIRF FISKR, ISRFK |
BZRWR RRBZW, RBWRZ | BZRWR RRBZW, RBWRZ | BZRWR RRBZW, RBWRZ |
MPWHR RWMPH, WMHRP | MPWHR RWMPH, WMHRP | MPWHR RWMPH, WMHRP |
JLAWD DAJLW, AJWDL | JLAWD DAJLW, AJWDL | JLAWD DAJLW, AJWDL |
VZBKA ABVZK, BVKAZ | VZBKA ABVZK, BVKAZ | VZBKA ABVZK, BVKAZ |
Okay, the functions are returning the same data, so let’s test their performance.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 |
-- PERFORMANCE TESTING TIME!!! -- table to store the times: IF OBJECT_ID('tempdb.dbo.#temp2') IS NOT NULL DROP TABLE #temp2; CREATE TABLE #temp2 ( RowID INTEGER IDENTITY, FunctionName sysname, ActionDateTime DATETIME2(7) NOT NULL DEFAULT(SYSDATETIME())); --------- INLINE TABLE-VALUED FUNCTION ----------- IF OBJECT_ID('tempdb.dbo.#ITVF_RESULTS') IS NOT NULL DROP TABLE #ITVF_RESULTS; INSERT INTO #temp2 (FunctionName) VALUES ('ITVF_TEST'); SELECT * INTO #ITVF_RESULTS FROM #temp1 CROSS APPLY dbo.ITVF_TEST(Col1, Col2, Col3) INSERT INTO #temp2 (FunctionName) VALUES ('ITVF_TEST'); --------- MULTI_STATEMENT TABLE-VALUED FUNCTION ----------- IF OBJECT_ID('tempdb.dbo.#MSTVF_RESULTS') IS NOT NULL DROP TABLE #MSTVF_RESULTS; INSERT INTO #temp2 (FunctionName) VALUES ('MSTVF_TEST'); SELECT * INTO #MSTVF_RESULTS FROM #temp1 CROSS APPLY dbo.MSTVF_TEST(Col1, Col2, Col3) INSERT INTO #temp2 (FunctionName) VALUES ('MSTVF_TEST'); --------- SCALAR FUNCTION ------------ IF OBJECT_ID('tempdb.dbo.#SF_Results') IS NOT NULL DROP TABLE #SF_Results; INSERT INTO #temp2 (FunctionName) VALUES ('SF_TEST'); SELECT *, dbo.SF_TEST(Col1, Col2, Col3) AS StringNumbersOnly INTO #SF_Results FROM #temp1; INSERT INTO #temp2 (FunctionName) VALUES ('SF_TEST'); ---------- Show the testing results ----------- WITH cte AS ( SELECT t.FunctionName, DATEDIFF(MICROSECOND, MIN(t.ActionDateTime), MAX(t.ActionDateTime)) AS [Duration (microseconds)] FROM #temp2 t GROUP BY t.FunctionName ) SELECT FunctionName, [Duration (microseconds)], CONVERT(NUMERIC(5,2), (cte.[Duration (microseconds)] * 1.0 / SUM(cte.[Duration (microseconds)]) OVER () * 1.0) * 100.0) AS PercentOfBatch FROM cte ORDER BY [Duration (microseconds)]; |
The results are:
FunctionName | Duration (microseconds) | PercentOfBatch |
ITVF_TEST | 371,022 | 0.40 |
SF_TEST | 3,356,195 | 3.63 |
MSTVF_TEST | 88,856,118 | 95.97 |
Wow, this is even more impressive! Once again, the ITVF blows the other types of functions out of the water, being one order of magnitude faster than the SF and two (almost three) orders of magnitude faster than the MSTVF. The SF readily outperforms the MSTVF, being twenty-five times faster.
The Wrap Up
In the examples shown here, the ITVF clearly performs much better than either an SF or an MSTVF, and the MSTVF is also clearly the worst-performing type of function, being orders of magnitude slower. However, as with all performance testing, you should test all three against a nice, large set of data (at least one million rows) for what you need the function to do, and utilize the best one for your needs.
I guess I’ll be changing my stock answer now… an SF is better than an MSTVF, but not as good as an ITVF.