Comparing Inline and Multi-Statement Table-Valued Functions

2012-02-15 Wayne Sheffield 7 comments

Table-Valued Functions. What a wonderful addition to SQL they make. They take parameters, do some work, and return a result set that can be used in queries. You can select directly against them, or use them with the APPLY operator. These are truly versatile additions to SQL, and since you can pass parameters to them, they are like a parameterized view. And we have two different types to work with: Inline Table-Valued Functions (ITVF) and Multi-Statement Table-Valued Functions (MTVF).

But how do they compare with each other? Well, let’s start off by looking at the syntax of each:

CREATE FUNCTION Util.MyITVFunction (@Parameters INT)
RETURNS TABLE
AS
RETURN
SELECT TOP (@Parameters) N
  FROM Util.Tally
 ORDER BY N;
GO

CREATE FUNCTION Util.MyMTVFunction (@Parameters INT)
RETURNS @FunctionResultTableVariable TABLE (N INT)
AS
BEGIN
  INSERT INTO @FunctionResultTableVariable (N)
  SELECT TOP (@Parameters) N
    FROM Util.Tally
   ORDER BY N;
  RETURN;
END
GO

The differences in syntax are these. First, an MTVF must declare the table variable that is to be returned. Second, the MTVF must have a BEGIN/END block. Third, inside the BEGIN/END block you need code that populates the table variable. And finally, you return from the function. In comparison, the ITVF just returns a SELECT statement: there is no table variable to mess around with, no inserts, no code blocks. Just a SELECT statement.
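
Both kinds of function are called the same way. The sketch below assumes that Util.Tally is a plain numbers table with a single INT column N; the post does not show its definition, so the setup portion is a guess, and the VALUES list feeding APPLY is purely illustrative.

```sql
-- Step 1 (assumed setup, run BEFORE creating the two functions above).
-- Util.Tally is not defined in the post; this is one plausible shape for it.
CREATE SCHEMA Util;
GO
CREATE TABLE Util.Tally (N INT NOT NULL PRIMARY KEY);
GO
INSERT INTO Util.Tally (N)
SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
  FROM sys.all_columns AS a
       CROSS JOIN sys.all_columns AS b;
GO

-- Step 2 (run AFTER creating the two functions above).
-- Select directly against either function.
SELECT N FROM Util.MyITVFunction(5);
SELECT N FROM Util.MyMTVFunction(5);

-- Or feed the parameter from each row of another row source with APPLY.
SELECT v.HowMany, f.N
  FROM (VALUES (1), (3)) AS v(HowMany)
       CROSS APPLY Util.MyITVFunction(v.HowMany) AS f;
GO
```

From the caller's side there is no difference between the two types; the difference is entirely in how they are defined and, as shown further on, in how they are executed.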

So, how do these perform? For this comparison, let’s use the example that Microsoft supplies in Books Online (BOL) for the APPLY operator (//technet.microsoft.com/en-us/library/ms175156.aspx):

First, make and populate two tables: Employees and Departments:

--Create Employees table and insert values.
CREATE TABLE Employees
(
    empid   int         NOT NULL
    ,mgrid   int         NULL
    ,empname varchar(25) NOT NULL
    ,salary  money       NOT NULL
    ,CONSTRAINT PK_Employees PRIMARY KEY(empid)
);
GO
INSERT INTO Employees VALUES(1 , NULL, 'Nancy'   , $10000.00);
INSERT INTO Employees VALUES(2 , 1   , 'Andrew'  , $5000.00);
INSERT INTO Employees VALUES(3 , 1   , 'Janet'   , $5000.00);
INSERT INTO Employees VALUES(4 , 1   , 'Margaret', $5000.00);
INSERT INTO Employees VALUES(5 , 2   , 'Steven'  , $2500.00);
INSERT INTO Employees VALUES(6 , 2   , 'Michael' , $2500.00);
INSERT INTO Employees VALUES(7 , 3   , 'Robert'  , $2500.00);
INSERT INTO Employees VALUES(8 , 3   , 'Laura'   , $2500.00);
INSERT INTO Employees VALUES(9 , 3   , 'Ann'     , $2500.00);
INSERT INTO Employees VALUES(10, 4   , 'Ina'     , $2500.00);
INSERT INTO Employees VALUES(11, 7   , 'David'   , $2000.00);
INSERT INTO Employees VALUES(12, 7   , 'Ron'     , $2000.00);
INSERT INTO Employees VALUES(13, 7   , 'Dan'     , $2000.00);
INSERT INTO Employees VALUES(14, 11  , 'James'   , $1500.00);
GO
--Create Departments table and insert values.
CREATE TABLE Departments
(
    deptid    INT NOT NULL PRIMARY KEY
    ,deptname  VARCHAR(25) NOT NULL
    ,deptmgrid INT NULL REFERENCES Employees
);
GO
INSERT INTO Departments VALUES(1, 'HR',           2);
INSERT INTO Departments VALUES(2, 'Marketing',    7);
INSERT INTO Departments VALUES(3, 'Finance',      8);
INSERT INTO Departments VALUES(4, 'R&D',          9);
INSERT INTO Departments VALUES(5, 'Training',     4);
INSERT INTO Departments VALUES(6, 'Gardening', NULL);

In this example, most (but not all) of the departments in the Departments table have a manager ID that corresponds to an employee in the Employees table. The following table-valued function accepts an employee ID as an argument and returns that employee and all of his/her subordinates.

CREATE FUNCTION dbo.fn_getsubtree(@empid AS INT)
    RETURNS @TREE TABLE
(
    empid   INT NOT NULL
    ,empname VARCHAR(25) NOT NULL
    ,mgrid   INT NULL
    ,lvl     INT NOT NULL
)
AS
BEGIN
  WITH Employees_Subtree(empid, empname, mgrid, lvl)
  AS
  (
    -- Anchor Member (AM)
    SELECT empid, empname, mgrid, 0
    FROM Employees
    WHERE empid = @empid

    UNION all

    -- Recursive Member (RM)
    SELECT e.empid, e.empname, e.mgrid, es.lvl+1
    FROM Employees AS e
      JOIN Employees_Subtree AS es
        ON e.mgrid = es.empid
  )
  INSERT INTO @TREE
    SELECT * FROM Employees_Subtree;

  RETURN
END
GO

To return all of the subordinates in all levels for the manager of each department, use the following query:

SELECT D.deptid,
       D.deptname,
       D.deptmgrid,
       ST.empid,
       ST.empname,
       ST.mgrid
  FROM Departments AS D
       CROSS APPLY fn_getsubtree(D.deptmgrid) AS ST;

Which returns this result set:

deptid	deptname	deptmgrid	empid	empname	mgrid
1	HR	2	2	Andrew	1
1	HR	2	5	Steven	2
1	HR	2	6	Michael	2
2	Marketing	7	7	Robert	3
2	Marketing	7	11	David	7
2	Marketing	7	12	Ron	7
2	Marketing	7	13	Dan	7
2	Marketing	7	14	James	11
3	Finance	8	8	Laura	3
4	R&D	9	9	Ann	3
5	Training	4	4	Margaret	1
5	Training	4	10	Ina	4

This is an MTVF. Now, let’s convert it to an ITVF by removing the table variable declaration, the BEGIN/END block, and the INSERT statement, and moving the RETURN ahead of the query:

CREATE FUNCTION dbo.fn_getsubtreeITVF(@empid AS INT)
    RETURNS TABLE
AS
RETURN
  WITH Employees_Subtree(empid, empname, mgrid, lvl)
  AS
  (
    -- Anchor Member (AM)
    SELECT empid, empname, mgrid, 0
    FROM Employees
    WHERE empid = @empid

    UNION all

    -- Recursive Member (RM)
    SELECT e.empid, e.empname, e.mgrid, es.lvl+1
    FROM Employees AS e
      JOIN Employees_Subtree AS es
        ON e.mgrid = es.empid
  )
    SELECT * FROM Employees_Subtree;
GO

As you would expect, this function returns exactly the same result set. So, let’s look at how they perform.
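
One way to check that claim rather than eyeballing the two grids is to compare the result sets with EXCEPT in both directions. This is only a sketch and not part of the original test; note that EXCEPT compares distinct rows, so it would not catch a difference in duplicate counts.

```sql
-- Rows returned by the MTVF query but not by the ITVF query.
SELECT D.deptid, D.deptname, D.deptmgrid, ST.empid, ST.empname, ST.mgrid
  FROM Departments AS D
       CROSS APPLY dbo.fn_getsubtree(D.deptmgrid) AS ST
EXCEPT
SELECT D.deptid, D.deptname, D.deptmgrid, ST.empid, ST.empname, ST.mgrid
  FROM Departments AS D
       CROSS APPLY dbo.fn_getsubtreeITVF(D.deptmgrid) AS ST;

-- Rows returned by the ITVF query but not by the MTVF query.
SELECT D.deptid, D.deptname, D.deptmgrid, ST.empid, ST.empname, ST.mgrid
  FROM Departments AS D
       CROSS APPLY dbo.fn_getsubtreeITVF(D.deptmgrid) AS ST
EXCEPT
SELECT D.deptid, D.deptname, D.deptmgrid, ST.empid, ST.empname, ST.mgrid
  FROM Departments AS D
       CROSS APPLY dbo.fn_getsubtree(D.deptmgrid) AS ST;
```

If both statements come back empty, the two functions agree on this data.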

Let’s get the statistics of each by running SET STATISTICS IO, TIME ON before the two queries. We’ll also grab the actual execution plans, and capture the activity with Profiler. The following statistics are returned:

MTVF:

Table '#15502E78'. Scan count 6, logical reads 6...

Table 'Departments'. Scan count 1, logical reads 2...

SQL Server Execution Times:

CPU time = 15 ms,  elapsed time = 12 ms.

ITVF:

Table 'Worktable'. Scan count 7, logical reads 85...

Table 'Employees'. Scan count 1, logical reads 35...

Table 'Departments'. Scan count 1, logical reads 2...

SQL Server Execution Times:

CPU time = 0 ms,  elapsed time = 114 ms.

Note that for the MTVF, the Employees table doesn’t appear to have been touched. Instead, what we are seeing is some read activity on a table variable (the '#15502E78' table), and very low read activity at that. Also, notice that the CPU time for the MTVF is substantially greater than for the ITVF. (The total elapsed time is related to how long it takes to return the information to the client, so I disregard this value.) Since the function is being called 6 times, once per row in Departments, the table variable is built, populated, and then read from 6 times, hence the scan count of 6. But how many reads were being performed against the Employees table? We know that the function is accessing the Employees table, but we have no clue as to what the IO statistics are for that table. The statistics show only the reading of the data from the MTVF table variable, not the reads and inserts needed to populate it.

In looking at the statistics, it appears that the ITVF is doing a lot more work than the MTVF – but it is running considerably faster. (For this small set of data, both are extremely fast, but you can see that the ITVF is so fast that it can’t be registered at the millisecond level.)
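
To reproduce this kind of comparison, a harness along the following lines should work. It is a sketch that assumes both functions and both tables from above exist in the current database; the exact read counts and timings will vary by server and version.

```sql
-- Report IO and timing statistics in the Messages output.
SET STATISTICS IO, TIME ON;
GO

-- MTVF version of the query.
SELECT D.deptid, D.deptname, D.deptmgrid, ST.empid, ST.empname, ST.mgrid
  FROM Departments AS D
       CROSS APPLY dbo.fn_getsubtree(D.deptmgrid) AS ST;
GO

-- ITVF version of the query.
SELECT D.deptid, D.deptname, D.deptmgrid, ST.empid, ST.empname, ST.mgrid
  FROM Departments AS D
       CROSS APPLY dbo.fn_getsubtreeITVF(D.deptmgrid) AS ST;
GO

SET STATISTICS IO, TIME OFF;
GO
```

Running each query in its own batch keeps the two sets of statistics easy to tell apart in the output.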

Let’s compare this to what Profiler caught. The trace reports the same CPU times that the statistics caught. But here we can see the total read activity that the MTVF is doing, and that the ITVF is doing 2/3 the reads of the MTVF. Again, I’m ignoring the Duration column since it can be impacted by other activities going on across the network/client computer.

Finally, let’s look at the execution plans:

[Execution plan screenshots for the MTVF query and the ITVF query appeared here.]

In the MTVF plan, you see only an operation called “Table Valued Function”. Everything that it is doing is essentially a black box: something is happening, and data gets returned. For MTVFs, SQL Server can’t “see” what the MTVF is doing since it is being run in a separate context. What this means is that SQL Server has to run the MTVF as it is written, without being able to optimize it as part of the calling query’s plan.

In the ITVF, everything that it is doing is being shown… just like a view, its activity is being “inlined” into the query plan. Since SQL now can see everything that is going on (across the entire query), it can make optimizations in the query plan to be more efficient.

This example is just on a small handful of records in both tables, but we can already see a performance difference. When you expand these tables to tens of thousands of records, this difference is really magnified. I have seen performance improvements that take a query from running in tens of minutes to seconds, simply by converting an MTVF to an ITVF.
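
To try this at a larger scale, something like the following could pad the Employees table. It is only a sketch: the row count, the 'Emp' names, the flat salary, and the rule that each new employee reports to empid / 2 are all made up for illustration and are not the data behind the numbers quoted above.

```sql
-- Generate empid values 15 through 50000 and attach each one to an
-- existing employee, so the recursive CTE in the functions has a real
-- hierarchy to walk. Integer division keeps the tree shallow
-- (well under the default recursion limit of 100 levels).
WITH Numbers AS
(
  SELECT TOP (50000) N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns AS a
         CROSS JOIN sys.all_columns AS b
)
INSERT INTO Employees (empid, mgrid, empname, salary)
SELECT N,
       N / 2,                              -- hypothetical manager assignment
       'Emp' + CAST(N AS VARCHAR(10)),     -- hypothetical name
       $1000.00                            -- hypothetical salary
  FROM Numbers
 WHERE N >= 15;                            -- empid 1-14 already exist
GO
```

After loading the extra rows, rerun both versions of the CROSS APPLY query with statistics turned on and compare.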

Another thing to note in these execution plans: look at the query cost (relative to the batch) percentages. They show that the MTVF is 20%, and the ITVF is 80%, even though the MTVF used more CPU and did more reads. So, that portion of the execution plan is also misleading.

By the way… scalar functions work essentially the same as MTVFs. You can get a similar substantial performance boost by converting these to ITVFs, or possibly by just JOINing to the table.
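
As a rough illustration of that conversion, here is a trivial scalar function next to an ITVF that does the same work. Both function names and the "first initial" logic are hypothetical, invented for this sketch; the Employees table is the one created earlier.

```sql
-- Hypothetical scalar function: returns the first character of a name.
CREATE FUNCTION dbo.fn_FirstInitial_Scalar (@name VARCHAR(25))
RETURNS CHAR(1)
AS
BEGIN
  RETURN LEFT(@name, 1);
END
GO

-- The same logic as an inline table-valued function:
-- a single SELECT that returns one row with one column.
CREATE FUNCTION dbo.fn_FirstInitial_ITVF (@name VARCHAR(25))
RETURNS TABLE
AS
RETURN
  SELECT FirstInitial = LEFT(@name, 1);
GO

-- Scalar version: called in the SELECT list.
SELECT e.empname,
       dbo.fn_FirstInitial_Scalar(e.empname) AS FirstInitial
  FROM Employees AS e;

-- ITVF version: called with CROSS APPLY, then referenced like a column.
SELECT e.empname,
       f.FirstInitial
  FROM Employees AS e
       CROSS APPLY dbo.fn_FirstInitial_ITVF(e.empname) AS f;
GO
```

The calling query changes shape slightly, since the value now comes from an APPLY instead of the SELECT list, but the result is the same.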

So, in closing, use those ITVFs. And try to keep away from MTVFs and scalar functions. Those functions are necessary sometimes, but use the ITVFs if possible… your SQL Server will thank you.

PS – another closing note. When you have MTVFs, remember that both the IO statistics and the execution plan total cost percentages are misleading.

#1 | Written by Tomas about 14 years ago. Reply

Hi Wayne,
great article, thanks for it.
I would like to ask you about scalar functions.
Could you please explain their performance a little?
I often use them to get a substring from a table column value, i.e. I do not call SELECT inside. Is there a better approach to transform values returned by SELECT statements?

- #2 | Written by Wayne Sheffield about 14 years ago. Reply
  
  Hi Tomas,
  You will get better performance by converting your scalar functions into inline table-valued functions. Paul White has an excellent blog post about it at //sqlblog.com/blogs/paul_white/archive/2012/09/05/compute-scalars-expressions-and-execution-plan-performance.aspx.
  
#3 | Written by Lina Sengupta about 12 years ago. Reply

Great article. Thanks

#4 | Written by Ralph about 9 years ago. Reply

Will there be an update to this excellent post based on SQL Server 2017? //www.youtube.com/watch?time_continue=2&v=szTmo6rTUjM I am curious to see if this new edition really fixed things as they proclaim in the video.

- #5 | Written by Wayne Sheffield about 9 years ago. Reply
  
  Hi Ralph,
  The changes in SQL 2017 don’t really improve this test (or the one in the Comparing Inline / MTVFs post at //blog.waynesheffield.com/wayne/archive/2012/02/comparing-inline-and-multistatement-table-valued-functions/)
  
  I do have a demo that shows that this does help, when the improved cardinality estimates improve the query. It looks to me like this change will mostly affect queries that have other joins.
  
#6 | Written by Cal about 8 years ago. Reply

>So, in closing, use those ITVFs. And try to keep away from MTVFs and scalar functions.

Are you suggesting we should use ITVF and just avoid MTVFs?

>Those functions are necessary sometimes, but use the ITVFs if possible

Now I am confused because above you said to avoid them but now you are saying they are necessary sometimes. How would I know whether it is necessary and I just cannot do without it?

- #7 | Written by Wayne Sheffield about 8 years ago. Reply
  
  Many times, you can use an ITVF instead of an MSTVF. When possible, do so. I’ve seen way too many MSTVFs that could be easily rewritten as an ITVF. When rewritten, performance usually improves dramatically.
  There are some MSTVFs that just cannot be easily rewritten into an ITVF. You might have to live with those. But you should try.
  