Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
Table valued functions vs scalar functions
Message
From
25/03/2014 22:28:11
 
 
To
All
General information
Forum:
Microsoft SQL Server
Category:
SQL syntax
Title:
Table valued functions vs scalar functions
Environment versions
SQL Server:
SQL Server 2005
Application:
Web
Miscellaneous
Thread ID:
01597365
Message ID:
01597365
Views:
84
Well, folks, this will be my last post on the UT. Yes, I'm "retiring for good" from the UT. If you ever want to get hold of me, my email address is kgoff@kevinsgoff.net

I'm scheduled to speak at the Philly VFP User Group in May - topic is T-SQL for Application Developers. Feel free to browse my site (www.KevinSGoff.net) for any new content, webcast links, etc.


Final topic....as this came up as a question.....scalar UDFs vs Table Valued UDFs in SQL Server. As a general statement, single line TVFs will usually perform better than scalar functions. Here's a basic example....

In AdventureWorks, there's an employee pay history table, that shows pay rates and effective dates (for when the payment went into effect)

Suppose I want to know, at any one time, what a person's pay rate was on a specific day.

I might have a function that does this, using the subquery pattern to grab the most recent pay rate row for the employee, "on or before" the as of date.
create function dbo.UDFScalar_GetRateForAsOfDate
(  @AsOfDate Date, @EmployeeID int)
returns Money
as
begin
declare @Rate Money
set @Rate = (select Rate from HumanResources.EmployeePayHistory 
                                         where BusinessEntityID = @EmployeeID
		 and RateChangeDate = (select max(RatechangeDate) from HumanResources.EmployeePayHistory
		                                   where BusinessEntityID = @EmployeeID
				   and RatechangeDate <= @AsOfDate)
															   )
return  @Rate
end
go
And then I'll consume it in this query, to grab it for everyone for the as of date of 1-1-2005
select BusinessEntityID, LoginID, JobTitle, 
                dbo.UDFScalar_GetRateForAsOfDate('1-1-2005', BusinessEntityID) as RateForDate from HumanResources.Employee
                       where dbo.UDFScalar_GetRateForAsOfDate('1-1-2005', BusinessEntityID) is not null
OK, that works. And it runs in less than a second...but we're not talking about much data. And there is a CPU cost in dealing with the scalar function.

As it turns out, there's a technique that in many cases (maybe not all, but many) will run a bit faster.....and gives us the flexibility of returning multiple columns.

We can write a table function instead, to bring back multiple columns (not just the rate for the effective date, but the actual most recent rate change date on or before the effective date). Now, SQL Server doesn't have a "row-based" function that returns a single row, but we can bring back the equivalent of a table variable.
create function dbo.UDFTableValued_GetRateForAsOfDate
(  @AsOfDate Date, @EmployeeID int)
returns Table
as
return 
select Rate, RateChangeDate from  HumanResources.EmployeePayHistory 
                                      where BusinessEntityID = @EmployeeID
		 and RateChangeDate = (select max(RatechangeDate) from HumanResources.EmployeePayHistory
		                                   where BusinessEntityID = @EmployeeID
  			            and RatechangeDate <= @AsOfDate)
													
end
go
Now, we consume a table valued function from the outside a bit differently. We can't join the Employee master to it....but we can use a CROSS APPLY.
select BusinessEntityID, LoginID, JobTitle, TVF.Rate, TVF.RatechangeDate  from HumanResources.Employee
   CROSS APPLY dbo.UDFTableValued_GetRateForAsOfDate ('1-1-2005', BusinessEntityID)  TVF
We don't need to filter out people with NULL rates, as the CROSS APPLY will eliminate them. (If we truly wanted to see all employees, even those without a rate for the date, we'd use OUTER APPLY instead)

This second approach, with the table function, runs in about 82 milliseconds and has no CPU cost. The first approach, using the scalar function, runs in about 109 ms and has a small CPU cost. So the table function approach is a bit faster, and you get the benefit of being able to bring back more than one column.

In more advanced "real life" applications, the differences in performance can be astounding. I've seen scalar functions that take minutes to run (when consumed over a large # of rows) whereas table functions take far less. As always, mileage can vary, so it's good to test out both to see.

I'm sure others can/will make suggestions on this example, it's simply offered as a starter example. Some people might be able to refactor the subquery and try out a SELECT TOP 1...ORDER BY instead of MAX. Sure, that's great, give it a try.

But generally, in SQL Server, be careful about using scalar UDFs over a large # of rows.

Adios...
Reply
Map
View

Click here to load this message in the networking platform