Written by 10:35 Database administration, Database Optimization & Structure, SQL Server, Statements

User-Defined Functions in SQL Server

CodingSight - SQL Server User-defined Functions

User-defined functions in SQL Server (UDFs) are key objects that each developer should be aware of. Although they are very useful in many scenarios (WHERE clauses, computed columns, and check constraints), they still have some limitations and bad practices that can cause performance issues. Multi-statement UDFs might incur significant performance impacts, and this article will specifically discuss these scenarios.

The functions are not implemented in the same way as in object-oriented languages, although inline table-valued functions can be used in scenarios when you need parameterized views, this does not apply to the functions that return scalars or tables. These functions need to be carefully used since they can cause a lot of performance problems. However, they are essential in many cases, so we will need to pay more attention to their implementations. Functions are used in the SQL statements inside batches, procedures, triggers or views, inside ad-hoc SQL queries, or as a part of reporting queries generated by tools like PowerBI or Tableau, in calculated fields, and check constraints. While scalar functions can be recursive up to 32 levels, table functions don’t support recursion.

Types of User-Defined Functions in SQL Server

In SQL Server, we have three function types: user-defined scalar functions (SFs) that return a single scalar value, user-defined table-valued functions (TVFs) that return a table, and inline table-valued functions (ITVFs) that have no function body. Table Functions can be Inline or Multi-statement. Inline functions do not have return variables, they just return value functions. Multi-statement functions are contained in BEGIN-END blocks of code and can have multiple T-SQL statements that do not create any side effects (such as modifying content in a table).

We will show each type of function in a simple example:

/**
inline table function
**/
CREATE FUNCTION dbo.fnInline( @P1 INT, @P2 VARCHAR(50) )
RETURNS TABLE
AS
RETURN ( SELECT @P1 AS P_OUT_1, @P2 AS P_OUT_2 )





/**
multi-statement table function
**/
CREATE FUNCTION dbo.fnMultiTable(  @P1 INT, @P2 VARCHAR(50)  )
RETURNS @r_table TABLE ( OUT_1 INT, OUT_2 VARCHAR(50) )
AS
  BEGIN
    INSERT @r_table SELECT @P1, @P2;
    RETURN;
  END;

/**
scalar function
**/
CREATE FUNCTION dbo.fnScalar(  @P1 INT, @P2 INT  )
RETURNS INT
AS
BEGIN
    RETURN @P1 + @P2
END

User-Defined Functions Limitations

As mentioned in the introduction, there are some limitations in function usage, and I will explore just a few below. A complete list can be found at Microsoft Docs:

  • There is no concept of temporary functions
  • You cannot create a function in another database, but, depending on your privileges, you can access it
  • With UDFs, you are not allowed to perform any actions that change the database state,
  • Inside UDF, you cannot call a procedure, except the extended stored procedure
  • UDF cannot return a result set, but only a table data type
  • You cannot use dynamic SQL or temporary tables in UDFs
  • UDFs are limited in error handling capabilities – they don’t support RAISERROR nor TRY…CATCH and you cannot get data from the system @ERROR variable

What Is Allowed in Multi-Statement Functions?

Only the following things are allowed:

  • Assignment statements
  • All flow control statements, except the TRY…CATCH block
  • DECLARE calls, used to create local variables and cursors
  • You can use SELECT queries that have lists with expressions and assign these values to locally declared variables
  • Cursors can reference only local tables and have to be opened and closed inside the function body. FETCH can only assign or change values of local variables, not retrieve or change database data

What Should Be Avoided in Multi-Statement Functions, Although Allowed?

  • You should avoid scenarios where you are using computed columns with scalar functions – this will cause index rebuilds and slow updates that require recalculations
  • Consider that any multi-statement function has its execution plan and performance impact
  • Multi-statement table-valued UDF, if used in SQL expression or join statement will be slow because of the non-optimal execution plan
  • Don’t use scalar functions in WHERE statements and ON clauses unless you are sure it will query a small dataset, and that dataset will remain small in the future

Names and Parameters of User-Defined Functions

Like any other object name, function names have to comply with rules for identifiers and must be unique within their schema. If you are making scalar functions, you can run them by using the EXECUTE statement. In this case, you don’t have to put the schema name in the function name. See the example of the EXECUTE function call below (we create a function that returns the occurrence of Nth day in a month and then retrieves this data):

CREATE FUNCTION dbo.fnGetDayofWeekInMonth 
(
  @YearInput          VARCHAR(50),
  @MonthInput       VARCHAR(50), -- English months ( 'Jan', 'Feb', ... )
  @WeekDayInput VARCHAR(50)='Mon', -- Mon, Tue, Wed, Thu, Fri, Sat, Sun
  @CountN INT=1 -- 1 for the first date, 2 for the second occurrence, 3 for the third
 ) 
  RETURNS DATETIME  
  AS
  BEGIN
  RETURN DATEADD(MONTH, DATEDIFF(MONTH, 0, 
          CONVERT(DATE,'1 '+@MonthInput+' '+@YearInput,113)), 0)+ (7*@CountN)-1
          -
          (DATEPART (WEEKDAY, DATEADD(MONTH, DATEDIFF(MONTH, 0, 
                         CONVERT(DATE,'1 '+@MonthInput+' '+@YearInput,113)), 0))
          +@@DateFirst+(CHARINDEX(@WeekDayInput,'FriThuWedTueMonSunSat')-1)/3)%7
  END        


-- In SQL Server 2012 and later versions, you can use the EXECUTE command or the SELECT command to run a UDF, or use a standard approach
DECLARE @ret DateTime
EXEC @ret = fnGetDayofWeekInMonth '2020', 'Jan', 'Mon',2
SELECT @ret AS Third_Monday_In_January_2020

 SELECT dbo.fnGetDayofWeekInMonth('2020', 'Jan', DEFAULT, DEFAULT) 
               AS 'Using default',
               dbo.fnGetDayofWeekInMonth('2020', 'Jan', 'Mon', 2) AS 'explicit'

We can define defaults for function parameters, they have to be prefixed with “@” and compliant with identifier naming rules. Parameters can be only constant values, they cannot be used in SQL queries instead of tables, views, columns, or other database objects, and values cannot be expressions, even deterministic ones. All data types are allowed, except for the TIMESTAMP data type, and no non-scalar data types can be used, except for table-valued parameters. In “standard” function calls, you have to specify the DEFAULT attribute if you wish to give the end-user the ability to make a parameter optional. In new versions, using the EXECUTE syntax, this is no longer required, you just don’t enter this parameter in the function call. If we are using custom table types, they have to be marked as READONLY, meaning we cannot change the initial value inside the function, but they can be used in calculations and definitions of other parameters.

User-Defined Functions Performance

The last topic we’ll cover in this article, using functions from the previous chapter, is function performance. We will extend this function and monitor execution times and the quality of execution plans. We start by creating other function versions, and continue with their comparison:

CREATE FUNCTION dbo.fnGetDayofWeekInMonthBound 
(
  @YearInput    VARCHAR(50),
  @MonthInput   VARCHAR(50), -- English months ( 'Jan', 'Feb', ... )
  @WeekDayInput VARCHAR(50)='Mon', -- Mon, Tue, Wed, Thu, Fri, Sat, Sun
  @CountN INT=1 -- 1 for the first date, 2 for the second occurrence, 3 for the third
  ) 
  RETURNS DATETIME
  WITH SCHEMABINDING
  AS
  BEGIN
  RETURN DATEADD(MONTH, DATEDIFF(MONTH, 0, CONVERT(DATE,'1 '+@MonthInput+' '+@YearInput,113)), 0)+ (7*@CountN)-1
          -(DATEPART (WEEKDAY, DATEADD(MONTH, DATEDIFF(MONTH, 0, CONVERT(DATE,'1 '+@MonthInput+' '+@YearInput,113)), 0))
          +@@DateFirst+(CHARINDEX(@WeekDayInput,'FriThuWedTueMonSunSat')-1)/3)%7
  END        
GO

CREATE FUNCTION dbo.fnNthDayOfWeekOfMonthInline (
  @YearInput    VARCHAR(50),
  @MonthInput   VARCHAR(50), -- English months ( 'Jan', 'Feb', ... )
  @WeekDayInput VARCHAR(50)='Mon', -- Mon, Tue, Wed, Thu, Fri, Sat, Sun
  @CountN INT=1 -- 1 for the first date, 2 for the second occurence, 3 for the third
  ) 
  RETURNS TABLE
  WITH SCHEMABINDING
  AS
  RETURN (SELECT DATEADD(MONTH, DATEDIFF(MONTH, 0, CONVERT(DATE,'1 '+@MonthInput+' '+@YearInput,113)), 0)+ (7*@CountN)-1
          -(DATEPART (WEEKDAY, DATEADD(MONTH, DATEDIFF(MONTH, 0, CONVERT(DATE,'1 '+@MonthInput+' '+@YearInput,113)), 0))
          +@@DateFirst+(CHARINDEX(@WeekDayInput,'FriThuWedTueMonSunSat')-1)/3)%7 AS TheDate)
GO

CREATE FUNCTION dbo.fnNthDayOfWeekOfMonthTVF (
  @YearInput    VARCHAR(50),
  @MonthInput   VARCHAR(50), -- English months ( 'Jan', 'Feb', ... )
  @WeekDayInput VARCHAR(50)='Mon', -- Mon, Tue, Wed, Thu, Fri, Sat, Sun
  @CountN INT=1 -- 1 for the first date, 2 for the second occurence, 3 for the third
  ) 
  RETURNS @When TABLE (TheDate DATETIME)
  WITH schemabinding
  AS
  Begin
  INSERT INTO @When(TheDate) 
    SELECT DATEADD(MONTH, DATEDIFF(MONTH, 0, CONVERT(DATE,'1 '+@MonthInput+' '+@YearInput,113)), 0)+ (7*@CountN)-1
          -(DATEPART (WEEKDAY, DATEADD(MONTH, DATEDIFF(MONTH, 0, CONVERT(DATE,'1 '+@MonthInput+' '+@YearInput,113)), 0))
          +@@DateFirst+(CHARINDEX(@WeekDayInput,'FriThuWedTueMonSunSat')-1)/3)%7
  RETURN
  end   
  GO

Create some test calls and test cases

We start with table versions:

SELECT * FROM dbo.fnNthDayOfWeekOfMonthTVF('2020','Feb','Tue',2)

SELECT TheYear, CONVERT(NCHAR(11),(SELECT TheDate FROM    dbo.fnNthDayOfWeekOfMonthTVF(TheYear,'Feb','Tue',2)),113) FROM (VALUES ('2014'),('2015'),('2016'),('2017'),('2018'),('2019'),('2020'),('2021'))years(TheYear)
 
SELECT TheYear, CONVERT(NCHAR(11),TheDate,113)  FROM (VALUES ('2014'),('2015'),('2016'),('2017'),('2018'),('2019'),('2020'),('2021'))years(TheYear)
  OUTER apply dbo.fnNthDayOfWeekOfMonthTVF(TheYear,'Feb','Tue',2)

Creating test data:

IF EXISTS(SELECT * FROM tempdb.sys.tables WHERE name LIKE '#DataForTest%')
  DROP TABLE #DataForTest
GO
SELECT * 
INTO #DataForTest
 FROM (VALUES ('2014'),('2015'),('2016'),('2017'),('2018'),('2019'),('2020'),('2021'))years(TheYear)
  CROSS join (VALUES ('jan'),('feb'),('mar'),('apr'),('may'),('jun'),('jul'),('aug'),('sep'),('oct'),('nov'),('dec'))months(Themonth)
  CROSS join (VALUES ('Mon'),('Tue'),('Wed'),('Thu'),('Fri'),('Sat'),('Sun'))day(TheDay)
  CROSS join (VALUES (1),(2),(3),(4))nth(nth)

Test performance:

DECLARE @TableLog TABLE (OrderVal INT IDENTITY(1,1), Reason VARCHAR(500), TimeOfEvent DATETIME2 DEFAULT GETDATE())

Start of timing:

INSERT INTO @TableLog(Reason) SELECT 'Starting My_Section_of_code' --place at the start

First, we use no type of function to get a baseline:

SELECT DATEADD(MONTH, DATEDIFF(MONTH, 0, CONVERT(DATE,'1 '+TheMonth+' '+TheYear,113)), 0)+ (7*Nth)-1
          -(DATEPART (WEEKDAY, DATEADD(MONTH, DATEDIFF(MONTH, 0, CONVERT(DATE,'1 '+TheMonth+' '+TheYear,113)), 0))
		  +@@DateFirst+(CHARINDEX(TheDay,'FriThuWedTueMonSunSat')-1)/3)%7 AS TheDate
  INTO #Test0
  FROM #DataForTest
INSERT INTO @TableLog(Reason) SELECT 'Using the code entirely unwrapped';

We now use an inline Table-valued function cross-applied:

SELECT TheYear, CONVERT(NCHAR(11),TheDate,113) AS itsdate
 INTO #Test1
  FROM #DataForTest
    CROSS APPLY dbo.fnNthDayOfWeekOfMonthTVF(TheYear,TheMonth,TheDay,nth)
INSERT INTO @TableLog(Reason) SELECT 'Inline function cross apply'

We use an inline Table-valued function cross-applied:

SELECT TheYear, CONVERT(NCHAR(11),(SELECT TheDate FROM dbo.fnNthDayOfWeekOfMonthTVF(TheYear,TheMonth,TheDay,nth)),113) AS itsDate
  INTO #Test2
  FROM #DataForTest
 INSERT INTO @TableLog(Reason) SELECT 'Inline function Derived table'

To compare untrusted, we use a scalar function with schemabinding:

SELECT TheYear, CONVERT(NCHAR(11), dbo.fnGetDayofWeekInMonthBound(TheYear,TheMonth,TheDay,nth))itsdate
  INTO #Test3
  FROM #DataForTest
INSERT INTO @TableLog(Reason) SELECT 'Trusted (Schemabound) scalar function'
 

Next, we use a scalar function without schema binding:

SELECT TheYear, CONVERT(NCHAR(11), dbo.fnGetDayofWeekInMonth(TheYear,TheMonth,TheDay,nth))itsdate
  INTO #Test6
  FROM #DataForTest
INSERT INTO @TableLog(Reason) SELECT 'Untrusted scalar function'

Then, the multi-statement table function derived:

SELECT TheYear, CONVERT(NCHAR(11),(SELECT TheDate FROM dbo.fnNthDayOfWeekOfMonthTVF(TheYear,TheMonth,TheDay,nth)),113) AS itsdate
  INTO #Test4
  FROM #DataForTest 
INSERT INTO @TableLog(Reason) SELECT 'multi-statement table function derived'

Finally, the multi-statement table cross-applied:

SELECT TheYear, CONVERT(NCHAR(11),TheDate,113) AS itsdate
  INTO #Test5
  FROM #DataForTest
    CROSS APPLY dbo.fnNthDayOfWeekOfMonthTVF(TheYear,TheMonth,TheDay,nth)
INSERT INTO @TableLog(Reason) SELECT 'multi-statement cross APPLY'--where the routine you want to time ends

List out all the timings:

SELECT ending.Reason AS Test, DateDiff(ms, starting.TimeOfEvent,ending.TimeOfEvent) [AS Time (ms)] FROM @TableLog starting
INNER JOIN @TableLog ending ON ending.OrderVal=starting.OrderVal+1

 
DROP table #Test0
DROP table #Test1
DROP table #Test2
DROP table #Test3
DROP table #Test4
DROP table #Test5
DROP table #Test6
DROP TABLE #DataForTest
using the code entirely unwrapped

The above table clearly shows that you should consider performance vs functionality when you are using user-defined functions.

Conclusion

Functions are liked by many developers, mostly because they are “logical constructs”. You can easily create test cases, they are deterministic and encapsulating, they integrate nicely with the SQL code flow and allow flexibility in parameterization. They are a good choice when you need to implement complex logic that needs to be done on a smaller or already filtered dataset that you will need to re-use in multiple scenarios. Inline table views can be used in views that need parameters, especially from upper layers (client-facing applications). On the other hand, scalar functions are great to work with XML or other hierarchical formats, since they can be called recursively.

User-defined multi-statement functions are a great addition to your development tool stack, but you have to understand how they work and what their limitations and performance challenges are. Their wrong use can destroy the performance of any database but if you know how to use these functions, they can bring a lot of benefits to code reuse and encapsulation.

Tags: , Last modified: March 31, 2023
Close