This article is about T-SQL (Transact-SQL) Window functions and their basic use in day-to-day data analysis tasks.

There are many alternatives to T-SQL when it comes to data analysis. However, when improvements over time and introduction of Window functions are considered, T-SQL is capable of performing data analysis on a basic level and, in some cases, even beyond that.

## About SQL Window Functions

First, let’s first get familiar with SQL Window functions in the context of the Microsoft documentation.

### Microsoft Definition

A window function computes a value for each row in the window.

### Simple Definition

A window function helps us to focus on a particular portion (window) of the result set so that we can perform data analysis on that specific part (window) only, rather than on the entire result set.

In other words, SQL window functions turn a result set into several smaller sets for data analysis purposes.

### What is a result set

Simply put, a result set consists of all records retrieved by running a SQL query.

For example, we can create a table named **Product** and insert the following data into it:

-- (1) Create the Product table CREATE TABLE [dbo].[Product] ( [ProductId] INT NOT NULL PRIMARY KEY, [Name] VARCHAR(40) NOT NULL, [Region] VARCHAR(40) NOT NULL ) -- (2) Populate the Product table INSERT INTO Product (ProductId,Name,Region) VALUES (1,'Laptop','UK'),(2,'PC','UAE'),(3,'iPad','UK')

Now, the result set retrieved by using the script below will contain all rows from the **Product** table:

-- (3) Result set SELECT [ProductId], [Name],[Region] FROM Product

### What is a Window

It is important to first understand the concept of a window as it relates to SQL window functions. In this context, a window is just a way of narrowing down your scope by targeting a specific part of the result set (as we already mentioned above).

You may be wondering now – what does ‘targeting a specific part of the result set’ actually means?

Returning to the example we looked at, we can create a SQL window based on the product region by dividing the result set into two windows.

### Understanding Row_Number()

To proceed, we will need to use the **Row_Number() function **which temporarily gives a sequence number to the output rows.

For example, if we want to add a row numbers to the result set based on **ProductID,** we’ll need to use **ROW_NUMBER()** to order it by Product ID as follows:

--Using the row_number() function to order the result set by ProductID SELECT ProductID,ROW_NUMBER() OVER (ORDER BY ProductID) AS SrNo,Name,Region FROM Product

Now, if we want the **Row_Number() function **to order the result set by **ProductID **descending, then the sequence of output rows based on **ProductID** will change as follows:

--Using the row_number() function to order the result set by ProductID descending SELECT ProductID,ROW_NUMBER() OVER (ORDER BY ProductID DESC) AS SrNo,Name,Region FROM Product

There are no SQL windows yet since the only thing we’ve done is ordering the set by specific criteria. As discussed earlier, windowing means breaking the result set into several smaller sets to analyze each one of them separately.

### Creating a Window with Row_Number()

To create a SQL window in our result set, we will need to to partition it based on any of the columns it contains.

We can now partition the result set by region as follows:

--Creating a SQL window based on Region SELECT ROW_NUMBER() OVER (Partition by region ORDER BY Region) as Region_Serial_Number , Name, Region FROM dbo.Product

### Select – Over Clause

In other words, **Select** with the **Over** clause paves the way for SQL window functions by partitioning a result set into smaller windows.

According to the Microsoft documentation, **Select** with the **Over **clause defines a window which can then be used by any window function.

Now, let’s create a table called **KitchenProduct** as follows:

CREATE TABLE [dbo].[KitchenProduct] ( [KitchenProductId] INT NOT NULL PRIMARY KEY IDENTITY(1,1), [Name] VARCHAR(40) NOT NULL, [Country] VARCHAR(40) NOT NULL, [Quantity] INT NOT NULL, [Price] DECIMAL(10,2) NOT NULL ); GO INSERT INTO dbo.KitchenProduct (Name, Country, Quantity, Price) VALUES ('Kettle','Germany',10,15.00) ,('Kettle','UK',20,12.00) ,('Toaster', 'France',10,10.00) ,('Toaster','UAE',10,12.00) ,('Kitchen Clock','UK',50,20.00) ,('Kitchen Clock','UAE',35,15.00)

Now, let’s view the table:

SELECT [KitchenProductId], [Name], [Country], [Quantity], [Price] FROM dbo.KitchenProduct

If you want to see each product with its own serial number rather than a number based on the generalized product ID, then you would have to use a SQL window function to partition the result set by product as follows:

-- Viewing each product in its own series SELECT ROW_NUMBER() OVER (Partition by Name order by Name) Product_SrNo,Name,Country,Quantity FROM dbo.KitchenProduct

### Compatibility (Select – Over Clause)

**According to Microsoft documentation**, Select – Over Clause is compatible with the following SQL database versions:

- SQL Server 2008 and upward
- Azure SQL Database
- Azure SQL Data Warehouse
- Parallel Data Warehouse

### Syntax

**SELECT – OVER (Partition by <column> Order by <column>)**

*Please note that I have simplified the syntax to make i**t** easy to understand; please refer to** **the** **Microsoft documentation to see the **full** syntax.*

## Pre-requisites

This article is basically written for beginners, but there are still some pre-requisites which must be kept in mind.

### Familiarity with T-SQL

This article assumes that the readers have a basic knowledge of T-SQL and are capable of writing and running basic SQL scripts.

### Setup the Sales sample table

This article requires the following sample table so that we can run our SQL window function examples:

-- (1) Create the Sales sample table CREATE TABLE [dbo].[Sales] ( [SalesId] INT NOT NULL IDENTITY(1,1), [Product] VARCHAR(40) NOT NULL, [Date] DATETIME2, [Revenue] DECIMAL(10,2), CONSTRAINT [PK_Sales] PRIMARY KEY ([SalesId]) ); GO -- (2) Populating the Sales sample table SET IDENTITY_INSERT [dbo].[Sales] ON INSERT INTO [dbo].[Sales] ([SalesId], [Product], [Date], [Revenue]) VALUES (1, N'Laptop', N'2017-01-01 00:00:00', CAST(200.00 AS Decimal(10, 2))) INSERT INTO [dbo].[Sales] ([SalesId], [Product], [Date], [Revenue]) VALUES (2, N'PC', N'2017-01-01 00:00:00', CAST(100.00 AS Decimal(10, 2))) INSERT INTO [dbo].[Sales] ([SalesId], [Product], [Date], [Revenue]) VALUES (3, N'Mobile Phone', N'2018-01-01 00:00:00', CAST(300.00 AS Decimal(10, 2))) INSERT INTO [dbo].[Sales] ([SalesId], [Product], [Date], [Revenue]) VALUES (4, N'Accessories', N'2018-01-01 00:00:00', CAST(150.00 AS Decimal(10, 2))) INSERT INTO [dbo].[Sales] ([SalesId], [Product], [Date], [Revenue]) VALUES (5, N'iPad', N'2019-01-01 00:00:00', CAST(300.00 AS Decimal(10, 2))) INSERT INTO [dbo].[Sales] ([SalesId], [Product], [Date], [Revenue]) VALUES (6, N'PC', N'2019-01-01 00:00:00', CAST(200.00 AS Decimal(10, 2))) INSERT INTO [dbo].[Sales] ([SalesId], [Product], [Date], [Revenue]) VALUES (7, N'Laptop', N'2019-01-01 00:00:00', CAST(300.00 AS Decimal(10, 2))) SET IDENTITY_INSERT [dbo].[Sales] OFF

View all sales by running the following script:

-- View sales SELECT [SalesId],[Product],[Date],[Revenue] FROM dbo.Sales

## Group By vs SQL Window Functions

One can wonder – what is the difference between using the Group By clause and SQL window functions?

Well, the answer lies in the examples below.

### Group By Example

In order to see total sales by product, we can use Group By as follows:

-- Total sales by product using Group By SELECT Product ,SUM(REVENUE) AS Total_Sales FROM dbo.Sales GROUP BY Product ORDER BY Product

So, the Group By clause helps us see the total sales. The total sales value is the sum of revenue for all similar products in the same row with no Group By clause used. What if we are interested in seeing the revenue (sale) of each individual product along with total sales?

This is where SQL window functions come into action.

### SQL Window Function Example

In order to see the product, revenue and total revenue by all similar products, we have to partition the data on a by-product basis using **OVER()** as follows:

-- Total sales by product using an SQL window function SELECT Product ,REVENUE ,SUM(REVENUE) OVER (PARTITION BY PRODUCT) AS Total_Sales FROM dbo.Sales

The output should be as follows:

So, we can now easily see the sales for each individual product along with total sales for that product. For example, the revenue for *PC* is 100.00 but total sales (sum of revenue for the *PC *product) is 300.00 because two different PC models were being sold.

### Basic Analysis with the Aggregate Functions

Aggregate functions return a single value after performing calculations on a set of data.

In this section, we are going to further explore SQL window functions – specifically, by using them along with aggregate functions to perform basic data analysis.

### Common Aggregate Functions

The most common aggregate functions are:

- Sum
- Count
- Min
- Max
- Avg (Average)

#### Aggregate Data Analysis by Product

In order to analyse the result set on a by-product basis with the help of aggregate functions, we simply have to use an aggregate function with a by-product partition inside of the OVER() statement:

-- Data analysis by product using aggregate functions SELECT Product,Revenue ,SUM(REVENUE) OVER (PARTITION BY PRODUCT) as Total_Sales ,MIN(REVENUE) OVER (PARTITION BY PRODUCT) as Minimum_Sales ,MAX(REVENUE) OVER (PARTITION BY PRODUCT) as Maximum_Sales ,AVG(REVENUE) OVER (PARTITION BY PRODUCT) as Average_Sales FROM dbo.Sales

If you take a closer look at the **PC **or** Laptop **products**, **you will see how aggregate functions are working together alongside the SQL window function.

In the example above, we can see that the Revenue value for **PC** is 100.00 first time and 200.00 next time, but Total Sales amount to 300.00. The similar information can be seen for the rest of the aggregate functions.

#### Aggregate Data Analysis by Date

Now, let’s perform some data analysis of the products on a by-date basis using SQL window functions in combination with aggregate functions.

This time, we are going to partition the result set by date rather than by product as follows:

-- Data analysis by date using aggregate functions SELECT Product,date,Revenue ,SUM(REVENUE) OVER (PARTITION BY DATE) as Total_Sales ,MIN(REVENUE) OVER (PARTITION BY DATE) as Minimum_Sales ,MAX(REVENUE) OVER (PARTITION BY DATE) as Maximum_Sales ,AVG(REVENUE) OVER (PARTITION BY DATE) as Average_Sales FROM dbo.Sales

With this, we have learned basic data analysis techniques using the SQL window functions approach.

## Things to do

Now that you are familiar with SQL window functions, please try the following:

- Keeping in mind the examples we looked at, perform basic data analysis using SQL window functions on the sample database mentioned in this
__article.__ - Adding a Customer column to the Sales sample table and see how rich your data analysis can become when another column (customer) is added to it.
- Adding a Region column to the Sales sample table and perform basic data analysis using aggregate functions by region.