Understanding Pivot Operator in SQL

The pivot operator in SQL Server converts each row in the aggregated result set into corresponding columns in the output set. The pivot operator is particularly useful in writing cross-tabulation queries.

Let’s take a look at how it works in practice.

Preparing the Data

First, let’s create some dummy data which we can then use to implement the pivot operator.

How Does The Pivot Operator Work?

The standard way of grouping SQL data is by using the Group By clause. Let’s create a query that calculates the average of the values in the total_score column of the student table, grouped by city.

This gives the following result:

cityAvg_Score
Leeds500
Liverpool481
London547
Manchester522

What if we want a result set where the city names are displayed in columns where each column contains the average value of the total_score of the students belonging to that city? Something like this:

LondonLiverpoolLeedsManchester
547481500522

This is where the pivot operator comes handy.

Selecting the Base Data

The first step when using the pivot operator is to select the base data that the pivot operator will be based on. We want to group our data by city and find the average of the total_score of the students who belong to that city. Therefore, we need to write a simple SELECT statement that selects the city and total_score.

Creating a Temporary Dataset

Now, ideally we would be able to directly apply the pivot operator on the base data that we created in the previous section but unfortunately, we can’t. In order for the pivot operator to work, we have to create a table-valued expression that we can apply the pivot operator to. We have a variety of choices here; we could use a derived tabled, common table expressions (CTEs) or we could even create temporary tables.

For this example, we will use a quick, simple derived table. To do that with the basic select statement that we created in the last section, we wrap it in a set of parenthesis and then apply an alias to it. Finally, we select everything from that derived table.

Applying the Pivot Operator

Now that we have prepared our base data and have created a derived table, we will apply the pivot operator to it.

To do this, insert “PIVOT” at the end of the derived table, followed by a set of parenthesis, and give this pivot table an alias.

Inside the parenthesis, we have to specify some important information.

  1. We need to specify the field we want to apply an aggregate function to. In our case, we want to apply the AVG aggregate function in the “total_score” column.
  2. We then have to say which columns from the base data we are pivoting our data onto. We do so by writing “FOR” followed by the column name which is a city in our example.
  3. The final step is a little irritating. We have to list the values from the city column that we want to become headings in our pivot table. We use the IN operator followed by a set of parenthesis. Inside the parenthesis, we use a comma-separated list where we write the name of each column inside a square bracket. In our example, we want London, Leeds, and Manchester as the heading names of the pivot table and so we write them in this format: ([London], [Leeds], [Manchester]).
If you execute the above query, the results will look like this:

LondonLiverpoolLeedsManchester
547481500522

Adding Row Groups in Pivot Table

In the previous sections, we saw how to convert row groups into column groups using pivot operator. However, you can also add row groups along with column groups into a pivot table.

For example, if you want to find the average value of the total_score column of all the students grouped by the city as well as by gender you can use the column group and the row group in conjunction inside a pivot table. Here each column will represent a city name and each row will represent a student gender.

Luckily, you don’t have to write any additional script to add row groups to a pivot table. Inside the base dataset, simply add the column name which you want to add as a row group to the pivot table.

In the above script, we simply added “gender” column in the base SELECT statement.

The output of the above query looks like this:

GenderLondonLiverpoolLeedsManchester
Female500450500NULL
Male571543NULL522

This is cross-tabulation. For instance, it can be seen from the results that the average total_score of female students living in London is 500. Similarly, average total score of male students living in London is 571.

Ben Richardson

Ben Richardson

Ben Richardson runs Acuity Training. Acuity is an IT training business offering classroom courses in London and Guildford. It is a leading provider of SQL training the UK and offers a full range of SQL training from introductory training to advanced administration courses.
Ben Richardson

Latest posts by Ben Richardson (see all)

Ben Richardson

Ben Richardson runs Acuity Training. Acuity is an IT training business offering classroom courses in London and Guildford. It is a leading provider of SQL training the UK and offers a full range of SQL training from introductory training to advanced administration courses.