Switching Out Table Partitions in SQL Server: A Walkthrough

Introduction

A few years ago we were tasked with a business requirement for card data in a specific format for the purpose of something called “reconciliation”. The idea was to present the data in a table to an application which would consume and process the data which would have a retention period of six months. We had to create a new database for this business need and then create the core table as a partitioned table. The process described here is the process we use to ensure that data older than six months is moved out of the table in a clean fashion.

A Bit About Partitioning

Table Partitioning is a database technology that allows you to store data belonging to one logical unit (the table) as a set of partitions which will sit on separate physical structure – data files – through an abstraction layer called File Groups in SQL Server. The process of creating this Partitioned Table involves two key objects:

A Partition Function: A Partition Function defines how the rows of a partitioned table are mapped based on the values of a specified column (the Partition Column). A partitioned table could be based on either a list or a range. For the purpose of our use case (preserving only six months’ worth of data), we used a Range Partition. A Partition Function can be defined as either RANGE RIGHT or RANGE LEFT. We used RANGE RIGHT as shown in the code in Listing 1 meaning the boundary value will belong to the right side of the boundary value interval when the values are sorted in ascending order from left to right.

A Partition Scheme: A partition Scheme is based on the Partition Function and determines on which physical structures rows belonging to each partition will be placed. This is achieved by mapping such rows to filegroups. Listing 2 shows the code for creating a Partition Scheme. Prior to creating the Partition Scheme, the filegroups to which it will refer must exist.

Notice that for N partitions, there will always be N-1 boundaries. Care must be taken when defining the first Filegroup in the Partition Scheme. The first boundary listed in the Partition Function will lie between the first and second Filegroups thus this boundary value (20190201) will sit in the second partition (FEB). In addition, it is actually possible to place all partition in a single filegroup but we have chosen separate filegroups in this case.

Getting Our Hands Dirty

So let’s dive into the task of switching out partitions!

The first thing we need to do is to determine exactly how our data is distributed among the partitions so we can know which partition we would like to switch out. Typically we will switch out the oldest partition.

Fig. 1 Output of Listing 3

Fig. 1 shows us the output of the query in Listing 3. The oldest partition is Partition 2 which contains rows from the year 2017. We verify this with the query in Listing 4. Listing 4 also shows us which Filegroup holds the data in Partition 2.

Fig. 1 Output of Listing 3
Fig. 2 Output of Listing 4

Listing 4 shows us that the filegroup associated with Partition 2 is NOV. In order to switch out Partition 2, we need a history table which is a replica of the live table but sits on the same filegroup as the partition we intend to switch out. Since we already have this table, all we need so is recreated it on the desired Filegroup. You also need to recreate the clustered index. Take note that this clustered index has the same definition as the clustered index on the table post_tran_tab and also sits on the same filegroup as post_tran_tab_hist table.

Switching out the last partition is now a one-line command. Taking a count of both tables before and after executing this one-line command will give an assurance that we have all the data desired.

Fig. 3 Table post_tran_tab_hist sits on the NOV Filegroup
Since we have switched out the last partition, we no longer need the boundary. We merge the two ranges previously split by that boundary using the command in Listing 7. We further truncate the history table as shown in Listing 8. We are doing this because this is the whole point: removing old data that we no longer need.

Fig. 4 Boundary Merged

Fig. 5 Row Count for Both Tables before Truncate

Take note that the number of rows in the history table is exactly the same as the number of rows previously in Partition 2 as shown in Fig. 1. You can also go the extra mile by recovering the empty space in the filegroup belonging to the last partition. This will be useful if since you need this space for the new data that will sit on the earlier partition. This step may not be necessary if you feel you have ample space in your environment.

Fig. 7 File to Filegroup Mappings

Fig. 8 Free Space on Operating System

Conclusion

In this article, we have done a walkthrough of the process to switch out partitions from a partitioned table. This is a very efficient way to manage data growth natively in SQL Server. More advanced technologies such as Stretch Database are available in current versions of SQL Server.

References

Isakov, V. (2018). Exam Ref 70-764 Administering a SQL Database Infrastructure. Pearson Education

Partitioned Tables and Indexes in SQL Server

Kenneth Igiri

Kenneth Igiri is a Database Administrator with eProcess International S.A., Ecobank Group's Shared Services Centre. Kenneth has over eight years' experience with SQL Server and Oracle databases as well as related technologies. His interests include database performance, HADR, and recently, Cloud.
Also, Kenneth teaches at Children's Church and writes fiction. You can connect with Kenneth via his blog or social network account.
Kenneth Igiri