How to Implement Table Partitioning in SQL Server

Total: 1 Average: 5

Introduction

Table partitioning is a technique used in SQL Server to physically organize the data stored in tables in different storage structures. In essence, it is splitting one large logical structure into smaller parts physically. The result is, we can improve performance for certain kinds of queries. More importantly, we can move data about using certain techniques.

CodingSight - How to Implement Partitioned Tables

Filegroups

We also need to clarify the filegroups concept because filegroups are that layer of abstraction used to separate data physically.

A filegroup is a logical construct that allows the SQL Server to see a collection of physical data files as a single logical unit. When an SQL Server writes the data into the filegroup, the data is spread across the files belonging to that filegroup. It is a File Group – a group of data files.

Tables Sit on File Groups

Whenever we create a table, it is created on a filegroup – the PRIMARY filegroup by default (if you don’t specify the filegroup at once). However, you can choose to place a table on a different filegroup if you wish.

It is different, though, when you create a partitioned table. Then it must sit on a Partition Scheme.

Partition Schemes and Functions

A Partition Scheme maps a table to a set of filegroups. A Partition Function defines the criteria by which data is distributed across the filegroups belonging to the Partition Scheme. Thus, when we create a partitioned table, we must first create a Partition Function, and then a Partition Scheme.

Why Partitioned Tables?

Partitioned tables are helpful for several reasons. First, it is performance benefits when filegroups are sitting on separate disks. Then, we work with the appropriate degree of parallelism and queries that span one partition. Another reason is the maintenance benefit, specifically data archiving achieved by switching out partitions. In the following article you can get the details on the entire process: Switching Out Table Partitions in SQL Server: A Walkthrough. Other benefits include online index rebuilds, parallel operations, piecemeal restores of filegroups, etc.

Hands-on Practice

We start by creating a regular table as shown in the following code:

-- Listing 1: CREATE TABLE Statement
use DB2
GO
CREATE TABLE memmanofarms (
fname VARCHAR(50)
,lname VARCHAR(50)
,city VARCHAR(50)
,PhoneNo bigint
,email VARCHAR(100) check (email like '%@%')
,gender char(1)
)

Since we did not specify any filegroup or partition function, the table sits in the PRIMARY filegroup:

Table memmanofarms in the PRIMARY File Group
Figure 1: Table memmanofarms in the PRIMARY File Group

Using the code in Listing 2, we populate the table with six unique records. These records are replicated a different number of times per row.

-- Listing 2: Populate Table
USE DB2
GO
INSERT INTO memmanofarms VALUES ('Kenneth','Igiri','Accra','23320055444','kenneth@kennethigiri.com','M');
GO 1100
INSERT INTO memmanofarms VALUES ('Vivian','Akeredolu','Lagos','2348020055444','vivian@gmail.com','F');
GO 720
INSERT INTO memmanofarms VALUES ('Emelia','Okoro','Port Harcourt','2348030057324','emelia@yahoo.com','F');
GO 400
INSERT INTO memmanofarms VALUES ('Uche','Igiri','Enugu','2348030057324','uche@yahoo.com','M');
GO 1000
INSERT INTO memmanofarms VALUES ('Kweku','Annan','Kumasi','23354055884','kweku@ymail.com','M');
GO 150
INSERT INTO memmanofarms VALUES ('Aisha','Bello','Kano','2347088057324','aisha@gmail.com','F');
GO 890

We can confirm the row count for each city using queries in Listing 3. It also helps us to get a baseline of the way the query is executed when our table is not partitioned.

-- Listing 3: Count Rows 

USE DB2
GO
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT COUNT(*) FROM memmanofarms;
SELECT COUNT(*) FROM memmanofarms WHERE city='Accra';
SELECT COUNT(*) FROM memmanofarms WHERE city='Lagos';
SELECT COUNT(*) FROM memmanofarms WHERE city='Port Harcourt';
SELECT COUNT(*) FROM memmanofarms WHERE city='Enugu';
SELECT COUNT(*) FROM memmanofarms WHERE city='Kumasi';
SELECT COUNT(*) FROM memmanofarms WHERE city='Kano';

Taking this further, we use the code in Listing 4 to set up objects required for Table Partitioning on DB2 databases. Notice that for N partitions (and N filegroups), there will always be N-1 boundaries.

-- Listing 4: Set Up Partitioning
-- Create a Partition Function

USE [DB2]
GO
CREATE PARTITION FUNCTION
PartFunc (VARCHAR(50))
AS RANGE RIGHT
FOR VALUES 
('Accra'
,'Enugu'
,'Kano'
,'Kumasi'
,'Lagos'
,'Port Harcourt'
)
GO

-- Create File Groups

USE [master]
GO
ALTER DATABASE [DB2] ADD FILEGROUP [AC]
ALTER DATABASE [DB2] ADD FILEGROUP [EN]
ALTER DATABASE [DB2] ADD FILEGROUP [KA]
ALTER DATABASE [DB2] ADD FILEGROUP [KU]
ALTER DATABASE [DB2] ADD FILEGROUP [LA]
ALTER DATABASE [DB2] ADD FILEGROUP [PH]
ALTER DATABASE [DB2] ADD FILEGROUP [OT]
GO

-- Add Files to the File Groups
USE [master]
GO
ALTER DATABASE [DB2] ADD FILE ( NAME = N'AC01', FILENAME = N'C:\MSSQL\Data\AC01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [AC];
ALTER DATABASE [DB2] ADD FILE ( NAME = N'EN01', FILENAME = N'C:\MSSQL\Data\EN01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [EN];
ALTER DATABASE [DB2] ADD FILE ( NAME = N'KA01', FILENAME = N'C:\MSSQL\Data\KA01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [KA];
ALTER DATABASE [DB2] ADD FILE ( NAME = N'KU01', FILENAME = N'C:\MSSQL\Data\KU01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [KU];
ALTER DATABASE [DB2] ADD FILE ( NAME = N'LA01', FILENAME = N'C:\MSSQL\Data\LA01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [LA];
ALTER DATABASE [DB2] ADD FILE ( NAME = N'PH01', FILENAME = N'C:\MSSQL\Data\PH01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [PH];
ALTER DATABASE [DB2] ADD FILE ( NAME = N'OT01', FILENAME = N'C:\MSSQL\Data\OT01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [OT];
GO

-- Create a Partition Scheme
USE [DB2]
GO
CREATE PARTITION SCHEME PartSch 
AS PARTITION PartFunc TO
(
AC,
EN,
KA,
KU,
LA,
PH,
OT
)
GO

Once we have the foundation, we can move our regular table from the PRIMARY filegroup to the Partition Function we created. We do this by rebuilding the clustered index as shown in Listing 5.

Notice that the column we used to create the partition column must be a part of the clustered index column listing. Also, when we specify the Partition Scheme, we must indicate this column – city.

-- Listing 5: Move table to New Partition
CREATE CLUSTERED INDEX [ClusteredIndexCity] ON [dbo].[memmanofarms]
(
	[city] ASC,
	[PhoneNo] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = ON, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) 
ON [PartSch](city);
GO
‘

After running this code, the table sits on the Partition Scheme:

Table memmanofarms on the PartSch Partition Scheme
Figure 2: Table memmanofarms on the PartSch Partition Scheme

A Little on Performance

When we run both questions with the partitioned and non-partitioned tables, we see little difference in performance without the index. In both cases, SQL Server uses an index seek when the clustered index is in place and a full table scan otherwise. However, we have more reads when running these queries on the partitioned table. This is expected since we have in effect forced distribution of the data across “distant” pages.

We are using the following code in Listing 6 as a benchmark of sorts.

-- Listing 6: Querying Partitioned and Non-Partitioned Tables
SELECT COUNT(*) FROM memmanofarms WHERE city='Accra';
SELECT * FROM memmanofarms WHERE city='Accra';
Execution Plan – COUNT(*) without Partitioning
Figure 3: Execution Plan – COUNT(*) without Partitioning
Execution Plan - Result Set without Partitioning
Figure 4: Execution Plan – Result Set without Partitioning
Execution Plan - Result Set without Partitioning, Without Index
Figure 5: Execution Plan – Result Set without Partitioning, Without Index
Regular Table with Index
Figure 6: Regular Table with Index
Regular Table without Index
Figure 7: Regular Table without Index
Execution Plan – COUNT(*) with Partitioning
Figure 8: Execution Plan – COUNT(*) with Partitioning
Execution Plan - ResultSet with Partitioning
Figure 9: Execution Plan – ResultSet with Partitioning
STATISTICS - Partitioned Table with Index
Figure 10: STATISTICS – Partitioned Table with Index
STATISTICS - Partitioned Table without Index
Figure 11: STATISTICS – Partitioned Table without Index

If you are interested in obtaining more information from the Query Plan Diagram, you may refer to its more advanced version available in dbForge Studio for SQL Server – it also includes the Query Profiler functionality for your needs.

Conclusion

We have explored the process of creating a partitioned table. The references section lists more resources related to the use of partitioning for archiving old data. We have also shown that partitioning does not necessarily introduce a significant improvement in performance without other enhancements such as the right CPU and proper MAXDOP configuration.

References

  1. Partitioned Tables and Indexes
  2. CHECK Constraints on SQL Server

Kenneth Igiri

Kenneth Igiri

Kenneth Igiri is an Enterprise Architect with eProcess International S.A., Ecobank Group's Shared Services Centre. Kenneth has over fifteen years' experience in Information Technology with nine of those years focused on SQL Server and Oracle databases. His interests include Database Performance, HADR, Cloud Computing, Data Architecture and Enterprise Architecture. Asides from work, Kenneth teaches at Children's Church, writes faith-based fiction and helps small businesses grow. You can connect with Kenneth via his blog https://kennethigiri.com, LinkedIn, or on Amazon.com.