The article introduces the basics of dynamic data masking (DDM) in SQL Server along with its overview supported by a simple example of data masking implementation. Additionally, the readers are going to get familiar with the benefits of dynamic data masking. This paper also highlights the importance of data masking in day to day database development tasks when some fields must be masked due to their sensitive nature in order to comply with standard practices.
About Data Masking
Let us go through the basic concepts of data masking and the availability of this feature in SQL Server.
Simple Definition
Data masking is a method of hiding data completely or partially thereby making it hard to recognize or understand the data after the masking has been applied.
Microsoft Definition
According to Microsoft documentation, Dynamic data masking (DDM) limits sensitive data exposure by masking it to non-privileged users.
What is Sensitive data
By sensitive data, we mean any data which contains private, personally identifiable, financial or security information which if revealed can be misused or damage the reputation of an organization.
Sensitive Data Example
A good example of sensitive data is a debit card number stored in a database that must be protected from any unauthorized use. Another good example of sensitive data is the personal email address which can easily identify a person.
Non-privileged users
Any database user who is not allowed to see the sensitive data is considered to be a non-privileged user.
Dynamic Data Masking (DDM)
The data masking feature supported by SQL Server is known as dynamic data masking also referenced as DDM in Microsoft documentation. In other words, Microsoft refers to data masking as dynamic data masking in SQL Server.
Compatibility
According to Microsoft documentation, the dynamic data masking feature is supported by the following versions of SQL Server:
1. SQL Server 2016 and later versions
2. Azure SQL Database
3. Azure SQL Data Warehouse
So if you have not started using Azure SQL database(s) yet, then you need minimum SQL Server 2016 to use dynamic data masking feature.
Configuring Dynamic Data Masking
Dynamic data masking can be configured by simply using T-SQL commands.
The dynamic data masking implementation is done through T-SQL scripting to prevent unauthorized users from viewing sensitive data.
Benefits of Dynamic Data Masking
Finally, let us go through some key benefits of dynamic data masking but before that, I would like to ask the SQL beginners not the SQL professionals, what are the benefits of integrity constraints?
Let us consider an example of a unique key constraint that ensures that the column on which it is applied has distinct (no duplicate) values. If I can enforce distinct column values at the moment of data entry in my front-end application why should I bother to enforce it through a database by applying a unique key constraint?
The answer is to ensure the rules (integrity constraints) remain consistent and are controlled centrally. I have to do that at a database level otherwise I may have to write the code to enforce unique values in all the present and any upcoming applications that access the database.
The same is true for dynamic data masking since it defines the mask on a column at a database level so there is no need to perform any additional masking (code) by the applications accessing the database.
Dynamic data masking has the following benefits over traditional approaches:
1. Dynamic data masking implements the centralised policy of hiding or changing the sensitive data in a database that is inherited by any application wishes to access the data.
2. Dynamic data masking in SQL Server can help manage users with privileges to see the sensitive data and those users who are not authorized to see it.
3. It has a simple implementation in the form of T-SQL script.
Implementing Dynamic Data Masking
Before we implement dynamic data masking we need to understand the types of dynamic data masking that can be applied to a column in a table of a SQL or Azure SQL Database.
Types of Data Masks
There are four types of data masks that we can apply to a column:
1. Default Data Mask(s)
2. Partial Data Mask(s)
3. Random Data Mask(s)
4. Custom Data Mask(s)
In this article, we are going to focus on the default data masking type.
Applying Dynamic Data Masks
Dynamic data masks can be applied to a column of a table in the following ways:
1. When creating a new table
2. Altering an already created table to apply data masking to its column(s)
Setup Sample Database
Let us create a sample database named SQLDevBlogV5 by running the following T-SQL script:
-- Create sample database (SQLDevBlogV5) CREATE DATABASE SQLDevBlogV5; GO USE SQLDevBlogV5; -- (1) Create Article table in the sample database CREATE TABLE Article ( ArticleId INT PRIMARY KEY IDENTITY (1, 1) ,Category VARCHAR(50) ,Author VARCHAR(50) ,Title VARCHAR(150) ,Published DATETIME2 ,Notes VARCHAR(400) ) GO -- (2) Populating Article table SET IDENTITY_INSERT [dbo].[Article] ON INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (1, N'Development', N'Atif', N'Introduction to T-SQL Programming ', N'2017-01-01 00:00:00', NULL) INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (2, N'Testing', N'Peter', N'Database Unit Testing Fundamentals', N'2017-01-10 00:00:00', NULL) INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (3, N'DLM', N'Sadaf', N'Database Lifecycle Management for beginners', N'2017-01-20 00:00:00', NULL) INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (4, N'Development', N'Peter', N'Common Table Expressions (CTE)', N'2017-02-10 00:00:00', NULL) INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (5, N'Testing', N'Sadaf', N'Manual Testing vs. Automated Testing', N'2017-03-20 00:00:00', NULL) INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (6, N'Testing', N'Atif', N'Beyond Database Unit Testing', N'2017-11-10 00:00:00', NULL) INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (7, N'Testing', N'Sadaf', N'Cross Database Unit Testing', N'2017-12-20 00:00:00', NULL) INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (8, N'Development', N'Peter', N'SQLCMD - A Handy Utitliy for Developers', N'2018-01-10 00:00:00', NULL) INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (9, N'Testing', N'Sadaf', N'Scripting and Testing Database for beginners ', N'2018-02-15 00:00:00', NULL) INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (10, N'Development', N'Atif', N'Advanced Database Development Methods', N'2018-07-10 00:00:00', NULL) INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (11, N'Testing', N'Sadaf', N'How to Write Unit Tests for your Database', N'2018-11-10 00:00:00', NULL) INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (12, N'Development', N'Peter', N'Database Development using Modern Tools', N'2018-12-10 00:00:00', NULL) INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (13, N'DLM', N'Atif', N'Designing, Developing and Deploying Database ', N'2019-01-01 00:00:00', NULL) INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (14, N'DLM', N'Peter', N'How to Apply Database Lifecycle Management ', N'2019-02-10 00:00:00', NULL) SET IDENTITY_INSERT [dbo].[Article] OFF
Use the following query to view the articles:
-- View articles SELECT [a].[ArticleId] ,[a].[Category] ,[a].[Author] ,[a].[Title] ,[a].[Published] ,[a].[Notes] FROM dbo.Article A
Business Requirement to Mask Authors Names
Now suppose you receive a business requirement which states that author names should be masked due to the sensitivity of this information. The best way to meet this business requirement is to mask the Name column using DDM.
Masking Authors Names
We are going to alter the table in order to add data masking function as follows:
-- Masking Author column ALTER TABLE Article ALTER COLUMN [Author] varchar(50) MASKED WITH (FUNCTION = 'default()');
Checking Masking Status
You can check dynamic data masking status anytime by using the following T-SQL script given in Microsoft documentation:
-- Checking dynamic data masking status SELECT c.name, tbl.name as table_name, c.is_masked, c.masking_function FROM sys.masked_columns AS c JOIN sys.tables AS tbl ON c.[object_id] = tbl.[object_id] WHERE is_masked = 1;
The output shows us which columns have been masked successfully:
Data Check
Now check the data by querying the top 5 records of the table where we have applied the masking:
-- View top 5 artices records SELECT TOP 5 [a].[ArticleId] ,[a].[Category] ,[a].[Author] ,[a].[Title] ,[a].[Published] ,[a].[Notes] FROM dbo.Article A ORDER BY a.ArticleId
The output does not seem to be showing us the expected result:
As you can see despite the fact that we have masked the Author column it is still showing up its actual values. The reason behind this behavior is that the account we have used to apply dynamic data masking has got elevated privileges and that’s why the masked data is visible in its original form when we query the table using the current account.
The solution is to create a new user with Select permission.
Creating a user with Select permission on the table
Let us now create a new database user without a login having only Select permission on the Article table as follows:
-- Create ArticleUser to have Select access to Article table CREATE USER ArticleUser WITHOUT LOGIN; GRANT SELECT ON Article TO ArticleUser;
Viewing Top 5 Articles as ArticleUser
Next, execute Select statement to get top 5 articles using the newly created user ArticleUser with select permission only:
-- Execute SELECT Article as ArtilceUser EXECUTE AS USER = 'ArticleUser'; -- View artices SELECT TOP 5 [a].[ArticleId] ,[a].[Category] ,[a].[Author] ,[a].[Title] ,[a].[Published] FROM dbo.Article A ORDER BY a.ArticleId -- Revert the User back to what user it was before REVERT;
Congratulations! You have successfully masked the Author column as per requirement.
Dropping Masked Column
You can drop the dynamic data masking on the column you’ve applied it before by simply issuing the following T-SQL command:
-- Removing dynamic data masking on Author column ALTER TABLE Article ALTER COLUMN Author DROP MASKED;
Please stay in touch since a more advanced use of dynamic data masking is on its way in the next article.
Things to do
Now that you can mask columns of a table in a database please try the following things to improve your skills further:
1. Please try to mask the Category column in the sample database.
2. Please try to create Author table with AuthorId, Name and Email columns and then pass AuthorId as a foreign key in the Article table and then apply dynamic data masking on the Name and Email columns of the Author table by creating a test user
3. Please try to create and drop dynamic data masking to ensure that you can successfully add and remove dynamic data masking on a SQL table