Properly Dealing with Encryption of Databases in an AlwaysOn Availability Group Scenario

Total: 1 Average: 5

Availability Groups are fantastic for High Availability/Disaster Recovery solutions, and I’m sure that fellow DBAs will agree with me. However, there will be times when we must consider certain precautions and extra steps carefully to avoid unwanted surprises. For instance, any Secondary Replica becomes the current Primary Replica for whatever reason, and our goal is to not let it happen.

In this article, I’m going to showcase one scenario that requires the DBA to pay extra attention to detail. Just like the title says, it will guide you through the proper way to deal with the data encryption within databases that are a part of the AlwaysOn Availability Group setup.

CodingSight-Dealing with encryption of databases in an AlwaysOn availability group scenario

Initial considerations

I will be using Transparent Data Encryption (TDE) as the technology to build my case around, and thus it is important to mention that this feature is available within the following SQL Server Editions only:

  • SQL Server 2019 Evaluation, Standard, Developer, Enterprise 
  • SQL Server 2017 Evaluation, Developer, Enterprise 
  • SQL Server 2016 Evaluation, Developer, Enterprise 
  • SQL Server 2014 Evaluation, Developer, Enterprise 
  • SQL Server 2012 Evaluation, Developer, Enterprise
  • SQL Server 2008R2 Datacenter, Evaluation, Developer, Enterprise
  • SQL Server 2008 Evaluation, Developer, Enterprise

This post will not cover the steps to build an Availability Group, and I am relying on the one already built for testing purposes.

The environment is Windows-based, but the principles will be very similar if you use different platforms (e.g. SQL Server on Linux or Azure SQL Managed Instances).

Availability Group layout for testing purposes

My test Availability Group consists of 2 replicas, each in its own VM. Here are the basic properties:

Availability Group Properties

As you can see, there is nothing fancy, just a couple of replicas using synchronous commit mode and under manual failover mode.

Preview in new tab(opens in a new tab)

The following screenshot demonstrates a database called “test.” It is added to the Availability Group and it is in the synchronized state.

database called “test.”

Adding Transparent Data Encryption to the database

Here are the steps to enable Transparent Data Encryption for the “test” database. Note: we’ll execute the following steps in the current Primary Replica.

Step 1

Create a master key in the master database.

USE master;
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<UseStrongPasswordHere>';

Step 2

Create a certificate protected by the master key.

CREATE CERTIFICATE TestCertificate WITH SUBJECT = 'My test Certificate';

Step 3

Create the Database Encryption Key (DEK) and protect it with the certificate created in Step 2.

USE test;
GO
CREATE DATABASE ENCRYPTION KEY
WITH ALGORITHM = AES_256
ENCRYPTION BY SERVER CERTIFICATE TestCertificate;

Step 4

Set the “test” database to use encryption.

ALTER DATABASE test
SET ENCRYPTION ON;

After you’re done, you need to confirm that Transparent Data Encryption is enabled for the “test” database:

In the Database Properties section, go to the Options page. There, pay attention to the State area at the bottom of the window. The Encryption Enabled value must be True.

Database Properties - test

You can also run the following TSQL code to confirm it:

SELECT name,is_encrypted
FROM sys.databases
WHERE name = 'test'

Note: Don’t be surprised if you find out that tempdb is encrypted as well. It is because tempdb is where all kinds of operations take place (e.g. sorting, hash joins, etc.), using the data from all databases. Therefore, if at least one user database is encrypted, operations from that particular database might go into tempdb which will need to return that data to the user database. Therefore, sending unencrypted data back would represent the issue.

You can use the following TSQL Code to confirm that there’s a Database Master Key present for the “test” database that is encrypted by the certificate (as executed in Step 3):

SELECT 
	DB_NAME(database_id) AS DB,
	create_date,
	key_algorithm,
	key_length,
	encryptor_thumbprint,
	encryptor_type
FROM sys.dm_database_encryption_keys
WHERE DB_NAME(database_id) = 'test'

So far so good, at least for the Primary Replica. But what happens if we query the sys.databases system view to confirm the encryption status of the “test” database in the Secondary Replica?

confirm the encryption status of the “test”

The Availability Group replicates everything related to the database from one replica to another. However, the Secondary Replica clearly states that it’s not encrypted.

Let’s check our Secondary Replica for any clues about that:

Secondary Replica of a database

The state of the database is “Not Synchronizing / Suspect” – not looking good at all. However, after inspecting the Error Log of the Secondary Replica, I can see the following:

2021-04-10 00:40:55.940	spid39s	Error: 33111, Severity: 16, State: 3.
2021-04-10 00:40:55.940	spid39s	Cannot find server certificate with thumbprint '0xDF36E3D052086AA05BBB1C64A72A2CAB5A98F240'.
2021-04-10 00:40:55.950	spid39s	Error: 33111, Severity: 16, State: 3.
2021-04-10 00:40:55.950	spid39s	Cannot find server certificate with thumbprint '0xDF36E3D052086AA05BBB1C64A72A2CAB5A98F240'.
2021-04-10 00:40:55.950	spid39s	Always On Availability Groups data movement for database 'test' has been suspended for the following reason: "system" (Source ID 2; Source string: 'SUSPEND_FROM_REDO'). To resume data movement on the database, you will need to resume the database manually. For information about how to resume an availability database, see SQL Server Books Online.
2021-04-10 00:40:55.950	spid39s	Error: 3313, Severity: 21, State: 2.
2021-04-10 00:40:55.950	spid39s	During redoing of a logged operation in database 'test', an error occurred at log record ID (34:743:1). Typically, the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup, or repair the database.  

The main problem is that the certificate used to encrypt the Database Encryption Key of the “test” database (Step 3) is not present in the Secondary Replica.

Why?

Because Availability Groups do not replicate data from system databases. The server certificate missing resides in the Primary Replica master database.

Solution

Let’s generate a backup of the server certificate in the Primary Replica master database. Then let’s restore it in the master database of the Secondary Replica.

Use the following TSQL code to generate the backup from the current Primary Replica that has the “test” database with TDE enabled:

USE master;
GO
BACKUP CERTIFICATE TestCertificate
TO FILE = 'C:\temp\TestCertificate.cer'                                                          
WITH PRIVATE KEY (file='C:\temp\TestCertificate.pvk',
ENCRYPTION BY PASSWORD='<YourStrongPasswordHere>');

Before attempting to restore the certificate in the Secondary Replica, first check if the Database Master Key exists within the master database. If not, create it exactly as in Step 1.

Use the following TSQL code to restore the certificate in the Secondary Replica. Note: Make sure to copy the .cer and .pvk files to the target directory first.

USE master;
GO
CREATE CERTIFICATE TestCertificate
  FROM FILE = 'C:\temp\TestCertificate.cer'
  WITH PRIVATE KEY ( 
    FILE = N'C:\temp\TestCertificate.pvk',
 DECRYPTION BY PASSWORD = '<YourStrongPasswordHere>'
  );

Thus, even after restoring the certificate in the Secondary Replica, the state of the “test” database remains the same:

State of the “test” database

Since the data movement for the “test” database is paused, let’s manually resume it to see if we’re lucky this time:

Data movement for the "test" database is paused

Yes! The operation was successful. The “test” database is not only fully synchronized and healthy but also encrypted with TDE.

“test” database is encrypted with TDE

Besides, after performing the manual failover to swap the roles of the replicas, everything continues to work perfectly fine.

Backup of the server certificate in the Primary Replica master database

Conclusion

The showcased solution worked perfectly fine. However, it might not be ideal in all cases, especially if you are a DBA who likes to plan and do things in the “correct” way. I see “correct” as follows:

  • Remove the database from the Availability Group
  • Execute all the steps to enable Transparent Data Encryption in the replica where the database is read/written.
  • Backup the server certificate from the Primary Replica.
  • Copy the backup file to the location(s) of the Secondary Replica(s).
  • Restore the certificate in the master database.
  • Add the database back to the Availability Group.
  • Make sure that the operational state of the database is the correct and intended one (depending on your particular setup).

I’m throwing double-quotes for “correct” because in the way I presented the solution I got the database in the Secondary Replica marked as Suspect. This alone would probably raise many unwanted flags/alerts/finger-pointing. It is unnecessary noise that you can easily avoid by taking into account all the considerations to plan and properly implement TDE in an Always On Availability Group setup. 

To wrap this post, I’m leaving you with the official Encryption Hierarchy used for TDE, that Microsoft has posted on their website. What I’d like you to keep in mind is where each element is created (either in the master or user database), so that you can overcome any potential issues/surprises with Availability Groups.

Transparent Database Encryption Architecture

In case you are not aware, SQL Complete can greatly assist you to configure Always Encrypted, which is another useful technology to protect sensitive data.

Keep in mind that you would need to consider the same that is discussed within this article if you’re planning to deal with Always Encrypted in an Always On Availability Group scenario. However, features that SQL Complete v6.7 introduces are designed to make sure that you succeed right off the bat.

Alejandro Cobar

Alejandro Cobar is a multi-platform DBA (with a stronger focus in SQL Server), with a passion to automate stuff as much as possible to make things work for the greater good. He began his professional journey as a developer (a bit more than 10 years ago) and has been going back and forth with DBA roles as well, being the latter the one he has mostly developed throughout the years. He also has several Microsoft SQL Server Certifications that have helped him go even further on his skills as a SQL Server Database Administrator.