Availability Groups are fantastic for High Availability/Disaster Recovery solutions, and I’m sure that fellow DBAs will agree with me. However, there will be times when we must consider certain precautions and extra steps carefully to avoid unwanted surprises. For instance, any Secondary Replica becomes the current Primary Replica for whatever reason, and our goal is to not let it happen.
In this article, I’m going to showcase one scenario that requires the DBA to pay extra attention to detail. Just like the title says, it will guide you through the proper way to deal with the data encryption within databases that are a part of the AlwaysOn Availability Group setup.
I will be using Transparent Data Encryption (TDE) as the technology to build my case around, and thus it is important to mention that this feature is available within the following SQL Server Editions only:
- SQL Server 2019 Evaluation, Standard, Developer, Enterprise
- SQL Server 2017 Evaluation, Developer, Enterprise
- SQL Server 2016 Evaluation, Developer, Enterprise
- SQL Server 2014 Evaluation, Developer, Enterprise
- SQL Server 2012 Evaluation, Developer, Enterprise
- SQL Server 2008R2 Datacenter, Evaluation, Developer, Enterprise
- SQL Server 2008 Evaluation, Developer, Enterprise
This post will not cover the steps to build an Availability Group, and I am relying on the one already built for testing purposes.
The environment is Windows-based, but the principles will be very similar if you use different platforms (e.g. SQL Server on Linux or Azure SQL Managed Instances).
Availability Group layout for testing purposes
My test Availability Group consists of 2 replicas, each in its own VM. Here are the basic properties:
As you can see, there is nothing fancy, just a couple of replicas using synchronous commit mode and under manual failover mode.
The following screenshot demonstrates a database called “test.” It is added to the Availability Group and it is in the synchronized state.
Adding Transparent Data Encryption to the database
Here are the steps to enable Transparent Data Encryption for the “test” database. Note: we’ll execute the following steps in the current Primary Replica.
Create a master key in the master database.
USE master; GO CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<UseStrongPasswordHere>';
Create a certificate protected by the master key.
CREATE CERTIFICATE TestCertificate WITH SUBJECT = 'My test Certificate';
Create the Database Encryption Key (DEK) and protect it with the certificate created in Step 2.
USE test; GO CREATE DATABASE ENCRYPTION KEY WITH ALGORITHM = AES_256 ENCRYPTION BY SERVER CERTIFICATE TestCertificate;
Set the “test” database to use encryption.
ALTER DATABASE test SET ENCRYPTION ON;
After you’re done, you need to confirm that Transparent Data Encryption is enabled for the “test” database:
In the Database Properties section, go to the Options page. There, pay attention to the State area at the bottom of the window. The Encryption Enabled value must be True.
You can also run the following TSQL code to confirm it:
SELECT name,is_encrypted FROM sys.databases WHERE name = 'test'
Note: Don’t be surprised if you find out that tempdb is encrypted as well. It is because tempdb is where all kinds of operations take place (e.g. sorting, hash joins, etc.), using the data from all databases. Therefore, if at least one user database is encrypted, operations from that particular database might go into tempdb which will need to return that data to the user database. Therefore, sending unencrypted data back would represent the issue.
You can use the following TSQL Code to confirm that there’s a Database Master Key present for the “test” database that is encrypted by the certificate (as executed in Step 3):
SELECT DB_NAME(database_id) AS DB, create_date, key_algorithm, key_length, encryptor_thumbprint, encryptor_type FROM sys.dm_database_encryption_keys WHERE DB_NAME(database_id) = 'test'
So far so good, at least for the Primary Replica. But what happens if we query the sys.databases system view to confirm the encryption status of the “test” database in the Secondary Replica?
The Availability Group replicates everything related to the database from one replica to another. However, the Secondary Replica clearly states that it’s not encrypted.
Let’s check our Secondary Replica for any clues about that:
The state of the database is “Not Synchronizing / Suspect” – not looking good at all. However, after inspecting the Error Log of the Secondary Replica, I can see the following:
2021-04-10 00:40:55.940 spid39s Error: 33111, Severity: 16, State: 3. 2021-04-10 00:40:55.940 spid39s Cannot find server certificate with thumbprint '0xDF36E3D052086AA05BBB1C64A72A2CAB5A98F240'. 2021-04-10 00:40:55.950 spid39s Error: 33111, Severity: 16, State: 3. 2021-04-10 00:40:55.950 spid39s Cannot find server certificate with thumbprint '0xDF36E3D052086AA05BBB1C64A72A2CAB5A98F240'. 2021-04-10 00:40:55.950 spid39s Always On Availability Groups data movement for database 'test' has been suspended for the following reason: "system" (Source ID 2; Source string: 'SUSPEND_FROM_REDO'). To resume data movement on the database, you will need to resume the database manually. For information about how to resume an availability database, see SQL Server Books Online. 2021-04-10 00:40:55.950 spid39s Error: 3313, Severity: 21, State: 2. 2021-04-10 00:40:55.950 spid39s During redoing of a logged operation in database 'test', an error occurred at log record ID (34:743:1). Typically, the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup, or repair the database.
The main problem is that the certificate used to encrypt the Database Encryption Key of the “test” database (Step 3) is not present in the Secondary Replica.
Because Availability Groups do not replicate data from system databases. The server certificate missing resides in the Primary Replica master database.
Let’s generate a backup of the server certificate in the Primary Replica master database. Then let’s restore it in the master database of the Secondary Replica.
Use the following TSQL code to generate the backup from the current Primary Replica that has the “test” database with TDE enabled:
USE master; GO BACKUP CERTIFICATE TestCertificate TO FILE = 'C:\temp\TestCertificate.cer' WITH PRIVATE KEY (file='C:\temp\TestCertificate.pvk', ENCRYPTION BY PASSWORD='<YourStrongPasswordHere>');
Before attempting to restore the certificate in the Secondary Replica, first check if the Database Master Key exists within the master database. If not, create it exactly as in Step 1.
Use the following TSQL code to restore the certificate in the Secondary Replica. Note: Make sure to copy the .cer and .pvk files to the target directory first.
USE master; GO CREATE CERTIFICATE TestCertificate FROM FILE = 'C:\temp\TestCertificate.cer' WITH PRIVATE KEY ( FILE = N'C:\temp\TestCertificate.pvk', DECRYPTION BY PASSWORD = '<YourStrongPasswordHere>' );
Thus, even after restoring the certificate in the Secondary Replica, the state of the “test” database remains the same:
Since the data movement for the “test” database is paused, let’s manually resume it to see if we’re lucky this time:
Yes! The operation was successful. The “test” database is not only fully synchronized and healthy but also encrypted with TDE.
Besides, after performing the manual failover to swap the roles of the replicas, everything continues to work perfectly fine.
The showcased solution worked perfectly fine. However, it might not be ideal in all cases, especially if you are a DBA who likes to plan and do things in the “correct” way. I see “correct” as follows:
- Remove the database from the Availability Group
- Execute all the steps to enable Transparent Data Encryption in the replica where the database is read/written.
- Backup the server certificate from the Primary Replica.
- Copy the backup file to the location(s) of the Secondary Replica(s).
- Restore the certificate in the master database.
- Add the database back to the Availability Group.
- Make sure that the operational state of the database is the correct and intended one (depending on your particular setup).
I’m throwing double-quotes for “correct” because in the way I presented the solution I got the database in the Secondary Replica marked as Suspect. This alone would probably raise many unwanted flags/alerts/finger-pointing. It is unnecessary noise that you can easily avoid by taking into account all the considerations to plan and properly implement TDE in an Always On Availability Group setup.
To wrap this post, I’m leaving you with the official Encryption Hierarchy used for TDE, that Microsoft has posted on their website. What I’d like you to keep in mind is where each element is created (either in the master or user database), so that you can overcome any potential issues/surprises with Availability Groups.
In case you are not aware, SQL Complete can greatly assist you to configure Always Encrypted, which is another useful technology to protect sensitive data.
Keep in mind that you would need to consider the same that is discussed within this article if you’re planning to deal with Always Encrypted in an Always On Availability Group scenario. However, features that SQL Complete v6.7 introduces are designed to make sure that you succeed right off the bat.