Fixing Data Loss Using Log Shipping with Delayed Recovery

Total: 8 Average: 3.8


Transaction Log Shipping is a very well-known technology used in SQL Server to maintain a copy of the live database in the Disaster Recovery Site. The technology depends on three key jobs: the Backup Job, the Copy Job, and the Restore Job. While the Backup job runs on the Primary Server, the Copy and Restore jobs run on the Secondary Server. Essentially the process involves periodic transaction log backups to a share from which the Copy Job moves same to the Secondary Server; subsequently, the Restore Job applies the log backups to the secondary server. Before all this starts, the Secondary Database must be initialized with a full backup from the Primary server restored with NORECOVERY option.

Microsoft provides a set of stored procedure that can be used to configure Log Shipping end to end as well as GUI equivalents starting from the properties item of each database you may want to configure Log Shipping for. It is worth noting that the Secondary Database can be configured in NORECOVERY mode or in STANDBY mode. In NORECOVERY mode the database is never ever available for queries but in STANDBY mode, the Secondary database can be queried when no Transaction log restore operation is ongoing.

Setting Up the Environment

To get the ball rolling, we create two SQL Server instances on AWS with an identical Amazon EC2 image. This Amazon EC2 instance is running SQL Server 2017 RTM-CU5 on Windows Server 2016. Then we restore a copy of the WideWorldImporters database using a backup set acquired from GitHub to the first instance, our Primary Instance. We use the same backup set to create two identical databases named BranchDB and CorporateDB.

fixingdata pic1

Fig. 1 SQL Server Version


fixingdata pic2

Fig. 2 BranchDB and CorporateDB on Primary Instance (Secondary Instance Blank)


Listing 1: Restoring WideWorldImporters Sample Database

We now have two instances, the Primary Instance hosting the two Primary databases (BranchDB and CorporateDB and the Secondary instance with no user databases. We proceed with configuring Transaction Log Shipping on both databases but differentiate them by applying a delay to the restore configuration of the first database. Recall that the databases are actually identical in terms of the data they contain. The following graphics show the key options selected in the Log Shipping Configuration.

fixingdata pic3

Fig. 3 Backup Settings for BranchDB


fixingdata pic4

Fig. 4 Copy Settings for BranchDB


fixingdata pic5

Fig. 5 Restore Settings for BranchDB

Each Log Shipping job is configured to run every five minutes. To process “Delay Restoring Backups”, we must use the Standby Recovery mode in the Log Shipping configuration. It is logical since it has the Secondary Database in standby mode and indicates that we can query the Secondary Database whenever a Transaction Log Restore is not ongoing. The value we specify in this option (30 minutes in this case) gives us a good window during which we can run reports off the Secondary Database apart from the core requirement of this article which is being able to recover from user error.

Also, we should mention that the restore of transaction log backups actually is being delayed. Its timestamp is later than the delay value. This means that all transaction log backups will be copied to the secondary server, which is based on the schedule and specified in the Copy Job. In fact, the Restore Job will still run on schedule but transaction log backups (that are not up to 30 minutes old) will not be restored. In essence, the BranchDB Standby database is 30 minutes behind the BranchDB primary database. To demonstrate this lag, in the next section, we shall create a table in both databases and create a job which inserts a record every minute. We will examine this table in the Secondary Databases.

The Settings for the CorporateDB Database are the same as in Figs. 3 to 5, except for the Restore Job which is NOT set to delay transaction log backups.

fixingdata pic6

Fig. 6 Restore Settings for CorporateDB


Verifying the Configuration

Once the configuration is done, we can verify that the configuration is OK and start with observing of its work. The Transaction Log Shipping Report shows us that the Branch DB is indeed lagging the CorporateDB in terms of restores:

fixingdata pic7

Fig. 7a Transaction Log Shipping Report on Primary Server


fixingdata pic8

Fig. 7b Transaction Log Shipping Report on Secondary Server

In addition, you will notice the message below in the Restore Job history for the BranchDB:


fixingdata pic9

Fig. 8 Skipped Transaction Log Restores on Secondary Server


We can go further with this verification by creating a table and using a job to populate this table with rows every minute. The job is a simple way of simulating what an application might be doing to a user table. This can show us the that this lag is definitely shown in user data.


Listing 2 – Create Log Tracker Table

Listing 3 – Create Job to Populate Log Tracker Table


When we query the table on the Primary Databases respectively, we can confirm (by using the RecordTime column) that the rows match in BranchDB and CorporateDB. When we examine the table in the Secondary Databases, in the same way, we see clearly that we do have a 30-minute gap between BranchDB and CorporateDB.

Listing 4 – Querying the Log Tracker Table


fixingdata pic10

Fig. 9 Log Tracker Tables Match in Primary Databases


fixingdata pic11

Fig. 10 Log Tracker Tables Have a ~30-Minutes Gap in Secondary Databases


Recovering from User Error

Now let’s talk about the key benefit of this delay. In the scenario, where a user inadvertently drops a table, we can recover the data quickly from the Secondary Database as long as the Delay period has not elapsed. In this example, we drop the table Sales.Orderlines on BOTH databases and verify that the table no longer exists in BOTH databases.

Listing 5 – Dropping Orderlines Table


fixingdata pic12

Fig. 11 Dropping Table Sales.Orderlines


When we look for the table on the Secondary Server, we find that the table is still available in BOTH databases. Thus, for CorporateDB we have less than five minutes to recover the data. (Fig. 12). But once the next restores Cycle executes, we lose the table in the Corporate DB database. To recover this table, we need to do point-in-time recover using a full backup in a separate environment and then extract this specific table. You will agree that it will take some time. For the BranchDB Orderlines table, we have a little more time and we can recover the table with a single SQL Statement over a Linked Server (see Listing 6).

fixingdata pic13

Fig. 12 Five Minutes Countdown: Table Exists in Both Secondary Databases


fixingdata pic14

Fig. 13 Additional 25 Minutes to Recover the BranchDB Table


Listing 6 – Recover Orderlines Table


fixingdata pic15

Fig. 14 Recover the BranchDB Sales.Orderlines Table

Then we verify the Primary Server (BranchDB Database) that the table is restored.

fixingdata pic16

Fig. 15 Recover the BranchDB Sales.Orderlines Table


SQL Server provides a number of way of recovering from data loss from a variety of root causes – disk failure, corruption, user error etc. Point-in-time recovery from backups is probably the most well-known of these methods. For certain simple cases of user error or the similar case, where one or two objects are lost, usage of Transaction Log Shipping with Delayed Recovery is a good approach to consider. However, it should be noted that a secondary database, which is configured strictly for DR needs, has to be selected for lower RPOs.

Kenneth Igiri

Kenneth Igiri is a Database Administrator with eProcess International S.A., Ecobank Group's Shared Services Centre. Kenneth has over eight years' experience with SQL Server and Oracle databases as well as related technologies. His interests include database performance, HADR, and recently, Cloud.
Also, Kenneth teaches at Children's Church and writes fiction. You can connect with Kenneth via his blog or social network account.
Kenneth Igiri

Latest posts by Kenneth Igiri (see all)