SQL Server Transactional Replication is one of the most common Replication techniques used to share, copy, or distribute data to multiple destinations. In this article, we will discuss Replication, various Replication Types, and pay special attention to the Transactional Replication work.
What is SQL Transactional Replication?
Replication is the SQL Server technology for Copying or Distributing data from one database to another while maintaining Data consistency.
Replication can be used to transfer data from one database to another
- on the same instance or another instance in the same server;
- or across Servers in a single location or multiple locations.
First, we should go through the Replication Architecture and understand the Replication terminologies.
SQL Server Replication Architecture & Terminologies
- The Publisher is the Source Database instance that publishes the data changes that can be distributed to another database. Data from a Single Publisher can be sent to a Single Subscriber or multiple Subscribers.
- Subscriber is the Destination Database Instance where we distribute the data changes captured from the Publisher Database. Subscriber can be either a Publisher Instance or another instance in Publisher Server/ another Server in the same location/distant location. Before the data changes are distributed to the Subscriber Database instance, these data are stored in the Distributor.
- The Distributor is a Database that stores change logs captured from Publisher databases. When a Server is enabled as a Distributor, it will create a system Database named the distribution database.
By the location of distribution databases, they can be classified as either local or remote distributors.
– Local Distributor is a distribution database residing on the Publisher database instance.
– Remote Distributor is a distribution database residing either in the Subscriber database instance or in any other SQL Server instance apart from the Publisher database instance.
The deciding factor is where to place the Distribution database on the Publisher instance (another instance). it depends on the Server resources available to handle the Data Distribution load.
According to the way of how the data will be sent from the Distribution database to the Subscriber instance, it can be classified into either Push or Pull Subscriptions.
– Push Subscription means that the Distribution database takes responsibility to push the data to the Subscriber database instance.
– Pull Subscription means that the Subscriber database instance takes responsibility to pull the available data from the Distribution database and apply it to the Subscriber database.
- Articles are the fundamental unit of Replication. It indicates any data changes on this database object or article that will be replicated from Publisher to Subscriber. The Article can be a Table, View, Indexed View, Stored Procedure, or the User Defined Function.
- Publications are a collection of one or more Articles from the database in Publisher.
- Subscription defines what Publication will be received. Also, it determines from which Publication and on what schedule the data is replicated. Subscription can be either Push or Pull (from Publication to Subscription).
- Replication Agents are standalone programs responsible for tracking changes and distributing data across the Publisher to Distributor and Subscriber. All Replication Agents execute as Jobs under SQL Server Agent. Thus, it can be administered via SSMS under SQL Server Agent Jobs or Replication Monitor. The following types of Replication Agents are available:
- Snapshot Agent – Used by almost all types of Replication. The Snapshot Agent runs from Server holding the distribution database. It prepares the Schema and initial data of all Articles included in a Publication on a Publisher. Also, it creates the Snapshot files in a snapshot folder and records Synchronization details in the Distribution database.
- Log Reader Agent – Used by Transactional Replication. The goal is to read the data changes of articles enabled for Replication from the Publisher Database Transaction Logs and stored to the Distribution Database. the Log Reader Agent runs from the Distributor Server.
- Distribution Agent – Used by Transactional and Snapshot Replication. It applies the initial Snapshot files and incremental or available pending transactions from the Distribution database to the Subscriber database. The Distribution Agent runs from the Distributor Server for Push Subscriptions and the Subscriber Server for Pull Subscriptions.
- Merge Agent – Used by Merge Replication only. It applies the initial snapshot files and reconciliation of differential or incremental changes across the Publisher or Subscriber. The Merge Agent runs on the Distributor Server for Push Replication and from the Subscriber Server for Pull Subscriptions.
- Queue Reader Agent – Queue Reader Agent is used by Transactional Replication with queued update option. It moves changes from Subscriber to Publisher. The Queue Reader Agent runs from the Distributor Server.
- Replication Maintenance Jobs – As explained earlier, all Replication Agents are standalone programs set up while configuring Replication. They run as jobs under SQL Server Agent Jobs. Few significant jobs to be noted are Distribution Clean Up: Distribution, Agent History Clean Up: Distribution, and Expired Subscription Clean Up.
Types of Replication in SQL Server
Now when we know the terminology, let’s dive into the types of Replication.
- Transactional Replication. As the name suggests, every transaction or data change within the transactional scope on Publisher will be sent to Subscriber in near real-time with minor delays depending upon the network bandwidth and server resources. Transactional Replication uses the Log Reader Agent to read the data changes from the Transactional Logs of the Publisher database. It also uses the Distribution Agent to apply changes to Subscriber. Occasionally it may use Snapshot Agent to take initial Snapshot data of all Replicated articles. Transactional Replication Publication may fall under the below categories:
- Standard Transactional Replication – Subscriber is a read-only database from the Transactional Replication perspective. Any changes performed by anyone on the Subscriber database will not be tracked and updated to the Publisher Database. The Standard Transactional Replication is often referred to as Transactional Replication.
- Transactional Replication with Updatable Subscriptions is an Enhancement of Standard Transactional Replication that tracks the data changes taking place on Subscribes. Whenever data changes are initiated on an Updateable Subscription, they will be first propagated to Publisher and then to other Subscribers.
- Peer-to-Peer Replication is an Enhancement of the Standard Transactional Replication. It propagates transactionally consistent changes in near real-time across multiple server instances.
- Bidirectional Replication is an Enhancement of the Standard Transactional Replication that allows two servers (limit to only 2 Servers and hence named Bidirectional) to exchange the data changes across each other with any server acting as a Publisher (to send changes to another Server) or as a Subscriber (to receive changes from another server).
- Merge Replication – Supports capturing data changes that take place across both Publisher and Subscriber and distributes it to the other Server. The Merge Replication requires the ROWGUID column on the Table Articles involved in Merge Replication. It uses Triggers to capture the data changes across Publisher and Subscriber. Also, it delivers the changes across Servers when both Publisher and Subscriber are connected to the network. Merge Replication uses the Merge Agent to replicate the data changes across Publisher and Subscriber.
- Snapshot Replication – As the name indicates, Snapshot Replication doesn’t rely on either Transactional Logs or Triggers to capture the changes. It takes a snapshot of articles involved in Publication and applies it to the Subscriber with the records available at the time of the snapshot. The Snapshot Replication uses the Snapshot Agent to take a snapshot of the Publisher and uses the Distribution Agent to apply these records to Subscriber.
SQL Server Transactional Replication
Transactional Replication is typically preferred in scenarios where the OLTP Publisher database has heavy Data INSERT/UPDATE and/or DELETE activities.
Since the Publisher server instance has huge DISK IO happening, generating Reports may cause severe blockings. It also can impact the Server performance. Hence, another Server with near real-time data is good for offloading the Reporting requirements.
One of the fundamental requirements for Transactional Replication is that the tables replicated should have a Primary Key available.
We can summarize how Transactional Replication works. Have a look at the below Transactional Replication Architecture diagram taken from the official Microsoft documentation.
The Publication is created on the Publisher database that comprises the list of Articles for replicating to the Subscriber database.
Transactional Replication will typically be initialized from Publisher to Distributor via the Snapshot Agent or Full Backups. The Snapshot Agent is supported via Replication Configuration Wizard. The Full Backup is supported via TSQL Statements to initialize the Transactional Replication.
The Log Reader Agent scans the Transactional Log of the Publisher database for tracked Articles. Then it copies the data changes from the Transactional Log to the Distribution database.
The Distribution database can be either in Publisher or Subscriber; it can also be or another independent SQL Server instance.
Note also the following things:
- Log Reader Agent runs continuously from the Distributor Server to scan for new commands marked for Replication. However, if you don’t wish to run continuously and want to run on a schedule instead, we can change the Log Reader Agent SQL Job that will be created out.
- Log Reader Agent picks up all records marked for Replication from the Transactional Log in batches and sends them to the Distribution database.
- Log Reader Agent picks up only Committed transactions from the Transactional Log of Publisher Database. So, any long-running queries on the Publisher database can directly impact Replication as it is waiting for the active transaction to complete.
The Distribution Agent picks up all undistributed new commands from the Distribution database and applies them to the Subscription database via either Push or Pull Mechanism. As mentioned earlier, if Push Subscription Distributor takes ownership to apply the changes from Distribution database to Subscriber whereas in Pull Subscription Subscriber database takes ownership to fetch the changes from Distribution database to Subscriber.
Once the records are distributed successfully from the Distribution to Subscriber database, they will be marked as Distributed and marked for deletion from the Distribution database. One of the Key Replication Maintenance jobs named Distribution Clean Up: Distribution job runs once every 10 minutes to delete the distributed records from the Distribution database to maintain the Distribution database size under control.
With the detailed explanation of concepts of Replication and Transactional Replication, we can get our hands on it by configuring Replication for AdventureWorks database and verifying Replication for every component discussed theoretically.
Configuring Transactional Replication Step by Step (via SSMS GUI)
Transactional Replication configuration involves 3 major steps:
- Configuring Distribution database
- Publication Creation
- Subscription Creation
Before trying to configure Replication, make sure that Replication Components are installed as part of SQL Server installation or use SQL Server media to install Replication Components, as they are necessary for the task.
In SSMS, connect to the Publisher Database Instance and right-click on Replication:
Distribution is not configured right now. Hence, we have the Configure Distribution option. We can either Configure the Distribution database using the Configure Distribution wizard or via the Publication Creation wizard.
To configure the Distribution database and Publication, follow the below steps:
Expand Replication and right-click on New Publication.
New Publication Wizard will launch. Click Next to see the Distributor configuration options.
By default, it chooses the Publisher server to hold the distribution database. If you wish to use a remote distribution database, choose the second option. Click Next.
The next option is for configuring the Snapshot Folder. Change it to the required folder. Otherwise, it will be created on the SQL Server installation folder path by default. Click Next.
Select the Publication Database (here it is AdventureWorks) and click Next.
Choose the Publication Type – Transactional Replication. Click Next.
Choose Articles for this publication. For testing purposes, select all Tables and Views:
Before clicking on Next, expand tables once more to verify some issues.
Some tables are marked by red icons. When we click on those tables, we see the warning indicating that a table can’t be replicated because it doesn’t have a Primary Key, one of the crucial requirements for Transactional Replication. We will later go into more detail. Now, click Next.
A page with Article Issues related to dependencies will appear. Click Next.
The next option is to Filter Table Rows – since we are testing the basic replication, we can ignore it. Click Next.
Configure Snapshot Agent – ignore and click Next.
Agent Settings – click Security Settings to configure account to execute the Snapshot agent and Log Reader Agent under it.
Then, change the Snapshot Agent Process to run under SQL Server Agent Service Account.
Set the Log Reader Agent to Connect to Publisher > By impersonating the Process Account. Click OK.
The Agent Security will get updated.
Thus, we have configured the Distributor and all elements of the Publication like Articles, Snapshot Agent, Log Reader Agent, and Agent Securities. We have almost completed the Publication creation via Wizard.
If you need to study further on TSQL scripts used to create Publication, we can check the Generate a script file to create the Publication option. Otherwise, click Next.
Since I chose to save the file, the wizard allows me to set the Script file path and name. Provide these details and click Next.
The Wizard finally asks for the Publication Name, I’ve named it AdventureWorks_pub with the database name and keywords to indicate it as a publication for easier identification.
Verify all data provided on the Summary page and click Finish.
The Wizard will display the progress in Creating Publication. When it is complete, we’ll see the confirmation. Click Close.
To verify the successful creation of the Distributor (Distribution database), expand the system databases:
To verify the successful creation of Publication, expand Local Publication:
We have configured the Distribution Database and created the Publication database on the AdventureWorks database successfully. Now we can proceed with the Subscription creation
Right-click on the new Publication we have just created and select New Subscriptions:
The New Subscriptions Wizard will appear. To start the process, click Next.
The Publication page asks to ensure that both the Publication and Publisher databases are selected. Click Next.
Set the Distribution Agent to either Push or Pull Subscription. We are going to use the Publisher Server as the Subscriber, and that type won’t have any impact. Hence, we leave the default Push Subscription. Click Next.
Select the Subscribers (database). I’m selecting AdventureWorks_REPL restored from the same AdventureWorks Database backup. Click Next.
Set the Agent Security:
Since I’m going to do everything within a single Server, I’m using the Agent service account.
The next window presents the Distribution Agent Security values already configured. Click Next.
Synchronization Schedule – leave it to the default. Click Next.
Initialize Subscriptions – leave it with the default values. Click Next.
After you provide all the necessary details, you will be able to complete the process of creating the Subscription. Check the Generate Script file… option to study the scripts later and click Next.
Provide the path to save the files, click Next.
Have a look at the summary and check all the configured values. Once verified, click Finish.
The Subscription Creation is completed. Click Close.
Now we can see the Subscription displayed under our Publication.
Configure the Snapshot Agent
Our next step is to work on the Snapshot Agent to send the initial data from Publisher to Subscriber.
Before stepping into it, we need to notice the Replication Monitor. This critical tool is available in SSMS to view the Replication status at various levels, Server Level, Publisher Database level, Subscription level, and Replication Agents level.
Right-click on Replication/Local Publication/Local Subscription/Publication or the Subscription we created to launch the Replication Monitor as shown below:
In the Replication Monitor, expand Publisher Server (RRJ) > Publication ([AdventureWorks]: AdventureWorks_pub) to display the Subscription details. Right-click on Subscription and select View Details.
As we can see, the information about the Initial Snapshot for our publication AdventureWorks_pub is not yet available. We’ll need to execute the Snapshot agent job to send initial data to the Subscriber database.
Keep this window open to see the progress of Snapshot after starting the Snapshot Agent job.
Right-click on Publication > View Snapshot Agent Status:
The agent has never been run message states that we have never executed the Snapshot Agent. Click Start.
While the Snapshot Agent is executing, you can watch the progress:
When all snapshots are created, it will produce the confirmation message:
We can see the Snapshot files created newly in the Snapshot folder for which we provided the path earlier.
After all snapshots are applied by the Distribution Agent to the Subscriber database, it will display the below status in the open Replication Monitor window:
Congrats! We have successfully configured Transactional Replication using Snapshot Agent.
Note: If we have a huge Publisher Database, creating snapshot might take a lot of time. Thus, it is recommended to use the full backup of the Publisher database instead of executing the Snapshot Agent – we’ll cover this issue in subsequent articles.
Verifying Replication Components
Every Replication Components can be verified by both SSMS GUI and TSQL queries. We’ll discuss it in further articles, and here we’ll quickly explain how to view the properties of the below components.
In SSMS, right-click Replication > Publisher Properties > Publication Databases:
To view details about the Publisher, execute the below queries against the distribution database.
USE distribution GO exec sp_helpdistpublisher GO select * from MSpublisher_databases GO
Subscriber info can be obtained with the below query in SSMS.
USE distribution GO exec sp_helpsubscriberinfo GO select * from MSsubscriber_info
In SSMS, right-click Replication > Distributor Properties:
Click on Publishers to display the list of all Publishers using this Distribution database.
In SSMS, we can run the below query to obtain the same details.
USE distribution GO exec sp_helpdistributor GO exec sp_helpdistributiondb GO
Right-click on Publication > Publication Properties > Articles. You will see the list of all Articles available. The properties of Individual articles can be modified by clicking on Article Properties as well.
USE AdventureWorks GO -- To View all articles available under a Publication exec sp_helparticle @publication = 'Adventureworks_pub' GO -- To View all article columns for a particular article available under a Publication exec sp_helparticlecolumns @publication = 'Adventureworks_pub', @article = 'Address' GO USE distribution GO SELECT * from MSArticles
Right-click on Publication > Properties:
In SSMS, we can run the below query to view the Publication properties:
USE AdventureWorks GO exec sp_helppublication GO USE distribution GO SELECT * FROM MSPublications
Right-click on Subscription > Subscription properties:
In SSMS, we can execute the below script to get the Subscription info:
USE AdventureWorks GO exec sp_helpsubscription GO USE distribution GO SELECT * FROM MSsubscriptions GO
Under SQL Server Agent Jobs, we can find the specific Jobs created for all Replication Agents:
In SSMS, we can execute the query to find out which job is the necessary Log Reader Agent Job, Snapshot Agent Job, and Distribution Agent Jobs. Besides, we can see the Distribution Agent Cleanup job and several other jobs related to Replication created while we were setting Publication and Subscriptions internally.
How Log Reader Agent Works
The Log Reader Agent reads all committed data from the Publisher database transactional logs and pushes it to the Distributor database. Even though Microsoft doesn’t provide an official way to read Transactional Logs, there are few undocumented functions like fn_dblog() and fn_dump_dblog() that can read the data from Log Files. However, these functions are undocumented and not covered by Microsoft support. Thus, we won’t explore them further.
How Distribution Agent Delivers the Data Changes to the Subscriber Database
Once the data is written to the Distribution database, we can read how that data is stored in distribution tables. For that, we apply the sp_browsereplcmds procedure – it fetches the records across the MSrepl_commands and MSrepl_transactions tables.
For learning purposes, let’s take a table with 3 columns named Person.ContactType:
The created Subscription will make 3 procedures for every article that is part of Publication in Subscriber database with the below naming conventions:
For the Person.ContactType Table article, we can see the below procedures created in the Subscriber database:
- dbo.sp_MSins_PersonContactType – INSERT new records captured from the Transaction Logs of Publisher database and then propagated to the distribution database.
- dbo.sp_MSupd_PersonContactType – UPDATE changes captured from the Transaction Logs of Publisher database and then propagated to the distribution database.
- dbo.sp_MSdel_PersonContactType – DELETE records captured from Transaction Logs of Publisher database and then propagated to the distribution database.
Script of the dbo.sp_MSins_PersonContactType Procedure
As you can see, it’s a straightforward INSERT statement that comes out of the distribution database:
ALTER procedure [dbo].[sp_MSins_PersonContactType] @c1 int, @c2 nvarchar(50), @c3 datetime as begin insert into [Person].[ContactType] ( [ContactTypeID], [Name], [ModifiedDate] ) values ( @c1, @c2, @c3 ) end GO
Script of the dbo.sp_MSupd_PersonContactType Procedure
The script relies on the Primary Key values to identify the unique record for updating:
ALTER procedure [dbo].[sp_MSupd_PersonContactType] @c1 int = NULL, @c2 nvarchar(50) = NULL, @c3 datetime = NULL, @pkc1 int = NULL, @bitmap binary(1) as begin declare @primarykey_text nvarchar(100) = '' update [Person].[ContactType] set [Name] = case substring(@bitmap,1,1) & 2 when 2 then @c2 else [Name] end, [ModifiedDate] = case substring(@bitmap,1,1) & 4 when 4 then @c3 else [ModifiedDate] end where [ContactTypeID] = @pkc1 if @@rowcount = 0 if @@microsoftversion>0x07320000 Begin if exists (Select * from sys.all_parameters where object_id = OBJECT_ID('sp_MSreplraiserror') and [name] = '@param3') Begin set @primarykey_text = @primarykey_text + '[ContactTypeID] = ' + convert(nvarchar(100),@pkc1,1) exec sp_MSreplraiserror @errorid=20598, @param1=N'[Person].[ContactType]', @param2=@primarykey_text, @param3=13233 End Else exec sp_MSreplraiserror @errorid=20598 End end GO
Script of the dbo.sp_MSdel_PersonContactType Procedure
This script relies on the Primary Key values to identify a unique record for deleting records from the Subscriber:
ALTER procedure [dbo].[sp_MSdel_PersonContactType] @pkc1 int as begin declare @primarykey_text nvarchar(100) = '' delete [Person].[ContactType] where [ContactTypeID] = @pkc1 if @@rowcount = 0 if @@microsoftversion>0x07320000 Begin if exists (Select * from sys.all_parameters where object_id = OBJECT_ID('sp_MSreplraiserror') and [name] = '@param3') Begin set @primarykey_text = @primarykey_text + '[ContactTypeID] = ' + convert(nvarchar(100),@pkc1,1) exec sp_MSreplraiserror @errorid=20598, @param1=N'[Person].[ContactType]', @param2=@primarykey_text, @param3=13234 End Else exec sp_MSreplraiserror @errorid=20598 End end GO
This clearly explains why Transactional Replication enforces the Primary Key as a key requirement for tables to be added as part of Replication.
Now, let’s see the Transactional Replication in action. Let’s change some data in the Publisher database. For simplicity, I’ll take the same Person.ContactType table.
Executing the SELECT statement on the table yields 20 records:
Next, I INSERTed a sample record into the Person.ContactType table:
And, I UPDATE the newly inserted record:
DELETE the newly inserted record from the table:
We need to verify these transactions in Replication using sp_browsereplcmds
Changes from Person.ContactType were captured from the Transactional Logs of Publisher Database (AdventureWorks) and sent to the Distribution database in the same order. Later, it was propagated to the Subscriber Database (AdventureWorks_REPL).
Thanks for reading this long power-packed article! We have gone through a variety of topics, such as:
- Replication Architecture and Terminologies
- SQL Server Replication Types
- SQL Server Transactional Replication in Detail
- SQL Server Transactional Replication Configuration (Default approach)
- SQL Server Transactional Replication Verification
- SQL Server Transactional Replication in action
I hope that you’ve found lots of helpful information in this article. In subsequent parts, we’ll explore troubleshooting various issues that are frequently encountered in Replication, and learn how to handle them more efficiently.