Written by 18:44 Database administration, Tables, Work with data

Performing Data Changes Audit Using Temporal Table

SQL Server 2016 has introduced a feature called ‘System versioned temporal table’. Using normal table, you can retrieve current data; while using a system-versioned temporal table, you can retrieve data which was deleted or updated in the past. To do that, a temporal table will create a history table. The history table will store old data with “start_time” and “end_time”. Which indicates a time period for which the record was active.

Example: If you update a product price from 30 to 50 by querying a normal table, you can retrieve the updated product price which is 50. Using a temporal table, you can retrieve the old value which is 30.

Using temporal tables, one can perform:

  1. Track history of a record: we can review a value of the specific record, which has been changed over the time.
  2. Record-Level recovery: if we deleted a specific record from the table or a record is corrupted, we can retrieve it from the history table.

Temporal tables capture date-time of a record based on the physical dates (Calendar date) of the record update and delete. Currently, it does not support versioning based on the logical dates. For example, if you update the product name using the UPDATE statement at 1:00 PM, than temporal table will maintain the history of the name of the product until 1:00 PM. After that, a new name will be applicable. However, what if the product name change was meant to start from 2:00 PM? This means you must update the statement on time perfectly to make it work and you should have executed the UPDATE statement at 2:00 PM instead of 1:00 PM.

Temporal tables have the following prerequisites:

  1. A primary key must be defined.
  2. Two columns must be defined to record the start time and end time with the datetime2 datatype. These columns are called SYSTEM_TIME columns.

They also have some limitations:

  1. INSTEAD OF triggers and In-Memory OLTP are not allowed.
  2. History tables cannot have any constraint.
  3. Data in history table cannot be modified.

Creating a system-versioned table

The following script will be used to create a simple system-versioned table:

Use DemoDatabase
Go
CREATE TABLE dbo.Prodcuts
	(
	      Product_ID int identity (1,1) primary key
	    , Product_Name varchar (500)
	    , Product_Cost int
	    , Quantity int
	    , Product_Valid_From datetime2 GENERATED ALWAYS AS ROW START NOT NULL
	    , Product_Valid_TO datetime2 GENERATED ALWAYS AS ROW END NOT NULL
	    , PERIOD FOR SYSTEM_TIME (Product_Valid_From,Product_Valid_TO)
	)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE =dbo.Product_Change_History));

In the above script, I have defined HISTORY_TABLE named dbo. Product_Change_History. If you do not specify a name for the history table, SQL Server will automatically create a history table with following structure.

Dbo.MSSQL_TemporalHistoryFor_xxx, where xxx is the object ID.

The temporal table will look as it is shown in the screenshot below:

How will period columns be updated when executing DML statement on Temporal Table?

Whenever we execute insert, update and delete a query on the temporal table, the period columns (SysStartDate and SysEndDate) will be updated.

Insert Query

When we perform the INSERT operation on the temporal table, the system sets the value of the SysStartTime column to the start time of the current transaction and mark the row as open.

Let’s insert some rows in the  ‘Products’ table and review how data is stored in this table.

INSERT INTO prodcuts 
            (product_name, 
             product_cost, 
             quantity) 
VALUES      ( 'Mouse', 
              500, 
              10 ), 
            ( 'Key-Board', 
              200, 
              5 ), 
            ( 'Headset', 
              500, 
              1 ), 
            ( 'Laptop', 
              50000, 
              1 )
 select * from Prodcuts

As shown in the above screenshot,  the value of the ‘Product_Valid_From’ column is ‘2018-04-02 06:55:04.4865670’ which is the row insert date. And the value of the ‘Product_Valid_To’ column is ‘9999-12-31 23:59:59.9999999’, which indicates that the row is open.

Update Query

When we execute any update query on the temporal table, the system will store the previous row values in the history table and set the current transaction time as EndTime and update the current table with a new value. SysStartTime will be the start time of transaction and SysEndTime will be the maximum of 9999-12-31.

Let’s change Product cost of ‘Mouse’ from 500 to 250. We will check the output of ‘Product’.

Begin tran UpdatePrice
Update Prodcuts set Product_cost=200 where Product_name='Mouse'
Commit tran UpdatePrice

select * from Prodcuts where Product_name='Mouse'

As you can see in the above screenshot, a value of the ‘Product_Valid_From’ column has been changed. The new value is the current transaction time (UTC). And the value of the ‘Product_Valid_To’ column is ‘9999-12-31 23:59:59.9999999’, which indicates that the row is open and has updated the price.

Let’s observe the output of the Product_change_history table by querying it.

select * from Product_Change_History where Product_name='Mouse'

As you can see in the above screenshot, a row has been added in Product_change_history table, which has an old version of the row. Value of ‘Product_cost’ is 500, Value of ‘Product_valid_From’ is the time when the record was inserted and value of Product_Valid_To column is when the value of Product_cost column was updated. This row version is considered as closed.

Delete Query

When we delete a record from the temporal table, the system will store the current version of the row in the history table and set the current transaction time as EndTime and delete the record from the current table.

Let’s delete the record of ‘Headset’.

Begin tran DeletePrice
    delete from Prodcuts where product_name='Headset'
Commit tran DeletePrice

Let’s observe the output of Product_change_history table by querying it.

select * from Product_Change_History where Product_name='Headset'

As you can see in the above screenshot, a row has been added in Product_change_history table, which was deleted from the current table. Value of ‘Product_valid_From’ is the time when the record was inserted and value of the Product_Valid_To column is the time when the row was deleted which indicates that the row version is closed.

Auditing data changes for a specific time

To audit the data changes for a specific table, we should perform the time-based analysis of temporal tables. To do that, we must use the ‘FOR SYSTEM_TIME’ clause with below temporal-specific sub-clauses to the query data across the current and history tables. Let me explain the output of queries using different sub-clauses. Below is the setup:

  1. I inserted a product named ‘Flat Washer 8’ with List price 0.00 in the temporal table at 09:02:25 AM.
  2. I changed the List Price at 10:13:56 AM. New Price is 500.00.

AS OF <Date_Time>

This clause will be used to retrieve the state of the records for a given time in the AS OF sub-clause. To understand it, let’s execute several queries:

First, we will execute a query using the AS OF clause with ”SystemTime =10:15:59”.

select Name, ListPrice,rowguid,Product_Valid_From,Product_Valid_TO from DemoDatabase.dbo.tblProduct  FOR system_time as of '2018-04-20 10:15:56
where name ='Flat Washer 8'

Now as you can see in the above screenshot, the query returned one row with the updated value of “ListPrice” and value of Product_Valid_To is the maximum of date.

Let’s execute another query using the AS OF clause with “SystemTime =09:10:56:”.

Now as you can see in the above screenshot, the value of “ListPrice” is 0.00.

From <Start_date_Time> To <End_Date_Time>

This clause will return the rows active between <Start_Date_Time> and <End_Date_Time>. To understand it, let execute the following  query using the From..To sub clause with “SystemTime From ‘2018-04-20 09:02:25’ to ‘2018-04-20 10:14:56‘”.

select Name, ListPrice,rowguid,Product_Valid_From,Product_Valid_TO,ListPrice from DemoDatabase.dbo.tblProduct  FOR system_time from '2018-04-20 09:02:25 to '2018-04-20 10:13:56 where name =  'Flat Washer 8'

The following screenshot shows the query result:

BETWEEN <Start_Date_Time> And <End_Date_Time>

This clause is similar to the FROM..To clause. The only difference is that it will include the records that were active at <End_Date_Time>. To understand it, let’s execute the following query:

select Name, ListPrice,rowguid,Product_Valid_From,Product_Valid_TO,ListPrice from DemoDatabase.dbo.tblProduct  FOR system_time between '2018-04-20 09:02:25.1265684' and '2018-04-20 10:13:56.1265684' where name =  'Flat Washer 8'

The following screenshot shows the query result:

Included IN (<Start_Date_Time>, <End_Date_Time>)

This sub-clause will include the records which became active and ended within the specified date range. It does not include the active records. To understand it, execute below query using “Contained IN ‘2018-04-20 09:02:25to ‘2018-04-20 10:14:56’

select Name, ListPrice,rowguid,Product_Valid_From,Product_Valid_TO,ListPrice from DemoDatabase.dbo.tblProduct  FOR system_time Contained IN( '2018-04-20 09:02:25' , '2018-04-20 10:13:56 ') where name =  'Flat Washer 8'

The following screenshot shows the query result:

Scenario

An organization is using an inventory software. That inventory software uses a product table which is a system version temporal table. Due to an application bug, few products were deleted, and prices of the products were also wrongly updated.

As DBAs, we must investigate this issue and recover the data which was wrongly updated and deleted from the table.

To simulate the above scenario, let’s create a table with some meaningful data. I am going to create a new temporal table named ‘tblProduct’ on the Demo database which is a clone of the [Production].[Products] table of the AdventureWorks2014 database.

To perform above task, I have followed below steps:

  1. Extracted “create table script” [Production]. [Products] from the AdventureWorks2014 database.
  2. Removed all “constraints and indexes” from the script.
  3. Kept the column structure unchanged.
  4. To convert it to a temporal table, I  added SysStartTime and SysEndTime columns.
  5. Enabled System_Versioning.
  6. Specified history table.
  7. Executed the script on th edemo database.

Below is the  script:

USE [DemoDatabase]
GO
CREATE TABLE [tblProduct](
	[ProductID] [int] IDENTITY(1,1) Primary Key,
	[Name] varchar(500) NOT NULL,
	[ProductNumber] [nvarchar](25) NOT NULL,
	[Color] [nvarchar](15) NULL,
	[SafetyStockLevel] [smallint] NOT NULL,
	[ReorderPoint] [smallint] NOT NULL,
	[StandardCost] [money] NOT NULL,
	[ListPrice] [money] NOT NULL,
	[Size] [nvarchar](5) NULL,
	[SizeUnitMeasureCode] [nchar](3) NULL,
	[WeightUnitMeasureCode] [nchar](3) NULL,
	[Weight] [decimal](8, 2) NULL,
	[DaysToManufacture] [int] NOT NULL,
	[ProductLine] [nchar](2) NULL,
	[Class] [nchar](2) NULL,
	[Style] [nchar](2) NULL,
	[ProductSubcategoryID] [int] NULL,
	[ProductModelID] [int] NULL,
	[SellStartDate] [datetime] NOT NULL,
	[SellEndDate] [datetime] NULL,
	[DiscontinuedDate] [datetime] NULL,
	[rowguid] [uniqueidentifier] ROWGUIDCOL  NOT NULL,
	[ModifiedDate] [datetime] NOT NULL,
	Product_Valid_From datetime2 GENERATED ALWAYS AS ROW START NOT NULL
    , Product_Valid_TO datetime2 GENERATED ALWAYS AS ROW END NOT NULL
    , PERIOD FOR SYSTEM_TIME (Product_Valid_From,Product_Valid_TO)
 )
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE =dbo.Product_History));
GO

I have imported data from product table of the “AdventureWorks2014” database to the product table of “DemoDatabase” by executing the following script:

insert into DemoDatabase.dbo.tblProduct
(Name
,ProductNumber
,Color
,SafetyStockLevel
,ReorderPoint
,StandardCost
,ListPrice
,Size
,SizeUnitMeasureCode
,WeightUnitMeasureCode
,Weight
,DaysToManufacture
,ProductLine
,Class
,Style
,ProductSubcategoryID
,ProductModelID
,SellStartDate
,SellEndDate
,DiscontinuedDate
,rowguid
,ModifiedDate)
select top 50
Name
,ProductNumber
,Color
,SafetyStockLevel
,ReorderPoint
,StandardCost
,ListPrice
,Size
,SizeUnitMeasureCode
,WeightUnitMeasureCode
,Weight
,DaysToManufacture
,ProductLine
,Class
,Style
,ProductSubcategoryID
,ProductModelID
,SellStartDate
,SellEndDate
,DiscontinuedDate
,rowguid
,ModifiedDate
from AdventureWorks2014.Production.Product

I deleted the product name records that start with ‘Thin-Jam Hex Nut’ from tblProduct. I also changed the price of the products whtihc names  start with Flat Washer on ‘tblProduct’ table by executing the following query:

delete from DemoDatabase.dbo.Product where name like '%Thin-Jam Hex Nut%'
waitfor delay '00:01:00'
update DemoDatabase.dbo.tblProduct set ListPrice=500.00 where name like '%Flat Washer%'

We are aware of the time when data was deleted. Hence to identify, what data has been deleted, we will use Contained-IN sub-clause. As I mentioned above, it will give me the list of records that has row-versions which became active and ended within the specified date range. Then, executed below query:

declare @StartDateTime datetime
declare @EndDateTime datetime
set @StartDateTime=convert (datetime2, getdate()-1)
set @EndDateTime=convert (datetime2, getdate())
select ProductID, Name, ProductNumber,Product_Valid_From, Product_Valid_To from Product For SYSTEM_TIME Contained IN ( @StartDateTime , @EndDateTime)

By executing the above query, 22 rows have been retrieved.

The Contained-IN clause will populate the rows which were updated and deleted during the given time.

Populate Deleted records:

To populate the deleted records, we must skip the records which were updated during the time specified in Contained-IN clause. In below script, the “Where” clause will skip the products which are present in the tblProduct table. We will execute the following query:

declare @StartDateTime datetime
declare @EndDateTime datetime
set @StartDateTime=convert(datetime2,getdate()-1)
set @EndDateTime=convert(datetime2,getdate())

select ProductID, Name, ProductNumber,Product_Valid_From, Product_Valid_To from tblProduct For SYSTEM_TIME Contained IN ( @StartDateTime , @EndDateTime) Where Name not in (Select Name from tblProduct)

The above query has skipped the records which have been updated; hence it returned 13 rows. See below screenshot:

By using the above method, we will be able to get the list of products which have been deleted from the tblProduct table.

Populate Updated records

To populate the updated records, we must skip the records which were deleted during the time specified in the Contained-IN clause. In below script, the “Where” clause will include the products which are present in the tblProduct table. We will execute the following query:

 declare @StartDateTime datetime
declare @EndDateTime datetime
set @StartDateTime=convert(datetime2,getdate()-1)
set @EndDateTime=convert(datetime2,getdate())
select ProductID, Name, ProductNumber,Product_Valid_From, Product_Valid_To from tblProduct For SYSTEM_TIME Contained IN ( @StartDateTime , @EndDateTime) Where Name in (Select Name from tblProduct)

The above query has skipped the records which have been updated hence it returned 9 rows. See below screenshot:

Using above method, we will be able to identify the records which have been updated with wrong values and the the records which have been deleted from the temporal table.

Summary

In this article, I have covered:

  1. High-Level introduction of temporal tables.
  2. Explained, how period columns will be updated by executing DML queries.
  3. A demo to retrieve the list of products which has been deleted and updated with the wrong price, from the temporal table. This report can be used for auditing purpose.
Tags: , Last modified: September 22, 2021
Close