In professional work scenarios, we often need to load data from one report to several destinations, such as Excel files and SQL databases. One of the ways to accomplish this is using a Multicast Transformation in SSIS which accepts the data input and splits it into several outputs, creating logical data copies. As a result, you get several result copies to apply them to different purposes.
The current article clarifies the steps of gathering information, identifying problems, and designing the solution’s initial phase. After that, we’ll test-run the Multicast Transformation.
You may refer to the following articles for more details:
- How to Set up and Test-Run the Environment to Multicast Data with SSIS
- Learn to Multicast Data using SSIS Packages
The Case Scenario
There is a company where the internal staff and top management view and track the reported data in the form of Excel files. These files must be updated daily from the reporting databases. The SQL database must log every single update of an Excel sheet by the system. The Excel report files, in their turn, must reside in the common Windows folder accessed by the authorized staff.
Thus, our case suggests that there is one input – the data from the reporting database. But there must be two outputs, the Excel files, and the SQL Database Log updates. The solution will be designed by the Multicast Transformation in an SSIS Package.
The tasks to perform in this scenario are the following:
- Set up the reporting database.
- Set up the SQL Log database.
- Set up an Integration Services Project.
- Allocate the common Windows folder for the Excel spreadsheets that get updated with time.
- Get the Excel file with the required reporting columns.
Set up the Reporting Database (ToyStoreDW)
In the ToyStoreDW reporting database, we have a single table called Toy with the following information:
- ToyId – the particular toy;
- Name – the name of the toy;
- Price – the cost of the toy;
- Stock – the number of toys in the stock;
- Sold – the total number of orders placed by customers for a particular toy.
Connect to your SQL Server instance and set up the sample database by executing the following T-SQL script against the master database:
-- Create sample database ToysStoreDW
USE MASTER
GO
CREATE DATABASE ToysStoreDW
GO
USE ToysStoreDW
-- Creating a table Toy
CREATE TABLE dbo.Toy
(
ToyId INT IDENTITY(1,1),
Name VARCHAR(50) NOT NULL ,
Price DECIMAL(5,2) NOT NULL,
Stock INT NOT NULL,
Sold INT NOT NULL
CONSTRAINT PK_Toy_ToyId PRIMARY KEY (ToyId)
)
GO
-- Populating table Toy
INSERT INTO dbo.Toy
(Name, Price, Stock, Sold)
VALUES
('Colouring Pencils', 5.00, 160, 50),
('Activity Books', 10.00, 200, 70),
('Programmable Toy Car', 100.00, 30, 40),
('Building Blocks', 10.00, 100, 80),
('Theme Park Playset', 65.00, 120, 30),
('Build a Robot Kit',50.00,200,60),
('Flying Toy Car',75.00,50,10),
('Science Experiments Set',35.00,100,20),
('Indoor Sports Set',20.00,50,25),
('Castle Playset',25.00,100,47)
In our case scenario, we need to identify the top 5 bestselling toys. Since this is a reporting database (not a traditional transactional database), the best way is to write the SQL view giving us the required information.
Run the following script against the sample database ToysStoreDW:
-- Creating a SQL view to see top 5 best sellers (toys)
CREATE VIEW Top5ToysView
AS
SELECT TOP 5 * FROM dbo.Toy T
ORDER BY T.Sold DESC;
GO
Now, we need to check it. Run this newly created view against the sample database to see the results:
-- Connect to the reporting database
USE ToysStoreDW
-- View top 5 bestselling toys
SELECT * FROM dbo.Top5ToysView
Note: I am using dbForge Studio for SQL Server to write and run scripts. Your output may look different if you work with another tool, such as SQL Server Management Studio. However, the result set will be the same.
Set up the SQL Log Database (ToysStoreDWLog)
This SQL database is going to log the reporting data output results sent to Excel files via SSIS Package. Therefore, we can keep track of those files.
Execute the following T-SQL script:
-- Create sample database ToysStoreDWLog
USE master
GO
CREATE DATABASE ToysStoreDWLog
GO
USE ToysStoreDWLog
-- Creating a log table
CREATE TABLE ReportLog (
LogId INT IDENTITY (1, 1)
,LogDateTime DATETIME2
,ToyWithId VARCHAR(150)
CONSTRAINT PK_LogId PRIMARY KEY (LogId)
)
GO
Now we must ensure that the reporting data is logged correctly once it is sent to the Excel destination. In our example, we focus on the following data sent to Excel:
- The date of sending it to the Excel output.
- The Toy Id with Name, starting from the Top 1.
Write and execute the following script against the log database (ToyStoreDWLog):
-- Check the strategy to log data
-- Connect ToysStoreDWLog
USE ToysStoreDWLog
DECLARE @LogDate DATETIME = GETDATE() -- Initialise the variable with current date and time
DECLARE @ToyWithId VARCHAR(150) = '6 Build a Robot Kit' -- Initialise the variable with the top-selling toys information coming from the source
SELECT
@LogDate AS LogDate -- view the current date as LogDate
-- Insert a record into the Log table as if it was populated through the package
INSERT INTO dbo.ReportLog (LogDateTime, ToyWithId)
VALUES (@LogDate, @ToyWithId)
-- View the logged data
SELECT
RL.LogId
,RL.LogDateTime
,RL.ToyWithId
FROM dbo.ReportLog RL
The outputs are:
Set up an Integration Services Project
Open Visual Studio and create a new Integration Services Project MulticastingExcelSQLSSIS under a new solution called Multicasting Scenario Solution:
Right-click the package (Package.dtsx) under the SSIS Packages folder and rename it to multicast excel SQL:
Allocate the Common Windows Folder
Choose the common windows folder carefully, as it has to be accessible by all authorized staff members.
Get the Excel File with Required Reporting Columns
Create a blank Excel file and a table to hold the reporting data that must be readily available for the SSIS Package.
Create a new Excel file and name it Toys Report Data:
Create a table with the following columns: ToyId, Name, Price, Stock, and Sold.
Congratulations! We have set the scenario up for multicasting the report data to Excel and SQL database destinations.
Now, we are going to use that data to design the Multicast Transformation solution and test-run it.
Create the Source Connection (ToysStoreDW)
The first thing to do is to connect to the source ToysStoreDW database.
Open the Integration Services Project in Visual Studio (SQL Server Data Tools) and double-click the multicast excel SQL package to open it.
Once the Control Flow/Data Flow tab is active, right-click anywhere in the Connection Managers area and click New OLE DB Connection…Then configure it to point to the ToysStoreDW database:
Rename the connection to ToysStoreDW – Source:
Create the SQL Database Log Connection (Destination 2)
Create another connection pointing to the log database (ToysStoreDWLog). It helps us to track the report data.
Note: I am using a slightly different provider this time, while the database is hosted locally. So, it is fine to use the Native OLE DB provider or the one used previously.
Test the connection and rename it as ToysStoreDWLog – Destination 2:
Create the Connection for the Excel File to Hold Reporting Data (Destination 1)
This time, create a new connection as shown below:
Select EXCEL from the list that appears once you click New Connection:
Locate the Excel file we created at the previous stage. This file should contain the desired table structure sheet:
Rename it as Report Data in Excel – Destination 1:
Creating the Data Flow Task to Add the Required Functionality
We need to create the data flow task in the Control Flow tab to start designing the package according to the requirements.
Drag a Data Flow Task from the SSIS Toolbox and drop it onto the surface of Control Flow. Name it DFT Toys Report Data:
Remember to keep saving the project! You have to ensure that you won’t lose any progress if your Visual Studio becomes unresponsive for any reason.
Double-click the Data Flow Task (DFT Toys Report Data) to switch to the Data Flow tab.
Bring the OLE DB Source component to the Data Flow designer surface:
Now, double-click the component and select the ToysStoreDWsource database and the Top5ToysView view in the Name of the table or the view drop-down box:
Rename the component as SRC Top5ToysView:
Drag and drop the Derived Column Transformation and insert the following new derived columns values as follows:
Column Name Expression Length
ExcelName Name 50
StringToyId Name + " (Toy Id: " + (DT_STR,5,1252)ToyId + ")" 71
Note: We have added the two new derived columns because the Name column in the source table is VARCHAR(), while Excel accepts names like columns in Unicode (NVARCHAR). That’s why we have created the ExcelName derived column to avoid the problem of mismatching data types when transferring data.
The second derived column StringToyId is joining the ToyId with the Name columns, creating a useful piece of information for the log file. The data will be sent to that file as a result of Multicast Transformation.
Create a package-level variable called LogDateTimeof the DateTime type. It is going to store the date of the particular reporting data stream. As bestselling toys may change frequently, even on daily basis, this variable helps us to see what was the list like on a particular date.
Name: LogDateTime
Date type: DateTime
Expression: GETDATE()
However, we need another derived column to turn the Unicode (NVARCHAR) StringToyId (derived column) into the simple string type VARCHAR. It is necessary for sending the data to the SQL Log database because the recipient column in the target database is in VARCHAR (not in Unicode).
We also need to add the LogDateTime derived column to use it for SQL Log Database.
Add another Derived Column Transformation to convert StringToyId derived column into SQLStringToyId. Add the derived LogDateTime column to make it compatible with the SQL log database as follows:
Derived Column Name:
SQLStringToyId
Expression:
(DT,STR,70,1252)StringToyId
Derived Column Name:
LogDateTime
Expression:
@[User::LogDateTime]
Name this transformation DRV SQLToyIdWithName:
Remember that we could have used Data Conversion Transformation. However, Derived Column Transformation serves the purpose as well.
Add the Multicast Transformation, name it Multicast Excel and SQL Log, and join the last Derived Column Transformation with it:
Now, add the following components from the SSIS Toolbox and join our Multicast Transformation with both of them:
- Excel Destination (to load data into the Excel file).
- OLE DB Destination (to load data into SQL Database Log).
Double-click Excel Destination and point to the desired Excel sheet:
Switch to the Mappings section and map StringToyId from the Input Columns with the Name column in the Destinations. The rest of the mappings should be untouched as shown below:
Next, double-click the OLE DB Destination component. Select the required destination and table:
Map the columns so that LogDateTime (derived column) should be mapped with the LogDateTime SQL database table column, and SQLStringToyId (derived column) should be mapped with the ToyWithId column of the SQL log database (ToysStoreDWLog) table (ReportLog):
Rename Excel Destination to DST Excel Report, and OLE DB Destination to DST SQL Log.
Explore everything about SSIS components.
Run the SSIS Multicast Transformation Package
Let us run the package to see it in action:
Check Both Destinations
Go to the location where the Excel file is stored:
Open the Excel file and see the results:
Now run the following T-SQL script against the ToysStoreDWLog database to view the contents of the table after we ran the Multicast Transformation:
-- Connect to the Log database
USE ToysStoreDWLog
-- Check ReportData table after multicasting data package run
SELECT * FROM dbo.ReportLog
order BY LogId
Congratulations! You have mastered using SSIS Multicast Transformation to send the report data to Excel and log the database. Now you can keep track of what is happening with the report data.
Things to do
Try the following things to improve your skills further:
- Rerun the SSIS Package and see results for yourself, as you are going to observe two different instances of the top toys.
- Update the Toy table in the source database (ToysStoreDW) by making Colouring Pencils,Castle Playset,and Activity Bookstop 3 bestselling toys (increase their respective Sold column value). Then rerun the package to see the updated results.
- Try adding and testing another SQL view to get the Top 5 least-selling toys.
- Create another environment to send the report data to multiple destinations based on the sample source database CoalesceUseDB mentioned in this article.