Written by 10:26 Business intelligence, SSIS

Sending Report Data to Excel and SQL Database using Multicast Transformation in SSIS Packages

CodingSight - Sending Report Data to Excel and SQL Database using Multicast Transformation in SSIS Packages

In professional work scenarios, we often need to load data from one report to several destinations, such as Excel files and SQL databases. One of the ways to accomplish this is using a Multicast Transformation in SSIS which accepts the data input and splits it into several outputs, creating logical data copies. As a result, you get several result copies to apply them to different purposes.

The current article clarifies the steps of gathering information, identifying problems, and designing the solution’s initial phase. After that, we’ll test-run the Multicast Transformation.

You may refer to the following articles for more details:

The Case Scenario

There is a company where the internal staff and top management view and track the reported data in the form of Excel files. These files must be updated daily from the reporting databases. The SQL database must log every single update of an Excel sheet by the system. The Excel report files, in their turn, must reside in the common Windows folder accessed by the authorized staff.

Thus, our case suggests that there is one input – the data from the reporting database. But there must be two outputs, the Excel files, and the SQL Database Log updates. The solution will be designed by the Multicast Transformation in an SSIS Package.

Sending Report Data to Excel and SQL Database using Multicast Transformation

The tasks to perform in this scenario are the following:

  • Set up the reporting database.
  • Set up the SQL Log database.
  • Set up an Integration Services Project.
  • Allocate the common Windows folder for the Excel spreadsheets that get updated with time.
  • Get the Excel file with the required reporting columns.

Set up the Reporting Database (ToyStoreDW)

In the ToyStoreDW reporting database, we have a single table called Toy with the following information:

  • ToyId – the particular toy;
  • Name – the name of the toy;
  • Price – the cost of the toy;
  • Stock – the number of toys in the stock;
  • Sold – the total number of orders placed by customers for a particular toy.

Connect to your SQL Server instance and set up the sample database by executing the following T-SQL script against the master database:

-- Create sample database ToysStoreDW
USE MASTER
GO

CREATE DATABASE ToysStoreDW
GO

USE ToysStoreDW

-- Creating a table Toy
CREATE TABLE dbo.Toy
(
 ToyId INT IDENTITY(1,1),
 Name VARCHAR(50) NOT NULL ,
  Price DECIMAL(5,2) NOT NULL,
  Stock INT NOT NULL,
  Sold INT NOT NULL
        CONSTRAINT PK_Toy_ToyId PRIMARY KEY (ToyId)
)
GO

-- Populating table Toy
INSERT INTO dbo.Toy 
(Name, Price, Stock, Sold)
  VALUES 
  ('Colouring Pencils', 5.00, 160, 50),
  ('Activity Books', 10.00, 200, 70),
  ('Programmable Toy Car', 100.00, 30, 40),
  ('Building Blocks', 10.00, 100, 80),
  ('Theme Park Playset', 65.00, 120, 30),
  ('Build a Robot Kit',50.00,200,60),
  ('Flying Toy Car',75.00,50,10),
  ('Science Experiments Set',35.00,100,20),
  ('Indoor Sports Set',20.00,50,25),
  ('Castle Playset',25.00,100,47)

In our case scenario, we need to identify the top 5 bestselling toys. Since this is a reporting database (not a traditional transactional database), the best way is to write the SQL view giving us the required information.

Run the following script against the sample database ToysStoreDW:

-- Creating a SQL view to see top 5 best sellers (toys)
CREATE VIEW Top5ToysView
AS
SELECT TOP 5 * FROM dbo.Toy T
ORDER BY T.Sold DESC;
GO

Now, we need to check it. Run this newly created view against the sample database to see the results:

-- Connect to the reporting database
USE ToysStoreDW

-- View top 5 bestselling toys
SELECT * FROM dbo.Top5ToysView
Sample Database with Top 5 Bestselling Toys

Note: I am using dbForge Studio for SQL Server to write and run scripts. Your output may look different if you work with another tool, such as SQL Server Management Studio. However, the result set will be the same.

Set up the SQL Log Database (ToysStoreDWLog)

This SQL database is going to log the reporting data output results sent to Excel files via SSIS Package. Therefore, we can keep track of those files.

Execute the following T-SQL script:

-- Create sample database ToysStoreDWLog
USE master
GO

CREATE DATABASE ToysStoreDWLog
GO

USE ToysStoreDWLog

-- Creating a log table
CREATE TABLE ReportLog (
  LogId INT IDENTITY (1, 1)
 ,LogDateTime DATETIME2
 ,ToyWithId VARCHAR(150)
  CONSTRAINT PK_LogId PRIMARY KEY (LogId)
)
GO

Now we must ensure that the reporting data is logged correctly once it is sent to the Excel destination. In our example, we focus on the following data sent to Excel:

  1. The date of sending it to the Excel output.
  2. The Toy Id with Name, starting from the Top 1.

Write and execute the following script against the log database (ToyStoreDWLog):

-- Check the strategy to log data

-- Connect ToysStoreDWLog
USE ToysStoreDWLog

DECLARE @LogDate DATETIME = GETDATE() -- Initialise the variable with current date and time 
DECLARE @ToyWithId VARCHAR(150) = '6 Build a Robot Kit' -- Initialise the variable with the top-selling toys information coming from the source


SELECT
  @LogDate AS LogDate -- view the current date as LogDate

-- Insert a record into the Log table as if it was populated through the package 
INSERT INTO dbo.ReportLog (LogDateTime, ToyWithId)
  VALUES (@LogDate, @ToyWithId)

-- View the logged data
SELECT
  RL.LogId
 ,RL.LogDateTime
 ,RL.ToyWithId
FROM dbo.ReportLog RL

The outputs are:

Storing Current Date and Time
Logged Data Check Based on Current Date and Time

Set up an Integration Services Project

Open Visual Studio and create a new Integration Services Project MulticastingExcelSQLSSIS under a new solution called Multicasting Scenario Solution:

Set up an Integration Services Project to Multicast Data

Right-click the package (Package.dtsx) under the SSIS Packages folder and rename it to multicast excel SQL:

Package Renamed

Allocate the Common Windows Folder

Choose the common windows folder carefully, as it has to be accessible by all authorized staff members.

Common Reports Folder to host Excel File that Contains Reporting Data

Get the Excel File with Required Reporting Columns

Create a blank Excel file and a table to hold the reporting data that must be readily available for the SSIS Package.

Create a new Excel file and name it Toys Report Data:

Creating a blank Excel file to load the reporting data from the SSIS Package Multicast Transformation.

Create a table with the following columns: ToyId, Name, Price, Stock, and Sold.

Table to Hold Top Selling Toys Report Data

Congratulations! We have set the scenario up for multicasting the report data to Excel and SQL database destinations.

Now, we are going to use that data to design the Multicast Transformation solution and test-run it.

Create the Source Connection (ToysStoreDW)

The first thing to do is to connect to the source ToysStoreDW database.

Open the Integration Services Project in Visual Studio (SQL Server Data Tools) and double-click the multicast excel SQL package to open it.

Once the Control Flow/Data Flow tab is active, right-click anywhere in the Connection Managers area and click New OLE DB Connection…Then configure it to point to the ToysStoreDW database:

Create the Source Connection

Rename the connection to ToysStoreDW – Source:

Rename the connection to ToysStoreDW

Create the SQL Database Log Connection (Destination 2)

Create another connection pointing to the log database (ToysStoreDWLog). It helps us to track the report data.

Create another connection pointing to the log database (ToysStoreDWLog)

Note: I am using a slightly different provider this time, while the database is hosted locally. So, it is fine to use the Native OLE DB provider or the one used previously.

Test the connection and rename it as ToysStoreDWLog – Destination 2:

Test the connection and rename it as ToysStoreDWLog - Destination 2

Create the Connection for the Excel File to Hold Reporting Data (Destination 1)

This time, create a new connection as shown below:

Creating Excel Connection

Select EXCEL from the list that appears once you click New Connection:

Select EXCEL from the list that appears once you click New Connection

Locate the Excel file we created at the previous stage. This file should contain the desired table structure sheet:

Pointing  to the blank Excel file

Rename it as Report Data in Excel – Destination 1:

Rename it as Report Data in Excel – Destination 1

Creating the Data Flow Task to Add the Required Functionality

We need to create the data flow task in the Control Flow tab to start designing the package according to the requirements.

Drag a Data Flow Task from the SSIS Toolbox and drop it onto the surface of Control Flow. Name it DFT Toys Report Data:

Creating the Data Flow Task to add the required functionality

Remember to keep saving the project! You have to ensure that you won’t lose any progress if your Visual Studio becomes unresponsive for any reason.

Double-click the Data Flow Task (DFT Toys Report Data) to switch to the Data Flow tab.

Bring the OLE DB Source component to the Data Flow designer surface:

Bring the OLE DB Source component to the Data Flow designer surface

Now, double-click the component and select the ToysStoreDWsource database and the Top5ToysView view in the Name of the table or the view drop-down box:

select the ToysStoreDW source database

Rename the component as SRC Top5ToysView:

Rename the component as SRC Top5ToysView

Drag and drop the Derived Column Transformation and insert the following new derived columns values as follows:

Column Name             Expression                                                      Length
ExcelName               Name                                                            50
StringToyId             Name + " (Toy Id: " + (DT_STR,5,1252)ToyId + ")"                71
Drag and drop the Derived Column Transformation and insert the following new derived columns values

Note: We have added the two new derived columns because the Name column in the source table is VARCHAR(), while Excel accepts names like columns in Unicode (NVARCHAR). That’s why we have created the ExcelName derived column to avoid the problem of mismatching data types when transferring data.

The second derived column StringToyId is joining the ToyId with the Name columns, creating a useful piece of information for the log file. The data will be sent to that file as a result of Multicast Transformation.

Create a package-level variable called LogDateTimeof the DateTime type. It is going to store the date of the particular reporting data stream. As bestselling toys may change frequently, even on daily basis, this variable helps us to see what was the list like on a particular date.

Name: LogDateTime
Date type: DateTime
Expression: GETDATE()
Create a package-level variable called LogDateTime of the DateTime type.

However, we need another derived column to turn the Unicode (NVARCHAR) StringToyId (derived column) into the simple string type VARCHAR. It is necessary for sending the data to the SQL Log database because the recipient column in the target database is in VARCHAR (not in Unicode).

We also need to add the LogDateTime derived column to use it for SQL Log Database.

Add another Derived Column Transformation to convert StringToyId derived column into SQLStringToyId. Add the derived LogDateTime column to make it compatible with the SQL log database as follows:

Derived Column Name:
SQLStringToyId
Expression:
(DT,STR,70,1252)StringToyId

Derived Column Name:
LogDateTime
Expression:
@[User::LogDateTime]
Add another Derived Column Transformation to convert StringToyId derived column into SQLStringToyId.

Name this transformation DRV SQLToyIdWithName:

Two Derived Column Transformation

Remember that we could have used Data Conversion Transformation. However, Derived Column Transformation serves the purpose as well.

Add the Multicast Transformation, name it Multicast Excel and SQL Log, and join the last Derived Column Transformation with it:

Adding Multicast Transformation to send data to two destinations

Now, add the following components from the SSIS Toolbox and join our Multicast Transformation with both of them:

  1. Excel Destination (to load data into the Excel file).
  2. OLE DB Destination (to load data into SQL Database Log).
adding components from the SSIS Toolbox and joining Multicast Transformation with both of them

Double-click Excel Destination and point to the desired Excel sheet:

Double-click Excel Destination and point to the desired Excel sheet

Switch to the Mappings section and map StringToyId from the Input Columns with the Name column in the Destinations. The rest of the mappings should be untouched as shown below:

Mapping StringToyId with Name column in Excel file

Next, double-click the OLE DB Destination component. Select the required destination and table:

double-click the OLE DB Destination component

Map the columns so that LogDateTime (derived column) should be mapped with the LogDateTime SQL database table column, and SQLStringToyId (derived column) should be mapped with the ToyWithId column of the SQL log database (ToysStoreDWLog) table (ReportLog):

Configure the properties used to insert data into a relational database using an OLE DB Provider

Rename Excel Destination to DST Excel Report, and OLE DB Destination to DST SQL Log.

Run the SSIS Multicast Transformation Package

Let us run the package to see it in action:

Run the Multicast Transformation Package

Check Both Destinations

Go to the location where the Excel file is stored:

Go to the location where the Excel file is stored

Open the Excel file and see the results:

Open the Excel file and see the results

Now run the following T-SQL script against the ToysStoreDWLog database to view the contents of the table after we ran the Multicast Transformation:

-- Connect to the Log database
USE ToysStoreDWLog

-- Check ReportData table after multicasting data package run 
SELECT * FROM dbo.ReportLog
order BY LogId
SQL Database Log was successfully populated

Congratulations! You have mastered using SSIS Multicast Transformation to send the report data to Excel and log the database. Now you can keep track of what is happening with the report data.

Things to do

Try the following things to improve your skills further:

  1. Rerun the SSIS Package and see results for yourself, as you are going to observe two different instances of the top toys.
  2. Update the Toy table in the source database (ToysStoreDW) by making Colouring Pencils,Castle Playset,and Activity Bookstop 3 bestselling toys (increase their respective Sold column value). Then rerun the package to see the updated results.
  3. Try adding and testing another SQL view to get the Top 5 least-selling toys.
  4. Create another environment to send the report data to multiple destinations based on the sample source database CoalesceUseDB mentioned in this article.
Tags: , , Last modified: September 30, 2021
Close