A Cartesian Product is a product of two sets of elements where each element in one set is multiplied by every other element of the other set. In simple words, if we have the following sets:
their Cartesian Product will be:
As you can see that the element 1 in set A is multiplied with all the elements 3,4 in the set B to become (1,3) and (1,4) and similarly the second element 2 is multiplied with all the elements in set B to from (2,3) and (2,4).
Cartesian Product is also known as Cross Product as the multiplication is applied across all the elements in one set with all the elements of the other set. Actually, the SQL name for Cartesian Product is a Cross Join. There, columns are elements, and rows are sets.
Assume we have two different tables in a SQL database, each has 3 rows. The Cartesian Product will have 9 rows in total.
Applying Cartesian Product (Cross Join) in SQL
You might have to handle disconnected or rather scattered data of examination bundled into a database. You’ll have to refine it further to make it more meaningful and understandable. How would you do? Let it be our case in this part of the article.
An Examination services provider company maintains the data in a SQL database. However, it sources that data from different places, and you have the following two tables:
- Student – it contains information about the registered students.
- Exam Type – the type of examination each student has to take once registered.
The problem is, that these two tables are completely unlinked, and you have to allocate each student all the available exam types without manually entering the data.
The proposed solution is to get the Cartesian Product of the two unlinked tables. This way, they get associated with each other the way the business wants it to see.
Setup Sample Exam Database
Let us set up a sample database called ExamCart. Run the following script against the master database:
-- Create sample database ExamCart USE MASTER GO CREATE DATABASE ExamCart GO USE ExamCart -- Creating a reference table ExamType CREATE TABLE dbo.ExamType ( ExamTypeId INT IDENTITY(1,1), Name VARCHAR(50)NOT NULL, Detail VARCHAR(200)NULL CONSTRAINT PK_ExamType_ExamTypeId PRIMARY KEY (ExamTypeId) ) GO -- Populating (adding rows to the) table ExamType SET IDENTITY_INSERT dbo.ExamType ON GO INSERT INTO dbo.ExamType ( ExamTypeId ,Name ,Detail ) VALUES ( 1 -- ID - INT Primary Key ,'Mid-Term Exam' -- Name - varchar(50) NOT NULL ,'This is Mid-Term Exam' -- Detail - varchar(200) ), ( 2 -- ID - INT Primary Key ,'Final Term Exam' -- Name - varchar(50) NOT NULL ,'This is Final Term Exam' -- Detail - varchar(200) ), ( 3 -- ID - INT Primary Key ,'First Term Exam' -- Name - varchar(50) NOT NULL ,'This is First Term Exam' -- Detail - varchar(200) ); GO SET IDENTITY_INSERT dbo.ExamType OFF GO
Then, the Student table for registered students should be added to the database. We need to do the same. Run the below script:
-- Connect to ExamCart database USE ExamCart -- Creating a reference table Student DROP TABLE if EXISTS dbo.Student CREATE TABLE dbo.Student ( StudentId INT IDENTITY(1,1), Name VARCHAR(50)NOT NULL, Detail VARCHAR(200)NULL CONSTRAINT PK_Student_StudentId PRIMARY KEY (StudentId) ) GO -- Populating (adding rows to the) table Student SET IDENTITY_INSERT dbo.Student ON GO INSERT INTO dbo.Student ( StudentId ,Name ,Detail ) VALUES ( 1 -- ID - INT Primary Key ,'Atif' -- Name - varchar(50) NOT NULL ,'This is Atif' -- Detail - varchar(200) ), ( 2 -- ID - INT Primary Key ,'Sam' -- Name - varchar(50) NOT NULL ,'This is Sam' -- Detail - varchar(200) ), ( 3 -- ID - INT Primary Key ,'Sarah' -- Name - varchar(50) NOT NULL ,'This is Sarah' -- Detail - varchar(200) ); GO SET IDENTITY_INSERT dbo.Student OFF GO
Check the Student table by executing the following script:
-- View Student table SELECT s.StudentId ,s.Name ,s.Detail FROM dbo.Student s
The output is as follows:
Note that I am using dbForge Studio for SQL Server to write and run the T-SQL scripts. The output (display) may look different if you are using a different too. However, query results will remain the same.
Similarly, let us see the contents of the ExamType table as follows:
-- View ExamType table data SELECT et.ExamTypeId, et.Name, et.Detail FROM dbo.ExamType et
The output is:
Applying the Cartesian Product of both tables
Our task is to assign each examination type to each student. To accomplish it, we need a Cartesian Product of both tables.
The following query will do the job:
-- View Cartesian Product of Student and ExamType SELECT s.StudentId,s.Name,s.Detail,et.ExamTypeId,et.Name AS Exam_Name, et.Detail AS Exam_Detail FROM dbo.Student s,dbo.ExamType et
The results are:
Applying Cartesian Product using Cross Join
We’ll achieve the same results using Cross Join:
-- View Product of Student and Exam table rows using Cross Join SELECT et.ExamTypeId,et.Name AS Exam_Name,et.Detail AS Exam_Detail, s.StudentId,s.Name,s.Detail FROM dbo.Student s CROSS JOIN dbo.ExamType et order BY et.ExamTypeId
The results are:
Creating a Cartesian Product – the StudentExam Table
A better way to store results physically is to create a table containing the result of those two tables’ products.
However, we need to create a distinct ID column to get it qualified for a primary key later on according to the standard practice of creating tables in a SQL database.
To create a unique ID, we are going to use ROW_NUMBER() function that assigns a new id starting from 1. This is achieved as follows:
--Creating a Cartesian Product StudentExam Table SELECT ROW_NUMBER() OVER (ORDER BY et.Name) as StudentExamId, et.ExamTypeId,et.Name AS Exam_Name,et.Detail AS Exam_Detail, s.StudentId,s.Name,s.Detail INTO StudentExam FROM dbo.Student s CROSS JOIN dbo.ExamType et
Let us expand the ExamCart databases node to locate the newly created StudentExam table (the Cartesian Product (Cross Join) result):
Remember that the Cross Join/Cartesian Product of two large tables could become a very resource-intensive operation. Thus, your SQL Server may require a lot of power to perform this operation. That can cause performance issues.
To resolve performance issues, you may have to scale up your Server resources (which is often easily possible in a cloud environment), but that may also come up with some more cost. However, it is a safe operation that does the job and can also denormalize the data (a standard practice) in a typical data warehouse business intelligence solution.
Congratulations! You have successfully learned about Cartesian Product in SQL along with the steps to implement it by working out your solution.
Things to Do
Now that you know how to use Cross Join (Cartesian Product) between two tables, try the following exercises to improve your skills:
- Try to add a new table called Subject with two columns SubjectId and Name. Then insert two subjects, SQL and Power BI, and run a Cross Join between Subject and the StudentExam table. You will see some interesting results.
- Set up the sample SQLDevBlogV5 database (refer to Creating Customer-Focused SSRS Reports with Parameters). Cross Join the ExamType table from this article with the Article table to see the results.
- Try applying the Cartesian Product of the Student table from this article with the Trainer table of the sample TechnicalTraining database (refer to Creating and Deploying Multiple Versions of Database through Schema Snapshots).