In my previous article, I explained the basics of set operators, their types, and prerequisites for their use. I also talked about UNION and UNION ALL operators, their usage and differences.
In this article, we’re going to learn the following:
- EXCEPT and INTERSECT operators.
- Difference between INTERSECT and INNER JOIN.
- The detailed explanation of INTERSECT and EXCEPT with an example.
EXCEPT and INTERSECT operators were introduced in SQL Server 2005. Both are set operators used to combine the result sets generated by two queries and retrieve the desired output.
What is the INTERSECT operator
INTERSECT is used to get records common to all data sets retrieved from multiple queries or tables. Here’s a visualization of this:
The syntax of the INTERSECT operator is a follows:
SELECT COLUMN1, COLUMN2, COLUMN3, COLUMN4..FROM TABLE1 INTERSECT SELECT COLUMN1, COLUMN2, COLUMN3, COLUMN4..FROM TABLE2
What is the EXCEPT operator
EXCEPT is used to retrieve records which are found in one query but not in another query. In other words, it returns records which are unique to one result set. This is what it looks like visualized:
The syntax of the EXCEPT operator is as follows:
SELECT COLUMN1, COLUMN2, COLUMN3, COLUMN4..FROM TABLE1 EXCEPT SELECT COLUMN1, COLUMN2, COLUMN3, COLUMN4..FROM TABLE2
Let’s create a demo setup to demonstrate how these operators can be used.
Demo Setup
To demonstrate INTERSECT and EXCEPT, I created two tables named Employee and Trainee.
Execute the following query to create these tables:
CREATE TABLE [DBO].[EMPLOYEE] ( [NAME] [NVARCHAR](250) NOT NULL, [BUSINESSENTITYID] [INT] NOT NULL, [NATIONALIDNUMBER] [NVARCHAR](15) NOT NULL, [LOGINID] [NVARCHAR](256) NOT NULL, [BIRTHDATE] [DATE] NOT NULL, [MARITALSTATUS] [NCHAR](1) NOT NULL, [GENDER] [NCHAR](1) NOT NULL ) ON [PRIMARY] CREATE TABLE [DBO].[TRAINEE] ( [NAME] [NVARCHAR](250) NOT NULL, [BUSINESSENTITYID] [INT] NOT NULL, [NATIONALIDNUMBER] [NVARCHAR](15) NOT NULL, [BIRTHDATE] [DATE] NOT NULL, [GENDER] [NCHAR](1) NOT NULL ) ON [PRIMARY]
Now, let’s insert some dummy data into the Employee table by executing the following query:
INSERT [DBO].[EMPLOYEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [LOGINID], [BIRTHDATE], [MARITALSTATUS], [GENDER]) VALUES (N'KEN SÁNCHEZ', 1, N'295847284', N'ADVENTURE-WORKS\KEN0', CAST(N'1969-01-29' AS DATE), N'S', N'M') GO INSERT [DBO].[EMPLOYEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [LOGINID], [BIRTHDATE], [MARITALSTATUS], [GENDER]) VALUES (N'TERRI DUFFY', 2, N'245797967', N'ADVENTURE-WORKS\TERRI0', CAST(N'1971-08-01' AS DATE), N'S', N'F') GO INSERT [DBO].[EMPLOYEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [LOGINID], [BIRTHDATE], [MARITALSTATUS], [GENDER]) VALUES (N'ROBERTO TAMBURELLO', 3, N'509647174', N'ADVENTURE-WORKS\ROBERTO0', CAST(N'1974-11-12' AS DATE), N'M', N'M') GO INSERT [DBO].[EMPLOYEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [LOGINID], [BIRTHDATE], [MARITALSTATUS], [GENDER]) VALUES (N'ROB WALTERS', 4, N'112457891', N'ADVENTURE-WORKS\ROB0', CAST(N'1974-12-23' AS DATE), N'S', N'M') GO INSERT [DBO].[EMPLOYEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [LOGINID], [BIRTHDATE], [MARITALSTATUS], [GENDER]) VALUES (N'GAIL ERICKSON', 5, N'695256908', N'ADVENTURE-WORKS\GAIL0', CAST(N'1952-09-27' AS DATE), N'M', N'F') GO INSERT [DBO].[EMPLOYEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [LOGINID], [BIRTHDATE], [MARITALSTATUS], [GENDER]) VALUES (N'JOSSEF GOLDBERG', 6, N'998320692', N'ADVENTURE-WORKS\JOSSEF0', CAST(N'1959-03-11' AS DATE), N'M', N'M')
Next, we’ll do the same thing for the Trainee table:
INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'JOHN WOOD', 18, N'222969461', CAST(N'1978-03-06' AS DATE), N'M') GO INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'MARY DEMPSEY', 19, N'52541318', CAST(N'1978-01-29' AS DATE), N'F') GO INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'WANIDA BENSHOOF', 20, N'323403273', CAST(N'1975-03-17' AS DATE), N'F') GO INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'KEN SÁNCHEZ', 1, N'295847284', CAST(N'1969-01-29' AS DATE), N'M') GO INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'TERRI DUFFY', 2, N'245797967', CAST(N'1971-08-01' AS DATE), N'F') GO INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'ROBERTO TAMBURELLO', 3, N'509647174', CAST(N'1974-11-12' AS DATE), N'M') GO
Now, let’s use INTERSECT to retrieve the list of employees which are common to both tables. To do that, run the following query:
SELECT NAME, BUSINESSENTITYID, NATIONALIDNUMBER, BIRTHDATE, GENDER FROM EMPLOYEE INTERSECT SELECT NAME, BUSINESSENTITYID, NATIONALIDNUMBER, BIRTHDATE, GENDER FROM TRAINEE
The output of this query should be as follows:
As you can see in the screenshot above, the query has only returned records which are common to both tables.
INNER JOIN vs. INTERSECT
In most cases, INTERSECT and INNER JOIN return the same output, but there are some exceptions. A simple example will help us understand this.
Let’s add some duplicate records to the Trainee table. Execute the following query:
INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'TERRI DUFFY', 2, N'245797967', CAST(N'1971-08-01' AS DATE), N'F') GO INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'ROBERTO TAMBURELLO', 3, N'509647174', CAST(N'1974-11-12' AS DATE), N'M') GO
Now, we’ll try to generate the desired output using INTERSECT.
SELECT NAME,BUSINESSENTITYID,NATIONALIDNUMBER,BIRTHDATE,GENDER FROM EMPLOYEE INTERSECT SELECT NAME,BUSINESSENTITYID,NATIONALIDNUMBER,BIRTHDATE,GENDER FROM TRAINEE
This is the output we get:
Now, let’s try using INNER JOIN.
SELECT A.NAME, A.BUSINESSENTITYID, A.NATIONALIDNUMBER, A.BIRTHDATE, A.GENDER FROM EMPLOYEE A INNER JOIN TRAINEE B ON A.NAME = B.NAME
The output we get in this case is as follows:
Now, as you can see on the screenshot above, INNER JOIN retrieves records which are common to both tables. It populates all records from the right table. Therefore, you can see duplicate records.
Now, let’s add the DISTINCT keyword to the INNER JOIN query and look at what this does:
SELECT DISTINCT A.NAME, A.BUSINESSENTITYID, A.NATIONALIDNUMBER, A.BIRTHDATE, A.GENDER FROM EMPLOYEE A INNER JOIN TRAINEE B ON A.NAME = B.NAME
The output should look like this:
As you can see on the screenshot above, duplicate records have been eliminated.
INTERSECT and INNER JOIN treat NULL values differently. For INNER JOIN, two NULL values are different, so there are chances that it will skip them while joining two tables.
On the other hand, INTERSECT treats two NULL values as being the same, so records that have NULL values won’t be eliminated. To understand it better, let’s look at an example.
First, let’s add some NULL values to the Trainee and Employee tables by executing the following query:
INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (NULL, 3, N'509647174', CAST(N'1974-11-12' AS DATE), N'M') GO INSERT [DBO].[Employee] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER],[LOGINID], [BIRTHDATE],[MARITALSTATUS], [GENDER]) VALUES (NULL, 3, N'509647174','ADVENTURE-WORKS\TERRI0', CAST(N'1974-11-12' AS DATE), N'M',N'M') GO
Now let’s try to retrieve records common to the two tables using INTERSECT and INNER JOIN. You will need to execute the following query:
/*QUERY WITH INTERSECT*/ SELECT NAME, BUSINESSENTITYID, NATIONALIDNUMBER, BIRTHDATE, GENDER FROM EMPLOYEE INTERSECT SELECT NAME, BUSINESSENTITYID, NATIONALIDNUMBER, BIRTHDATE, GENDER FROM TRAINEE /*QUERY WITH INNER JOIN*/ SELECT A.NAME, A.BUSINESSENTITYID, A.NATIONALIDNUMBER, A.BIRTHDATE, A.GENDER FROM EMPLOYEE A INNER JOIN TRAINEE B ON A.NAME = B.NAME
This is the output we should get as a result:
As you can see above, the result set generated by INTERSECT contains NULL values, while INNER JOIN skipped the records that have NULL values.
The EXCEPT Operator
To demonstrate the EXCEPT operator in action, let’s look at a use case. For example, I want to populate the details of female employees from the Employee table. The following query will help us do just that:
SELECT NAME, BUSINESSENTITYID, NATIONALIDNUMBER, BIRTHDATE, GENDER FROM EMPLOYEE WHERE GENDER = 'F' EXCEPT SELECT NAME, BUSINESSENTITYID, NATIONALIDNUMBER, BIRTHDATE, GENDER FROM EMPLOYEE WHERE GENDER = 'M'
This is the output we get:
As you can see above, the query populated only the female employees’ details.
You can also populate the result set using a sub-query:
SELECT NAME, BUSINESSENTITYID, NATIONALIDNUMBER, BIRTHDATE, GENDER FROM EMPLOYEE AS M WHERE GENDER = 'F' AND GENDER NOT IN (SELECT GENDER FROM EMPLOYEE AS F WHERE GENDER = 'M')
Limitations of INTERSECT and EXCEPT
- We cannot use EXCEPT and INTERSECT in distributed partitioned view definitions with COMPUTE and COMPUTE BY clauses.
- EXCEPT and INTERSECT can be used in Fast forward-only and static cursors.
- EXCEPT and INTERSECT can be used in distributed queries, but can only be executed on the local server. You cannot run them on a remote server.
Summary
In this article, I have covered:
- The EXCEPT and INTERSECT operators.
- The difference between INTERSECT and INNER JOIN.
- A detailed explanation of the INTERSECT and EXCEPT operators with an example.