T-SQL SET Operators Part 2: INTERSECT and EXCEPT

Total: 12 Average: 3.7

In my previous article, I explained the basics of set operators, their types, and prerequisites for their use. I also talked about UNION and UNION ALL operators, their usage and differences.

In this article, we’re going to learn the following:

  1. EXCEPT and INTERSECT operators.
  2. Difference between INTERSECT and INNER JOIN.
  3. The detailed explanation of INTERSECT and EXCEPT with an example.

EXCEPT and INTERSECT operators were introduced in SQL Server 2005. Both are set operators used to combine the result sets generated by two queries and retrieve the desired output.

What is the INTERSECT operator

INTERSECT is used to get records common to all data sets retrieved from multiple queries or tables. Here’s a visualization of this:

INTERSECT Diagram

The syntax of the INTERSECT operator is a follows:

SELECT COLUMN1, 
       COLUMN2, 
       COLUMN3, 
       COLUMN4..FROM TABLE1 
INTERSECT
SELECT COLUMN1, 
       COLUMN2, 
       COLUMN3, 
       COLUMN4..FROM TABLE2

What is the EXCEPT operator

EXCEPT is used to retrieve records which are found in one query but not in another query. In other words, it returns records which are unique to one result set. This is what it looks like visualized:

EXCEPT Diagram

The syntax of the EXCEPT operator is as follows:

SELECT COLUMN1, 
       COLUMN2, 
       COLUMN3, 
       COLUMN4..FROM TABLE1 
EXCEPT
SELECT COLUMN1, 
       COLUMN2, 
       COLUMN3, 
       COLUMN4..FROM TABLE2

Let’s create a demo setup to demonstrate how these operators can be used.

Demo Setup

To demonstrate INTERSECT and EXCEPT, I created two tables named Employee and Trainee.

Execute the following query to create these tables:

CREATE TABLE [DBO].[EMPLOYEE] 
  ( 
     [NAME]             [NVARCHAR](250) NOT NULL, 
     [BUSINESSENTITYID] [INT] NOT NULL, 
     [NATIONALIDNUMBER] [NVARCHAR](15) NOT NULL, 
     [LOGINID]          [NVARCHAR](256) NOT NULL, 
     [BIRTHDATE]        [DATE] NOT NULL, 
     [MARITALSTATUS]    [NCHAR](1) NOT NULL, 
     [GENDER]           [NCHAR](1) NOT NULL 
  ) 
ON [PRIMARY] 

CREATE TABLE [DBO].[TRAINEE] 
  ( 
     [NAME]             [NVARCHAR](250) NOT NULL, 
     [BUSINESSENTITYID] [INT] NOT NULL, 
     [NATIONALIDNUMBER] [NVARCHAR](15) NOT NULL, 
     [BIRTHDATE]        [DATE] NOT NULL, 
     [GENDER]           [NCHAR](1) NOT NULL 
  ) 
ON [PRIMARY]

Now, let’s insert some dummy data into the Employee table by executing the following query:

INSERT [DBO].[EMPLOYEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [LOGINID], [BIRTHDATE], [MARITALSTATUS], [GENDER]) VALUES (N'KEN SÁNCHEZ', 1, N'295847284', N'ADVENTURE-WORKS\KEN0', CAST(N'1969-01-29' AS DATE), N'S', N'M')
GO
INSERT [DBO].[EMPLOYEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [LOGINID], [BIRTHDATE], [MARITALSTATUS], [GENDER]) VALUES (N'TERRI DUFFY', 2, N'245797967', N'ADVENTURE-WORKS\TERRI0', CAST(N'1971-08-01' AS DATE), N'S', N'F')
GO
INSERT [DBO].[EMPLOYEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [LOGINID], [BIRTHDATE], [MARITALSTATUS], [GENDER]) VALUES (N'ROBERTO TAMBURELLO', 3, N'509647174', N'ADVENTURE-WORKS\ROBERTO0', CAST(N'1974-11-12' AS DATE), N'M', N'M')
GO
INSERT [DBO].[EMPLOYEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [LOGINID], [BIRTHDATE], [MARITALSTATUS], [GENDER]) VALUES (N'ROB WALTERS', 4, N'112457891', N'ADVENTURE-WORKS\ROB0', CAST(N'1974-12-23' AS DATE), N'S', N'M')
GO
INSERT [DBO].[EMPLOYEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [LOGINID], [BIRTHDATE], [MARITALSTATUS], [GENDER]) VALUES (N'GAIL ERICKSON', 5, N'695256908', N'ADVENTURE-WORKS\GAIL0', CAST(N'1952-09-27' AS DATE), N'M', N'F')
GO
INSERT [DBO].[EMPLOYEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [LOGINID], [BIRTHDATE], [MARITALSTATUS], [GENDER]) VALUES (N'JOSSEF GOLDBERG', 6, N'998320692', N'ADVENTURE-WORKS\JOSSEF0', CAST(N'1959-03-11' AS DATE), N'M', N'M')

Next, we’ll do the same thing for the Trainee table:

INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'JOHN WOOD', 18, N'222969461', CAST(N'1978-03-06' AS DATE), N'M')
GO
INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'MARY DEMPSEY', 19, N'52541318', CAST(N'1978-01-29' AS DATE), N'F')
GO
INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'WANIDA BENSHOOF', 20, N'323403273', CAST(N'1975-03-17' AS DATE), N'F')
GO
INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'KEN SÁNCHEZ', 1, N'295847284', CAST(N'1969-01-29' AS DATE), N'M')
GO
INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'TERRI DUFFY', 2, N'245797967', CAST(N'1971-08-01' AS DATE),  N'F')
GO
INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'ROBERTO TAMBURELLO', 3, N'509647174', CAST(N'1974-11-12' AS DATE), N'M')
GO

Now, let’s use INTERSECT to retrieve the list of employees which are common to both tables. To do that, run the following query:

SELECT NAME, 
       BUSINESSENTITYID, 
       NATIONALIDNUMBER, 
       BIRTHDATE, 
       GENDER 
FROM   EMPLOYEE 
INTERSECT 
SELECT NAME, 
       BUSINESSENTITYID, 
       NATIONALIDNUMBER, 
       BIRTHDATE, 
       GENDER 
FROM   TRAINEE

The output of this query should be as follows:

INTERSECT Query 1

As you can see in the screenshot above, the query has only returned records which are common to both tables.

INNER JOIN vs. INTERSECT

In most cases, INTERSECT and INNER JOIN return the same output, but there are some exceptions. A simple example will help us understand this.

Let’s add some duplicate records to the Trainee table. Execute the following query:

INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'TERRI DUFFY', 2, N'245797967', CAST(N'1971-08-01' AS DATE),  N'F')
GO
INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (N'ROBERTO TAMBURELLO', 3, N'509647174', CAST(N'1974-11-12' AS DATE), N'M')
GO

Now, we’ll try to generate the desired output using INTERSECT.

SELECT NAME,BUSINESSENTITYID,NATIONALIDNUMBER,BIRTHDATE,GENDER FROM EMPLOYEE
INTERSECT
SELECT NAME,BUSINESSENTITYID,NATIONALIDNUMBER,BIRTHDATE,GENDER FROM TRAINEE

This is the output we get:

INTERSECT vs INNER JOIN 1

Now, let’s try using INNER JOIN.

SELECT A.NAME, 
       A.BUSINESSENTITYID, 
       A.NATIONALIDNUMBER, 
       A.BIRTHDATE, 
       A.GENDER 
FROM   EMPLOYEE A 
       INNER JOIN TRAINEE B 
               ON A.NAME = B.NAME

The output we get in this case is as follows:

INTERSECT vs INNER JOIN 2

Now, as you can see on the screenshot above, INNER JOIN retrieves records which are common to both tables. It populates all records from the right table. Therefore, you can see duplicate records.

Now, let’s add the DISTINCT keyword to the INNER JOIN query and look at what this does:

SELECT DISTINCT A.NAME, 
                A.BUSINESSENTITYID, 
                A.NATIONALIDNUMBER, 
                A.BIRTHDATE, 
                A.GENDER 
FROM   EMPLOYEE A 
       INNER JOIN TRAINEE B 
               ON A.NAME = B.NAME

The output should look like this:

INTERSECT vs INNER JOIN 3

As you can see on the screenshot above, duplicate records have been eliminated.

INTERSECT and INNER JOIN treat NULL values differently. For INNER JOIN, two NULL values are different, so there are chances that it will skip them while joining two tables.

On the other hand, INTERSECT treats two NULL values as being the same, so records that have NULL values won’t be eliminated. To understand it better, let’s look at an example.

First, let’s add some NULL values to the Trainee and Employee tables by executing the following query:

INSERT [DBO].[TRAINEE] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER], [BIRTHDATE], [GENDER]) VALUES (NULL, 3, N'509647174', CAST(N'1974-11-12' AS DATE), N'M')
GO

INSERT [DBO].[Employee] ([NAME], [BUSINESSENTITYID], [NATIONALIDNUMBER],[LOGINID], [BIRTHDATE],[MARITALSTATUS], [GENDER]) VALUES (NULL, 3, N'509647174','ADVENTURE-WORKS\TERRI0', CAST(N'1974-11-12' AS DATE),  N'M',N'M')
GO

Now let’s try to retrieve records common to the two tables using INTERSECT and INNER JOIN. You will need to execute the following query:

/*QUERY WITH INTERSECT*/ 
SELECT NAME, 
       BUSINESSENTITYID, 
       NATIONALIDNUMBER, 
       BIRTHDATE, 
       GENDER 
FROM   EMPLOYEE 
INTERSECT 
SELECT NAME, 
       BUSINESSENTITYID, 
       NATIONALIDNUMBER, 
       BIRTHDATE, 
       GENDER 
FROM   TRAINEE 

/*QUERY WITH INNER JOIN*/ 
SELECT A.NAME, 
       A.BUSINESSENTITYID, 
       A.NATIONALIDNUMBER, 
       A.BIRTHDATE, 
       A.GENDER 
FROM   EMPLOYEE A 
       INNER JOIN TRAINEE B 
               ON A.NAME = B.NAME

This is the output we should get as a result:

NULL values

As you can see above, the result set generated by INTERSECT contains NULL values, while INNER JOIN skipped the records that have NULL values.

The EXCEPT Operator

To demonstrate the EXCEPT operator in action, let’s look at a use case. For example, I want to populate the details of female employees from the Employee table. The following query will help us do just that:

SELECT NAME, 
       BUSINESSENTITYID, 
       NATIONALIDNUMBER, 
       BIRTHDATE, 
       GENDER 
FROM   EMPLOYEE 
WHERE  GENDER = 'F' 
EXCEPT 
SELECT NAME, 
       BUSINESSENTITYID, 
       NATIONALIDNUMBER, 
       BIRTHDATE, 
       GENDER 
FROM   EMPLOYEE 
WHERE  GENDER = 'M'

This is the output we get:

EXCEPT Operator

As you can see above, the query populated only the female employees’ details.

You can also populate the result set using a sub-query:

SELECT NAME, 
       BUSINESSENTITYID, 
       NATIONALIDNUMBER, 
       BIRTHDATE, 
       GENDER 
FROM   EMPLOYEE AS M 
WHERE  GENDER = 'F' 
       AND GENDER NOT IN (SELECT GENDER 
                          FROM   EMPLOYEE AS F 
                          WHERE  GENDER = 'M')

Limitations of INTERSECT and EXCEPT

  1. We cannot use EXCEPT and INTERSECT in distributed partitioned view definitions with COMPUTE and COMPUTE BY clauses.
  2. EXCEPT and INTERSECT can be used in Fast forward-only and static cursors.
  3. EXCEPT and INTERSECT can be used in distributed queries, but can only be executed on the local server. You cannot run them on a remote server.

Summary

In this article, I have covered:

  1. The EXCEPT and INTERSECT operators.
  2. The difference between INTERSECT and INNER JOIN.
  3. A detailed explanation of the INTERSECT and EXCEPT operators with an example.
Nisarg Upadhyay