Twice in the last six months, I have had to get involved in the resolution of an incident associated with active database tables growing to the extent that they caused issues. Such incidents are easy to avoid by designing a mechanism to manage the growth of the core table by moving data periodically from this core table to a history table.
It can be achieved using SQL Agent jobs that can keep the core table small. With scheduled index maintenance and statistics updates, performance will remain top-notch for live transactions hitting this core table.
One consequence of this approach is that business data is now distributed across two or more tables. If the application is not designed to cater to this ab initio, the data experts would need to come up with a reporting approach that combines data from both tables to give a complete picture of the transaction data that has been introduced to the database over time.
One way is to use JOINs based on a key column, another is to use a UNION, assuming a concatenation of data from both tables is desirable.
- Sample data
- The UNION operator
- The UNION ALL operator
- Key differences between UNION and UNION ALL in SQL
- UNION vs UNION ALL
- UNION and UNION ALL in execution plans
- UNION vs UNION ALL in different SQL engines
- JOIN vs UNION: When to use each
- Best practices and tips for using UNION and UNION ALL in SQL
- Conclusion
- FAQ
Sample data
To show how this might be useful in a database, we have created tables – core and history tables – to represent the scenario described in the previous paragraph (see listings 1 and 2).
TranHistory is an exact copy of Tran with an extra column to capture the timestamp when rows are inserted into the history table.
-- Listing 1: Structure of the Tran Table
USE [AU]
GO
CREATE TABLE [dbo].[Tran](
[responseId] [int] IDENTITY(1,1) NOT NULL,
[senderId] [varchar](15) NULL,
[msisdn] [varchar](15) NULL,
[message] [varchar](160) NULL,
[status] [smallint] NULL,
[application] [varchar](1) NULL,
[receivedTime] [datetime2](7) NULL,
[processedTime] [datetime2](7) NULL,
[flag] [smallint] NULL,
[requestDelivery] [smallint] NULL,
[delivered] [smallint] NULL,
[account] [varchar](20) NULL,
[srcTon] [smallint] NULL,
[srcNpi] [smallint] NULL,
[destTon] [smallint] NULL,
[destNpi] [smallint] NULL,
[errorCode] [smallint] NULL,
[messageId] [varchar](1) NULL,
[sequenceNo] [int] NULL,
[retries] [smallint] NULL,
[messagePriority] [int] NULL,
[userId] [varchar](20) NULL,
[bulkId] [varchar](20) NULL
) ON [PRIMARY]
GO
-- Listing 2: Structure of the Tran Table
USE [AU]
GO
CREATE TABLE [dbo].[TranHistory](
[responseId] [int] NOT NULL,
[senderId] [varchar](15) NULL,
[msisdn] [varchar](15) NULL,
[message] [varchar](160) NULL,
[status] [smallint] NULL,
[application] [varchar](1) NULL,
[receivedTime] [datetime2](7) NULL,
[processedTime] [datetime2](7) NULL,
[flag] [smallint] NULL,
[requestDelivery] [smallint] NULL,
[delivered] [smallint] NULL,
[account] [varchar](20) NULL,
[srcTon] [smallint] NULL,
[srcNpi] [smallint] NULL,
[destTon] [smallint] NULL,
[destNpi] [smallint] NULL,
[errorCode] [smallint] NULL,
[messageId] [varchar](1) NULL,
[sequenceNo] [int] NULL,
[retries] [smallint] NULL,
[messagePriority] [int] NULL,
[userId] [varchar](20) NULL,
[bulkId] [varchar](20) NULL,
[archivedTime] [datetime2](7) NOT NULL
) ON [PRIMARY]
GO
Both tables have been populated with data as shown in Figure 1. The Tran table has two rows, and the TranHistory table has four rows.
Assuming the four rows in the TranHistory table are the result of an archival process moving data from the core Tran table to the history table, a query to discover how many transactions have hit the database since inception would require combining both tables. The total number of records across both tables is six.

The UNION operator
A UNION in SQL is an operator that combines data from two or more result sets. Using the UNION operator requires that the result sets involved have the same number of columns. The corresponding columns must also be of the same data type and be retrieved in the same order. Listing 1 shows the structure of an SQL statement with a UNION operator.
-- Listing 3: A UNION Statement
USE [AU]
GO
SELECT [responseId]
,[senderId]
,[msisdn]
,[message]
,[status]
,[application]
,[receivedTime]
,[processedTime]
,[flag]
,[requestDelivery]
,[delivered]
,[srcTon]
,[srcNpi]
,[destTon]
,[destNpi]
,[errorCode]
,[messageId]
,[sequenceNo]
,[retries]
,[messagePriority]
,[bulkId]
FROM [dbo].[Tran]
UNION
SELECT [responseId]
,[senderId]
,[msisdn]
,[message]
,[status]
,[application]
,[receivedTime]
,[processedTime]
,[flag]
,[requestDelivery]
,[delivered]
,[srcTon]
,[srcNpi]
,[destTon]
,[destNpi]
,[errorCode]
,[messageId]
,[sequenceNo]
,[retries]
,[messagePriority]
,[bulkId]
FROM [dbo].[TranHistory]
GO
The UNION operator is designed to return a combined result set from two discrete data sets, such as the two tables we described earlier – Tran and TranHistory. The result is a combination of Tran and TranHistory without duplicates.
To achieve the “without” duplicates part, it is necessary for SQL Server to sort the intermediate results as shown by the Live Query Statistics captured in SQL Server Management Studio (See Fig. 2).

The UNION statement returns four rows as shown in Figure 3.

The UNION ALL operator
The UNION ALL statement differs from the UNION in that the command simply combines the data from two or more tables without removing duplicates. This implies there is no need for sorting, which is an expensive task in the SQL world. This also means the result set is likely to be returned faster because the overhead of sorting is excluded.
It is advisable to use UNION ALL when there is no requirement to sort the results of such a combination of two or more tables or to remove duplicates. By design, there may also be a certainty that there are no duplicates across the two tables, so in such a case, it would be unnecessary to use a UNION.
Listing 4 shows the UNION ALL statement. Notice it is the same as the command “UNION ALL” and the exclusion of the archivedTime column in the TranHistory table. This ensures the sets being concatenated have the same set of columns.
-- Listing 4: A UNION Statement
USE [AU]
GO
SELECT [responseId]
,[senderId]
,[msisdn]
,[message]
,[status]
,[application]
,[receivedTime]
,[processedTime]
,[flag]
,[requestDelivery]
,[delivered]
,[srcTon]
,[srcNpi]
,[destTon]
,[destNpi]
,[errorCode]
,[messageId]
,[sequenceNo]
,[retries]
,[messagePriority]
,[bulkId]
FROM [dbo].[Tran]
UNION ALL
SELECT [responseId]
,[senderId]
,[msisdn]
,[message]
,[status]
,[application]
,[receivedTime]
,[processedTime]
,[flag]
,[requestDelivery]
,[delivered]
,[srcTon]
,[srcNpi]
,[destTon]
,[destNpi]
,[errorCode]
,[messageId]
,[sequenceNo]
,[retries]
,[messagePriority]
,[bulkId]
FROM [dbo].[TranHistory]
GO
Figure 4 shows the result of the UNION ALL statement. We see six rows returned instead of four returned earlier by the UNION command.

Comparing the Live Query Statistics, we see that the sorting is excluded.

Key differences between UNION and UNION ALL in SQL
SQL Server provides powerful tools for managing and querying databases. When working with multiple tables or queries that share similar structures, combining their result sets can be essential. Two primary methods for combining these results are UNION and UNION ALL. This article will explore the key differences between UNION and UNION ALL in SQL Server to help you make informed decisions when managing your data.
- Handling duplicate rows:
UNION:- The UNION operator combines the result sets of two or more SELECT statements and eliminates duplicate rows in the final result set. It implicitly uses the DISTINCT keyword, ensuring that each row in the result set is unique. Useful when you need a combined result set without any duplicate records.
- The UNION ALL operator concatenates the result sets of two or more SELECT statements without eliminating duplicate rows.
- It preserves all records from both result sets, including duplicates.
- Ideal for scenarios where retaining duplicate records is important or when you want to optimize query performance.
- Performance implications:
UNION:- Since UNION removes duplicate rows, it requires additional processing. The performance may be slower, especially when dealing with large data sets, as SQL Server needs to sort and compare records to identify duplicates.
- UNION ALL generally offers better performance, as it does not require the removal of duplicate rows.
- It simply combines the result sets without any additional processing, making it faster than UNION in most cases.
- Syntax and compatibility:
- Both UNION and UNION ALL require that the SELECT statements being combined have the same number of columns, and that the corresponding columns share compatible data types. Although the column names don’t have to match, it’s a good practice to use aliases for clarity and consistency.
- Choosing the appropriate method:
- Assess the requirements of your specific use case to determine whether duplicates need to be eliminated or retained.
- Consider the performance impact of each method, especially when dealing with large datasets. If removing duplicates is not critical, UNION ALL may be a more efficient choice.
UNION vs UNION ALL
While both operators serve to combine result sets, they behave differently in subtle but important ways. A quick comparison makes it easier to decide which one to use in practice:
| Feature | UNION | UNION ALL |
|---|---|---|
| Removes duplicates | Yes | No |
| Performance | Slower (sorting and deduplication) | Faster (no sorting required) |
| Use case | When duplicates must be removed | When speed matters more than uniqueness |
| Result set | Distinct rows only | All rows combined, including duplicates |
As you can see, UNION ensures uniqueness but at the cost of performance, while UNION ALL is faster but preserves duplicates. Choosing the right operator depends on whether the accuracy of distinct records or query efficiency is your priority.
UNION and UNION ALL in execution plans
Beyond the difference in results, UNION and UNION ALL also leave distinct signatures in SQL Server’s execution plans. Looking at them side by side makes the differences clearer.
When you run a UNION, SQL Server inserts a Sort operation into the plan to eliminate duplicates. This extra step can become costly as data volumes grow, since every row must be sorted and compared before producing the final output. You’ll often notice that the Sort operator carries a significant percentage of the total query cost in SSMS.
With UNION ALL, the picture is much simpler. The query optimizer doesn’t need to sort or perform duplicate checks; it simply concatenates the results. The execution plan typically shows only a Concatenation operator, and the overall cost is noticeably lower compared to UNION.
Including execution plan analysis in your workflow helps you make informed decisions not just about the correctness of results, but also about efficiency.
UNION vs UNION ALL in different SQL engines
While UNION and UNION ALL follow the same SQL standard, the way they’re executed can vary a bit depending on the database engine. If you’re switching between systems or migrating queries, it’s helpful to know what happens under the hood.
SQL Server
In SQL Server, the difference between the two operators shows up clearly in the execution plan. A UNION always comes with a Sort step to weed out duplicates, and on large datasets, that sort can dominate the query cost. By contrast, UNION ALL keeps things light: it just concatenates the result sets without any extra checks. Performance tuning here often comes down to making sure indexes are in place so that the sorting step in UNION doesn’t become a bottleneck.
PostgreSQL
PostgreSQL handles things a little differently. When you run a UNION, it may not always sort. Sometimes it will use a HashAggregate strategy to identify duplicates, depending on what it thinks will be faster. UNION ALL, as expected, skips all of that and simply returns everything. One neat aspect of PostgreSQL is that its planner is good at pushing filters into the individual queries before combining them, which makes UNION ALL efficient.
MySQL
MySQL takes a more straightforward approach. Any plain UNION you write is automatically treated as if you had asked for DISTINCT, so duplicates are removed whether you intended it or not. That’s why you need to be explicit with UNION ALL if you actually want to keep every row. Behind the scenes, MySQL often uses a temporary table to do the deduplication, which can slow things down if you’re working with lots of data. With UNION ALL, that extra step disappears, and queries usually run much faster.
So, generally, SQL Server leans on sorting, PostgreSQL sometimes hashes, and MySQL relies on temporary tables. Knowing these differences can save you headaches when optimizing queries across different platforms.
JOIN vs UNION: When to use each
JOINs and UNIONs are also often mentioned together, but they answer very different questions. A JOIN links related data side by side, while a UNION stacks similar data on top of each other.
In practice:
- Use a JOIN when you want to see how two tables relate, for example, pairing each transaction with customer details or product information. The goal is to enrich your view of each row with additional context.
- Use a UNION when you want to treat two sets of records as one, for example, combining live and archived transactions into a single timeline. The goal here is not to add context but to extend the dataset.
A simple way to think about it:
- JOINs make your dataset wider by adding columns.
- UNIONs make your dataset longer by adding rows.
Both approaches are essential, but choosing the right one depends on whether you’re looking to connect related entities or consolidate similar records.
Best practices and tips for using UNION and UNION ALL in SQL
To make the most of UNION and UNION ALL operators in SQL, it’s essential to follow best practices and be aware of common pitfalls. This article provides examples and explanations of best practices and tips for using UNION and UNION ALL.
Ensuring column compatibility
Matching column numbers and data types:
Combining results from Employees and Contractors tables
SELECT EmployeeID, FirstName, LastName, HireDate
FROM Employees
UNION
SELECT ContractorID, FirstName, LastName, StartDate
FROM Contractors;
Ensure that the SELECT statements being combined have the same number of columns, and that corresponding columns share compatible data types. In the example above, both queries have four columns, and the data types of the corresponding columns match.
Using aliases and data type conversions when necessary
Combining results from the Employees and Contractors tables with differing column names and data types.
SELECT EmployeeID AS ID, FirstName, LastName, CAST(HireDate AS DATE) AS StartDate
FROM Employees
UNION
SELECT ContractorID AS ID, FName AS FirstName, LName AS LastName, StartDate
FROM Contractors;
If the column names or data types don’t match, use aliases and data type conversions to ensure compatibility. In this example, we use aliases to standardize column names (FName and LName) and the CAST function to convert HireDate to the same data type as StartDate.
Improving performance
Opting for UNION ALL when duplicates are not a concern:
Combining results from two SELECT statements without eliminating duplicates
SELECT ProductID, ProductName
FROM Products_A
UNION ALL
SELECT ProductID, ProductName
FROM Products_B;
When duplicate rows are not a concern, or you want to optimize query performance, use UNION ALL instead of UNION. In this example, we combine results from the Products_A and Products_B tables without removing duplicates, providing faster query execution.
Limiting result sets using WHERE and LIMIT clauses:
Combining top 10 results from two SELECT statements filtered by a specific condition
SELECT TOP 10 ProductID, ProductName
FROM Products
WHERE CategoryID = 1
UNION
SELECT TOP 10 ProductID, ProductName
FROM Products
WHERE CategoryID = 2;
Use WHERE and LIMIT (or TOP in SQL Server) clauses to limit the result sets being combined, reducing the amount of data processed by UNION or UNION ALL. In this example, we combine the top 10 results from two categories, reducing the size of the result set and improving performance.
Quick checklist for using UNION and UNION ALL
- Ensure both queries return the same number of columns.
Check that data types are compatible across corresponding columns.
Use aliases and type conversions (CAST/CONVERT) when column names or types differ. - Choose UNION ALL when duplicates aren’t a concern and performance matters.
- Apply filters (WHERE) and row limits (TOP, LIMIT) early to reduce data volume before combining.

Common pitfalls and how to avoid them
Mismatched columns:
Pitfall:
Attempting to combine SELECT statements with different numbers of columns or incompatible data types.
How to Avoid:
Verify that the SELECT statements being combined have the same number of columns and compatible data types. Use aliases or data type conversion functions, such as CAST or CONVERT, when necessary.
Incorrect use of UNION or UNION ALL:
Pitfall:
Using UNION when duplicates should be retained or using UNION ALL when duplicates need to be eliminated.
How to Avoid:
Assess the requirements of your specific use case to determine if duplicates should be eliminated or retained. Choose UNION or UNION ALL accordingly.
Conclusion
The UNION and UNION ALL operators are essential tools for combining multiple result sets into a single output. Choosing the right operator depends on your goals: whether you need to remove duplicates or prioritize performance. This article highlighted the conditions necessary for meaningful results and best practices to follow.
In general, UNION ALL is preferred when sorting and uniqueness aren’t required, as it avoids the overhead of deduplication and usually performs faster. Meanwhile, JOINs remain a powerful alternative for combining data based on shared columns, offering a more sophisticated way to enrich your results.
Quick decision guide
| When to use | Recommendation |
|---|---|
| Duplicate removal is essential | Use UNION |
| Performance matters and duplicates aren’t an issue | Use UNION ALL |
Using this guide can help you quickly decide which operator fits your scenario, ensuring both correctness and efficiency in your SQL queries.
SQL coding in SSMS is much faster with such a helpful add-in as SQL Complete.

FAQ
What is the main difference between UNION and UNION ALL in SQL?
The main difference lies in how duplicates are handled. UNION removes duplicate rows from the combined result set, ensuring each row is unique. UNION ALL, on the other hand, preserves all rows, including duplicates, which can make it faster and more efficient for large datasets where uniqueness isn’t required.
Which is faster: UNION or UNION ALL?
UNION ALL is generally faster because it avoids the overhead of sorting and comparing rows to remove duplicates. UNION may be slower, especially with large datasets, since SQL Server (or other engines) needs to perform extra work to ensure all results are distinct.
Can UNION and UNION ALL be used with different data types?
No, both operators require that corresponding columns in each SELECT statement have compatible data types. If the data types differ, you’ll need to use type conversions, such as CAST or CONVERT, to make them compatible before combining the results.



