Ugly. That’s what unsorted data looks like. We make data easy for the eyes by sorting them. And that’s what SQL ORDER BY is for. Use one or more columns or expressions as a basis to sort data. Then, add ASC or DESC to sort ascending or descending.
The SQL ORDER BY syntax:
ORDER BY <order_by_expression> [ASC | DESC]
The ORDER BY expression can be as simple as a list of columns or expressions. It can also be conditional using a CASE WHEN block.
It’s very flexible.
You can also use paging through OFFSET and FETCH. Specify the number of rows to skip and the rows to show.
But here’s the bad news.
Adding ORDER BY to your queries can slow them down. And some other caveats can make ORDER BY “not working.” You can’t just use them anytime you want, as there might be penalties. So, what do we do?
In this article, we will examine the do’s and don’ts in using ORDER BY. Each item will deal with a problem, and a solution will follow.
Do’s in SQL ORDER BY
1. Index the SQL ORDER BY Column(s)
Indexes are all about quick searches. And having one in the columns you use in the ORDER BY clause can speed up your query.
Let’s start using ORDER BY in a column without an index. We will use the AdventureWorks sample database. Before executing the query below, disable the IX_SalesOrderDetail_ProductID index in the SalesOrderDetail table. Then, press Ctrl-M and run it.
-- Get order details by product and sort them by ProductID USE AdventureWorks GO SET STATISTICS IO ON GO SELECT ProductID ,OrderQty ,UnitPrice ,LineTotal FROM Sales.SalesOrderDetail ORDER BY ProductID SET STATISTICS IO OFF GO
The code above will output the I/O statistics in the Messages tab of SQL Server Management Studio. You will see the execution plan in another tab.
WITHOUT AN INDEX
First, let’s get the logical reads from the STATISTICS IO. Check out Figure 1.
Figure 1. Logical reads using ORDER BY of an unindexed column. (Formatted using statisticsparser.com)
Without the index, the query used 1,313 logical reads. And that WorkTable? It means that SQL Server used TempDB to process the sorting.
But what happened behind the scenes? Let’s inspect the execution plan in Figure 2.
Figure 2. Execution plan of a query using ORDER BY of an unindexed column.
Did you see that Parallelism (Gather Streams) operator? It means SQL Server used more than 1 processor to process this query. The query was heavy enough to require more CPUs.
So, what if SQL Server used TempDB and more processors? It’s bad for a simple query.
WITH AN INDEX
How will it fare if the index is re-enabled? Let’s find out. Rebuild the index IX_SalesOrderDetail_ProductID. Then, re-run the query above.
Check the new logical reads in Figure 3.
Figure 3. New logical reads after rebuilding the index.
This is much better. We cut the number of logical reads by almost half. That means the index made it consume fewer resources. And the WorkTable? It’s gone! No need to use TempDB.
And the execution plan? See Figure 4.
Figure 4. The new execution plan is simpler when the index was rebuilt.
See? The plan is simpler. No need for extra CPUs to sort the same 121,317 rows.
So, the bottom line is: Make sure the columns you use for ORDER BY are indexed.
BUT WHAT IF ADDING AN INDEX IMPACTS WRITE PERFORMANCE?
If that’s the problem, you can dump a portion of the source table to a temporary table or memory-optimized table. Then, index that table. Use the same if more tables are involved. Then, assess the query performance of the option you chose. The faster option will be the winner.
2. Limit the Results with WHERE and OFFSET/FETCH
Let’s use a different query. Let’s say you need to display product information with pictures in an app. Images can make queries even heavier. So, we won’t just check logical reads but also lob logical reads.
Here’s the code.
SET STATISTICS IO ON GO SELECT a.ProductID ,a.Name AS ProductName ,a.ListPrice ,a.Color ,b.Name AS ProductSubcategory ,d.ThumbNailPhoto ,d.LargePhoto FROM Production.Product a INNER JOIN Production.ProductSubcategory b ON a.ProductSubcategoryID = b.ProductSubcategoryID INNER JOIN Production.ProductProductPhoto c ON a.ProductID = c.ProductID INNER JOIN Production.ProductPhoto d ON c.ProductPhotoID = d.ProductPhotoID WHERE b.ProductCategoryID = 1 -- Bikes ORDER BY ProductSubcategory, ProductName, a.Color SET STATISTICS IO OFF GO
This will output 97 bikes with pictures. They are very difficult to browse on a mobile device.
USING MINIMAL WHERE CONDITION WITH NO OFFSET/FETCH
Here’s how much logical reads are needed to retrieve 97 products with pictures. See Figure 5.
Figure 5. The logical reads and lob logical reads when using ORDER BY without OFFSET/FETCH and with minimal WHERE condition. (Note: statisticsparser.com didn’t show the lob logical reads. The screenshot is edited based on the result in SSMS)
667 lob logical reads appeared because of retrieving images in 2 columns. Meanwhile, 590 logical reads were used for the rest.
Here’s the execution plan in Figure 6 so we can compare it later to the better plan.
Figure 6. Execution plan using ORDER BY without OFFSET/FETCH and with minimal WHERE condition.
There’s not much else to say until we see the other execution plan.
USING ADDITIONAL WHERE CONDITION AND OFFSET/FETCH IN ORDER BY
Now, let’s adjust the query to make sure minimal data is returned. Here’s what we are going to do:
- Add a condition on the product subcategory. In the calling app, we can imagine letting the user choose the subcategory too.
- Then, remove the product subcategory in the SELECT list of columns and the ORDER BY list of columns.
- Finally, add OFFSET/FETCH in ORDER BY. Only 10 products will be returned and displayed in the calling app.
Here’s the edited code.
DECLARE @pageNumber TINYINT = 1 DECLARE @noOfRows TINYINT = 10 -- each page will display 10 products at a time SELECT a.ProductID ,a.Name AS ProductName ,a.ListPrice ,a.Color ,d.ThumbNailPhoto FROM Production.Product a INNER JOIN Production.ProductSubcategory b ON a.ProductSubcategoryID = b.ProductSubcategoryID INNER JOIN Production.ProductProductPhoto c ON a.ProductID = c.ProductID INNER JOIN Production.ProductPhoto d ON c.ProductPhotoID = d.ProductPhotoID WHERE b.ProductCategoryID = 1 -- Bikes AND a.ProductSubcategoryID = 2 -- Road Bikes ORDER BY ProductName, a.Color OFFSET (@pageNumber-1)*@noOfRows ROWS FETCH NEXT @noOfRows ROWS ONLY
This code will improve further if you make it into a stored procedure. It will also have parameters like page number and number of rows. The page number indicates what page the user is currently viewing. Improve this further by making the number of rows flexible depending on screen resolution. But that’s another story.
Now, let’s look at the logical reads in Figure 7.
Figure 7. Fewer logical reads after simplifying the query. OFFSET/FETCH is also used in ORDER BY.
Then, compare Figure 7 to Figure 5. The lob logical reads are gone. In addition, the logical reads have a notable decrease as the result set was also decreased from 97 to 10.
But what did SQL Server do behind the scenes? Check out the execution plan in Figure 8.
Figure 8. A simpler execution plan after simplifying the query and adding OFFSET/FETCH in ORDER BY.
Then, compare Figure 8 with Figure 6. Without examining each operator, we can see that this new plan is simpler than the previous one.
The lesson? Simplify your query. Use OFFSET/FETCH whenever possible.
Don’ts in SQL ORDER BY
We’re done with what we need to do when using ORDER BY. This time, let’s focus on what we should avoid.
3. Don’t Use ORDER BY When Sorting by the Clustered Index Key
Because it’s useless.
Let’s show it with an example.
SET STATISTICS IO ON GO -- Using ORDER BY with BusinessEntityID - the primary key SELECT TOP 100 * FROM Person.Person ORDER BY BusinessEntityID; -- Without using ORDER BY at all SELECT TOP 100 * FROM Person.Person; SET STATISTICS IO OFF GO
Then, let’s check the logical reads of both SELECT statements in Figure 9.
Figure 9. 2 queries on the Person table show the same logical reads. One is with ORDER BY, another one without.
Both have 17 logical reads. This is logical because of the same 100 rows returned. But do they have the same plan? Check out Figure 10.
Figure 10. The same plan whether ORDER BY is used or not when sorting by the clustered index key.
Observe the same operators and the same query cost.
But why? When indexing one or more columns into a clustered index, the table will be physically sorted by the clustered index key. So, even if you don’t sort by that key, the result will still be sorted.
Bottom line? Forgive yourself by not using the clustered index key in similar cases using ORDER BY. Save your energy with fewer keystrokes.
4. Don’t Use ORDER BY <string column> When a String Column Contains Numbers
If you sort by a string column containing numbers, don’t expect the sort order like real number types. Otherwise, you’re in for a big surprise.
Here’s an example.
SELECT NationalIDNumber ,JobTitle ,HireDate FROM HumanResources.Employee ORDER BY NationalIDNumber;
Check the output in Figure 11.
Figure 11. Sort order of a string column containing numbers. The numerical value is not followed.
In Figure 11, the lexicographic sort order is followed. So, to fix this, use a CAST to an integer.
SELECT NationalIDNumber ,JobTitle ,HireDate FROM HumanResources.Employee ORDER BY CAST(NationalIDNumber AS INT)
Check out Figure 12 for the fixed output.
Figure 12. CAST to INT fixed the sorting of a string column containing numbers.
So, instead of ORDER BY <string column>, use ORDER BY CAST(<string column> AS INT).
5. Don’t Use SELECT INTO #TempTable with ORDER BY
Your desired sort order will not be guaranteed in the target temporary table. See the official documentation.
Let’s have a modified code from the previous example.
SELECT NationalIDNumber ,JobTitle ,HireDate INTO #temp FROM HumanResources.Employee ORDER BY CAST(NationalIDNumber AS INT); SELECT * FROM #temp;
The only difference from the previous example is the INTO clause. The output will be the same as in Figure 11. We’re back in square 1 even if we CAST the column to INT.
You need to create a temporary table using CREATE TABLE. But include an extra identity column and make it a primary key. Then, INSERT into the temporary table.
Here’s the fixed code.
CREATE TABLE #temp2 ( id INT IDENTITY(1,1) NOT NULL PRIMARY KEY, NationalIDNumber NVARCHAR(15) NOT NULL, JobTitle NVARCHAR(50) NOT NULL, HireDate DATE NOT NULL ) GO INSERT INTO #temp2 (NationalIDNumber, JobTitle, HireDate) SELECT NationalIDNumber ,JobTitle ,HireDate FROM HumanResources.Employee ORDER BY CAST(NationalIDNumber AS INT); SELECT NationalIDNumber ,JobTitle ,HireDate FROM #Temp2;
And the output will be the same as in Figure 12. It works!
Takeaways in Using SQL ORDER BY
We’ve covered the common pitfalls in using SQL ORDER BY. Here’s a summary:
- Index the ORDER BY columns,
- Limit the results with WHERE and OFFSET/FETCH,
- Don’t use ORDER BY when sorting by the clustered index key,
- Don’t use ORDER BY when a string column contains numbers. Instead, CAST the string column to INT first.
- Don’t use SELECT INTO #TempTable with ORDER BY. Instead, create the temporary table first with an extra identity column.
What are your tips and tricks in using ORDER BY? Let us know in the Comments section below. And if you like this post, please share it on your favorite social media platforms.
- Is SQL DISTINCT Good for Removing Duplicates in Results? - September 6, 2021
- SQL BETWEEN-Smart Tips to Scan for a Range of Values - August 13, 2021
- SQL ORDER BY: The 5 Do’s and Don’ts to Sort Data Like a Pro - August 12, 2021