One of the available algorithms to join two tables together in SQL Server is Nested Loops. The nested loops join uses one join input as the outer input table and one as the inner input table. The outer loop iterates the outer input table row by row. The inner loop, executed for each outer row, searches for matching rows in the inner input table.

This is called a naive nested loops join.

If you have an index on join conditions in the inner input table, then it is not necessary to perform an inner loop for each row of the outer table. Instead, you can pass the value from the external table as a search argument, and connect all the returned rows of the inner table to the rows from the outer table.

The search by the inner table is a random access. The SQL Server, starting from version 2005, has the batch sort optimization (do not confuse with the Sort operator in Batch Mode for Columnstore indexes). The purpose of the optimization is to order search keys from the external table before getting data from the internal one. Thus, a random access will be sequential.

The execution plan does not display the batch sort operation as a separate operator. Instead, you can see the Optimized=true property in the Nested Loops operator. If it were possible to see batch sort as a separate operator in the plan, it would look like the following:

In this pseudo-plan, we read data from the non-clustered ix_CustomerID index in the order of this CustomerID index key. Then, we need to perform Key Lookup in the clustered index, as ix_CustomerID is not a covering index. Key Lookup is a clustered index key search operation – a random access. To make it sequential, SQL Server may execute batch sort by the clustered index key.

To learn more about batch sort, please refer to my article Batch Sort and Nested Loops.

This optimization provides a great boost with a sufficient number of rows. You can read more about its test results in the blog OPTIMIZED Nested Loops Joins, created by Craig Freedman, an optimizer developer.

However, if the actual number of rows is less than the expected one, then CPU additional costs to build this sort may hide its benefits, increase CPU consumption and reduce its performance.

Consider this particular example:

The returned result:

I would like to draw your attention to the different order of rows in the output. The server returns rows in the order it process them as we have not explicitly specified ORDER BY. In the first case, we gradually read from the ix_c index. However, to optimize random readings from the clustered index, we filter rows by the clustered SalesOrderID index key. In the second case, there is no sort, as well as reads are done in the CustomerID key order of the nonclustered index ix_c.

Difference from the 2340 trace flag

Despite the fact that the documentation specifies the 2340 trace flag as an equivalent of the DISABLE_OPTIMIZED_NESTED_LOOP hint, it is not actually true.

Consider the following example where I will use the UPDATE STATISTICS … WITH PAGECOUNT undocumented command to deceive the optimizer by saying that the table takes more pages than it actually has. Then, have a look at these queries:

  1. Without any hints (I have added MAXDOP to keep a simple plan);
  3. With the 2340 trace flag.
As a result, we have the following plans:

Nested Loops has the Optimized = false property in all three plans. The fact is that by increasing the table width, we also increase the data access cost. When the cost is high enough, SQL Server may use the explicit Sort operator, rather than the implicit batch sort operator. We can see this in the first query plan.

In the second query, we used the DISABLE_OPTIMIZED_NESTED_LOOP hint that turns off the implicit batch sort. However, it removes an explicit sorting by a separate operator.

In the third plan, we can see that despite adding the 2340 trace flag, the sort operator exists.

Thus, the difference of the hint from the flag is as follows: a hint disables the optimization by transforming a random access into a serial one depending on the fact whether the server implements it with the implicit batch sort or with the separate Sort operator.

P.S. Plans may depend on the equipment. Thus, if you fail to run these queries, try to increase or decrease the SomeData char(200) column size in the dbo.SalesOrder table description.


 The article was translated by Codingsight team with the permission of the author.

Dmitry Pilugin

Dmitry Pilugin

Dmitry Pilugin is a SQL Server DB developer from Russia, Moscow. He works with a SQL Server for more than ten years. Most of the time he was involved as a developer of corporate information systems based on SQL Server data platform. Currently he works as a database developer lead, responsible for the development of production databases in the media research company. He is also an occasional speaker at various community events and tech conferences. Favorite topic to present is Query Processor and any things related to it. Dmitry is a Data Platform Microsoft MVP since 2014.
Dmitry Pilugin