Analysis of MS SQL Server for those who see it for the first time – Part 2

Previous part: Analysis of MS SQL Server for those who see it for the first time – Part 1

We continue to analyze what is happening on our MS SQL Server. In this part, we will look at how to retrieve information about user activity: who is doing what, and how many resources it consumes.

I think the second part will be interesting both for database administrators and for developers who need to understand what is wrong with queries that worked fine on the test server but misbehave in production.

I have divided the tasks of auditing user activity into the following groups:

  1. Analyzing a particular query.
  2. Analyzing the load from an application under specific conditions (for example, when a user clicks a button in a third-party application that works with the database).
  3. Analyzing the current situation on the server.

Let’s consider each of them in detail.

Warning

Performance analysis requires a deep understanding of the structure and operating principles of both the database server and the operating system. That is why reading these articles alone will not make you a professional.

In real systems, the criteria and counters discussed here depend heavily on one another. For example, a high disk load is often caused by a lack of RAM. Taking a few measurements is not, by itself, enough to assess a problem soundly.

The purpose of these articles is to introduce the essentials using simple examples. Do not treat my recommendations as a definitive guide; treat them as training exercises that illustrate the line of reasoning.

I hope you will learn to back up your conclusions about server performance with figures.

Instead of saying "the server slows down", you will provide specific values of specific indicators.

Analyzing a particular query

The first point is quite simple, so let's dwell on it only briefly and consider some less obvious issues.

In addition to query results, SSMS allows retrieving additional information about the query execution:

  • You can obtain the query plan by clicking the "Display Estimated Execution Plan" and "Include Actual Execution Plan" buttons. The difference between them is that the estimated plan is built without executing the query, so the information about the number of processed rows is an estimate. The actual plan contains both estimated and actual data; a strong discrepancy between these values indicates that the statistics are out of date. However, plan analysis is a subject for another article, so we will not go deeper here.
  • We can get measurements of the CPU costs and disk operations performed by the server. To do this, we need to enable the corresponding SET options, either in the 'Query options' dialog box

    or with SET commands directly in the query, for example:
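Something like this (the SELECT is only an illustrative query against the AdventureWorks sample tables that appear in the output below):

```sql
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- any query of interest; this one touches the tables shown in the output below
SELECT p.Name, pd.DocumentNode
FROM Production.Product AS p
    JOIN Production.ProductProductPhoto AS ppp ON ppp.ProductID = p.ProductID
    LEFT JOIN Production.ProductDocument AS pd ON pd.ProductID = p.ProductID;
```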

As a result, we will get data on the time spent on compilation and execution, as well as on the number of disk operations.

Example output:

SQL Server parse and compile time:
   CPU time = 16 ms, elapsed time = 89 ms.

SQL Server Execution Times:
   CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:
   CPU time = 0 ms, elapsed time = 0 ms.

(32 row(s) affected)

Table 'ProductProductPhoto'. Scan count 32, logical reads 96, physical reads 5, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Product'. Scan count 0, logical reads 64, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'ProductDocument'. Scan count 1, logical reads 3, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 15 ms, elapsed time = 35 ms.

I would like to draw your attention to the compilation time, the 96 logical reads, and the 5 physical reads. On the second and subsequent executions of the same query, the number of physical reads may drop and recompilation may not be required. Because of this, a query often runs faster the second time than the first. The reason, as you have probably guessed, is the caching of data and of compiled query plans.

  • The "Include Client Statistics" button displays information on the network exchange, the number of executed operations, and the total execution time, including the costs of network exchange and client-side processing. The example shows that the first execution of the query takes more time:
  • In SSMS 2016, there is the "Include Live Query Statistics" button. It displays a picture similar to the query plan, but with live counts of processed rows that change on the screen while the query is being executed. The picture is very vivid: flashing arrows and running numbers make it immediately clear where the time goes. The button appeared in SSMS 2016, but it works with SQL Server 2014 and later.

To sum up:

  • Check CPU costs using SET STATISTICS TIME ON.
  • Check disk operations using SET STATISTICS IO ON. Remember that a logical read is a read completed from the buffer cache, without physically accessing the disk system; a physical read takes much more time.
  • Evaluate the volume of network traffic using "Include Client Statistics".
  • Analyze the algorithm of query execution using the execution plan: "Include Actual Execution Plan" and "Include Live Query Statistics".

Analyzing the application load

Here we will use SQL Server Profiler. After launching it and connecting to the server, we need to select the events to log. The simplest way is to run profiling with the standard tracing template: on the General tab, in the 'Use the template' field, select 'Standard (default)' and click Run.

A more elaborate way is to add events and filters to, or remove them from, the selected template. These options are found on the second tab of the dialog. To see the full range of available events and columns, select the 'Show All Events' and 'Show All Columns' checkboxes.

We will need the following events:

  • Stored Procedures \ RPC:Completed
  • TSQL \ SQL:BatchCompleted

These events register all external SQL calls to the server; they fire after the query processing is complete. There are similar events that track the start of a call:

  • Stored Procedures \ RPC:Starting
  • TSQL \ SQL:BatchStarting

However, we do not need those, as they contain no information about the server resources spent on the query execution; obviously, such information is only available after execution completes. Thus, the CPU, Reads, and Writes columns of the *Starting events are empty.

The following events may interest us as well, however, we will not enable them so far:

  • Stored Procedures \ SP:Starting (*Completed) monitors internal stored procedure calls, i.e. calls made not by the client but from within the current query or another procedure.
  • Stored Procedures \ SP:StmtStarting (*Completed) tracks the start of each statement within a stored procedure. If the procedure contains a loop, the events for the commands inside the loop fire once per iteration.
  • TSQL \ SQL:StmtStarting (*Completed) monitors the start of each statement within an SQL batch. If your query contains several commands, there will be one event per command.

These events are convenient for monitoring the execution process, for example, when it is not possible to use a debugger.

Choosing columns

Which columns to select is generally clear from their names. We will need the following:

  • TextData, BinaryData contain the query text.
  • CPU, Reads, Writes, Duration display resource consumption data.
  • StartTime, EndTime are the start and finish times of the execution. They are convenient for sorting.

Add other columns based on your preferences.

The "Column Filters…" button opens a dialog box for configuring event filters. If you are interested in the activity of a particular user, you can set a filter by SID or by username. Unfortunately, if the application connects through an application server with a connection pool, tracking a particular user becomes more complicated.

You can use filters to select only heavy queries (Duration > X), queries that cause intensive writes (Writes > Y), selections by query content, and so on.

What else do we need from the profiler? Of course, the execution plan!

It is necessary to add the "Performance \ Showplan XML Statistics Profile" event to the trace. While executing our query, we will get the following picture:

The query text:

The execution plan:

And that is not all

It is possible to save a trace to a file or a database table. Tracing settings can be stored as a personal template for a quick run. You can also run a trace without Profiler at all, using plain T-SQL and the sp_trace_create, sp_trace_setevent, sp_trace_setstatus, and sp_trace_getdata procedures (a sketch is shown below). This approach can be useful, for example, for automatically starting a trace to a file on a schedule. To see how to use these commands, you can take a sneaky peek at Profiler itself: run two traces and, in one of them, track what happens when the second one starts. Just check that there is no filter on the "ApplicationName" column in the trace itself.
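A minimal sketch of a server-side trace created this way (the file path, event, and column choices here are illustrative):

```sql
DECLARE @TraceID INT, @maxsize BIGINT = 50, @on BIT = 1;

-- create a trace that writes to a file (SQL Server appends the .trc extension)
EXEC sp_trace_create @TraceID OUTPUT, 0, N'C:\Traces\demo_trace', @maxsize, NULL;

-- event 12 = SQL:BatchCompleted (event 10 = RPC:Completed is added the same way);
-- columns: 1 = TextData, 13 = Duration, 16 = Reads, 17 = Writes, 18 = CPU
EXEC sp_trace_setevent @TraceID, 12, 1,  @on;
EXEC sp_trace_setevent @TraceID, 12, 13, @on;
EXEC sp_trace_setevent @TraceID, 12, 16, @on;
EXEC sp_trace_setevent @TraceID, 12, 17, @on;
EXEC sp_trace_setevent @TraceID, 12, 18, @on;

EXEC sp_trace_setstatus @TraceID, 1;  -- 1 = start the trace
-- later: sp_trace_setstatus @TraceID, 0 (stop), then @TraceID, 2 (close and delete)
```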

The list of events the profiler can monitor is very large and is not limited to capturing query texts. There are events that track full scans, recompilations, autogrow, deadlocks, and much more.

Analyzing user activity on the server

Different situations happen. A query can hang in 'execution' for a long time, and it is unclear whether it will ever complete. I would like to analyze the problematic query separately; however, we must first determine what that query is. It is useless to catch it with the profiler: we have already missed its starting event, and it is unclear how long to wait for the completion one.

Perhaps it is not a user query that hangs at all; maybe the server itself is actively working on something.

Let’s figure it out

You might have heard about 'Activity Monitor', whose functionality is really rich in the newer SSMS editions. How can it help us? Activity Monitor includes many useful and interesting features, but we will get everything we need from the system views and functions directly. The Monitor itself is useful because you can set the profiler on it and see what queries it performs.

We will need:

  • dm_exec_sessions provides information about the sessions of connected users. For our purposes, the useful fields are those that identify the user (login_name, login_time, host_name, program_name, …) and those with information on spent resources (cpu_time, reads, writes, memory_usage, …).
  • dm_exec_requests provides information about the queries being executed at the moment. It also has many fields; we will consider some of them:
    • session_id is the identifier of the session, for linking to the previous view.
    • start_time is the time when the request started.
    • command is a field that contains the type of the executed command. For user queries, it is select/update/delete/etc.
    • sql_handle, statement_start_offset, statement_end_offset provide information for retrieving the query text: the handle, plus the start and end positions of the part that is currently being executed (relevant when your query contains several commands).
    • plan_handle is the handle of the generated plan.
    • blocking_session_id is the number of the session that caused the block, if there are locks preventing the query from executing.
    • wait_type, wait_time, wait_resource are fields with information about the reason and duration of the wait. For some wait types, for example data locks, the blocked resource is additionally identified by a code.
    • percent_complete is the percentage of completion. Unfortunately, it is only available for commands with clearly predictable progress (for example, backup or restore).
    • cpu_time, reads, writes, logical_reads, granted_query_memory are the resource costs.
  • dm_exec_sql_text(sql_handle | plan_handle) and dm_exec_query_plan(plan_handle) are functions for getting the text and the execution plan of a query. Below, we will consider an example of their use.
  • dm_exec_query_stats provides summary statistics of query execution: the query, the number of its executions, and the volume of resources spent.

Important notes

The above list is just a small part. A complete list of all system views and functions is given in the documentation, and there is also a beautiful diagram showing the links between the main objects.

The query text, its plan, and the execution statistics are stored in the procedure cache. They are available during execution; after that, availability is not guaranteed and depends on the cache load. Yes, the cache can be cleaned manually, and this is sometimes recommended when execution plans have 'flipped out'; still, there are a lot of nuances.
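For reference, a minimal example of clearing it (be careful on production servers):

```sql
DBCC FREEPROCCACHE;                    -- drops the whole plan cache
-- DBCC FREEPROCCACHE (plan_handle);   -- or evict a single plan by its handle
```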

The "command" field carries little meaning for user requests, since we can get the full query text (although that is not as easy as it sounds). However, this field is very important for obtaining information about system processes. As a rule, they perform internal tasks and have no SQL text; for such processes, the information about the command is the only hint about the type of activity.

In the comments to the previous article, there was a question about what the server is busy with when it should not be working at all. The answer will probably lie in the meaning of this field. In my practice, the "command" field always showed something quite understandable for active system processes: autoshrink / autogrow / checkpoint / logwriter / etc.

How to use it

Let's move on to the practical part. I will provide several examples of use; the possibilities are not limited to them, so you can think of your own.

Example 1. Which process consumes CPU/reads/writes/memory

First, let's look at the sessions that consume the most resources, for example, CPU. You can find this information in sys.dm_exec_sessions. However, the data on CPU, reads, and writes is cumulative: each number is a total for the entire time of the connection. Clearly, a user who connected a month ago and has never disconnected will have a higher value; it does not mean that they are overloading the system.

A script with the following algorithm solves this problem:

  1. Take a snapshot and store it in a temporary table.
  2. Wait for some time.
  3. Take a second snapshot.
  4. Compare the results: their difference gives the costs incurred during step 2.
  5. For convenience, the difference can be divided by the duration of step 2 to obtain the average "cost per second".
I use two tables in the script: #tmp for the first snapshot and #tmp1 for the second one. On the first run, the script creates and fills #tmp and #tmp1 one second apart, and then does the rest of its work. On subsequent runs, the script uses the results of the previous execution as the base for comparison, so the duration of step 2 equals the time you waited between script runs.
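A minimal sketch of such a script (the column list is shortened; in practice you may want more fields from sys.dm_exec_sessions):

```sql
-- create the snapshot tables on the first run
IF OBJECT_ID('tempdb..#tmp') IS NULL
    CREATE TABLE #tmp  (session_id INT PRIMARY KEY, login_name NVARCHAR(128),
                        cpu_time INT, reads BIGINT, writes BIGINT, snap_time DATETIME);
IF OBJECT_ID('tempdb..#tmp1') IS NULL
    CREATE TABLE #tmp1 (session_id INT PRIMARY KEY, login_name NVARCHAR(128),
                        cpu_time INT, reads BIGINT, writes BIGINT, snap_time DATETIME);

IF NOT EXISTS (SELECT * FROM #tmp1)
BEGIN
    -- first run: take the base snapshot and pause for one second
    INSERT INTO #tmp
    SELECT session_id, login_name, cpu_time, reads, writes, GETDATE()
    FROM sys.dm_exec_sessions;
    WAITFOR DELAY '00:00:01';
END
ELSE
BEGIN
    -- subsequent runs: the previous second snapshot becomes the new base
    TRUNCATE TABLE #tmp;
    INSERT INTO #tmp SELECT * FROM #tmp1;
END;

-- second snapshot
TRUNCATE TABLE #tmp1;
INSERT INTO #tmp1
SELECT session_id, login_name, cpu_time, reads, writes, GETDATE()
FROM sys.dm_exec_sessions;

-- the difference between snapshots = resources spent during the wait
SELECT  s2.session_id, s2.login_name,
        s2.cpu_time - ISNULL(s1.cpu_time, 0) AS cpu_ms,
        s2.reads    - ISNULL(s1.reads, 0)    AS reads,
        s2.writes   - ISNULL(s1.writes, 0)   AS writes,
        (s2.cpu_time - ISNULL(s1.cpu_time, 0)) * 1.0
            / NULLIF(DATEDIFF(SECOND, s1.snap_time, s2.snap_time), 0) AS cpu_ms_per_sec
FROM #tmp1 AS s2
LEFT JOIN #tmp AS s1 ON s1.session_id = s2.session_id
ORDER BY cpu_ms DESC;
```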

Try executing it, even on the production server. The script creates only 'temporary tables' (visible within the current session and deleted when it closes) and poses no threat.

Those who do not like executing queries in SSMS can wrap it in an application written in their favorite programming language. I will show you how to do this in MS Excel without a single line of code.

In the Data menu, connect to the server. If you are prompted to select a table, select any one. Click Next and Finish until you see the Data Import dialog, where you need to click Properties. In Properties, replace the command type with 'SQL' and insert our modified query into the Command text field.

You will have to modify the query a little bit:

  • Add "SET NOCOUNT ON".
  • Replace the temporary tables with table variables.
  • Use a fixed delay of 1 second; the fields with averaged values are then not needed.

Modified query for Excel
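A sketch of such a query (the same snapshot logic as in Example 1, with table variables and a fixed one-second delay; the column list is shortened):

```sql
SET NOCOUNT ON;

DECLARE @s1 TABLE (session_id INT PRIMARY KEY, login_name NVARCHAR(128),
                   cpu_time INT, reads BIGINT, writes BIGINT);
DECLARE @s2 TABLE (session_id INT PRIMARY KEY, login_name NVARCHAR(128),
                   cpu_time INT, reads BIGINT, writes BIGINT);

INSERT INTO @s1
SELECT session_id, login_name, cpu_time, reads, writes FROM sys.dm_exec_sessions;

WAITFOR DELAY '00:00:01';   -- fixed one-second interval

INSERT INTO @s2
SELECT session_id, login_name, cpu_time, reads, writes FROM sys.dm_exec_sessions;

-- resources spent during the one-second interval
SELECT  s2.session_id, s2.login_name,
        s2.cpu_time - ISNULL(s1.cpu_time, 0) AS cpu_ms,
        s2.reads    - ISNULL(s1.reads, 0)    AS reads,
        s2.writes   - ISNULL(s1.writes, 0)   AS writes
FROM @s2 AS s2
LEFT JOIN @s1 AS s1 ON s1.session_id = s2.session_id
ORDER BY cpu_ms DESC;
```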

Result:

When the data appears in Excel, you can sort it as you need. To update the information, click 'Refresh'. In the workbook settings, you can enable automatic refresh at a specified interval and refresh on opening the file. You can save the file and pass it to your colleagues. Thus, we have created a convenient and simple tool.

Example 2. What does a session spend resources on?

Now let's determine what the problem sessions are actually doing. To do this, we use sys.dm_exec_requests and the functions for retrieving the query text and query plan.

The query and execution plan by the session number
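A sketch of such a script (@sid is the session number to investigate; the PRINT sends the text to the Messages tab, and the two SELECTs return the plans):

```sql
DECLARE @sid INT = 64;  -- put the session number here

DECLARE @sql_handle VARBINARY(64), @plan_handle VARBINARY(64),
        @start_ofs  INT,           @end_ofs     INT;

SELECT  @sql_handle = sql_handle,  @plan_handle = plan_handle,
        @start_ofs  = statement_start_offset,
        @end_ofs    = statement_end_offset
FROM sys.dm_exec_requests
WHERE session_id = @sid;

-- query text goes to the Messages tab (PRINT truncates very long texts)
DECLARE @text NVARCHAR(MAX);
SELECT @text = text FROM sys.dm_exec_sql_text(@sql_handle);
PRINT @text;

-- plan of the whole batch, then of the statement currently being executed
SELECT query_plan FROM sys.dm_exec_query_plan(@plan_handle);
SELECT query_plan
FROM sys.dm_exec_text_query_plan(@plan_handle, @start_ofs, @end_ofs);
```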

Insert the session number into the query and run it. After execution, the plans will be on the Results tab (the first one for the whole query, the second one for the current statement, if the query contains several), and the query text will be on the Messages tab. To view a plan, click the URL-like text in the row; the plan opens in a separate tab. Sometimes the plan opens not in graphical form but as XML text. This can happen when the SSMS version is lower than that of the server. In this case, remove the "Version" and "Build" attributes from the first row of the XML, save the result to a file with the .sqlplan extension, and open it separately. If that does not help, remember that SSMS 2016 is officially available for free on the Microsoft website.

Obviously, the resulting plan will be an estimated one, since the query is still being executed. Still, it is possible to get some execution statistics as well: use the sys.dm_exec_query_stats view with a filter by our handles.

Add this information at the end of the previous query
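A sketch of such an addition, reusing @sql_handle from the script above:

```sql
-- statistics for every statement of this query that has completed at least once
SELECT  qs.execution_count,
        qs.total_worker_time  AS total_cpu_mcs,      -- microseconds
        qs.total_elapsed_time AS total_elapsed_mcs,
        qs.total_logical_reads,
        qs.total_logical_writes,
        SUBSTRING(t.text, qs.statement_start_offset / 2 + 1,
            (CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(t.text)
                  ELSE qs.statement_end_offset END
             - qs.statement_start_offset) / 2 + 1)   AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS t
WHERE qs.sql_handle = @sql_handle;
```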

As a result, we will get information about the statements of the executed query: how many times they were executed and what resources were spent. The statistics are updated only after a statement completes; they are not tied to a particular user but are maintained server-wide, so if different users execute the same query, the statistics are combined for all of them.

Example 3. Can I see all of them?

Let's combine the system views and functions we have considered in one query. It can be useful for evaluating the whole situation.
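A sketch of such a combined query:

```sql
SELECT  s.session_id, s.login_name, s.host_name, s.program_name,
        r.status, r.command, r.blocking_session_id,
        r.wait_type, r.wait_time, r.wait_resource,
        LEFT(t.text, 500) AS query_text           -- first 500 characters only
FROM sys.dm_exec_requests AS r
JOIN sys.dm_exec_sessions AS s ON s.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS t  -- system processes have no text
WHERE s.session_id <> @@SPID                      -- skip our own session
ORDER BY r.cpu_time DESC;
```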

The query outputs a list of active sessions and the texts of their queries. For system processes, there is usually no query, but the command field is filled in. You can also see information about blocks and waits, and you can combine this query with Example 1 to sort by load. Be careful, though: query texts can be large, and retrieving them en masse is resource-intensive and can cause a big increase in traffic. In the example, I limit each query text to its first 500 characters and do not retrieve the plan.

Conclusion

It would be great to get Live Query Statistics for an arbitrary session. According to the manufacturer, collecting these statistics currently requires considerable resources, so it is disabled by default. Enabling it is not a problem, but the additional manipulations complicate the process and reduce its practical benefit.

In this article, we analyzed user activity in several ways: using SSMS features, the profiler, and direct queries to the system views. All these methods allow estimating the cost of executing a query and getting its execution plan. Each of them is suitable for a particular situation, so the best solution is to combine them.

Mikhail Radchenko

Mikhail develops software for power engineering. He has participated in projects as a developer, architect, and team lead. A fan of MS technologies: MS SQL Server, .NET, C#, WPF, etc.