One of the key skills you need as a database administrator or an IT person is generally the ability to monitor systems very carefully. Lack of this key skill can lead to misdiagnosis when troubleshooting issues. SQL Server exposes a number of tools that can aid the DBA in troubleshooting problems that occur in production. The SQL Server Error Log and the SQL Server Agent Log are two of the most important facilities for troubleshooting SQL Server. In this article, we shall explore the ways we can manipulate the Server and Agent logs.
By default, SQL Server keeps the last six Error Logs and the last nine Agent logs in the instance. By default, a new log file is generated each time the instance of the agent is restarted. The number of log files kept can be modified for the SQL Server logs using the statement of the equivalent GUI below (Fig. 1). To open this GUI in Object Explorer, right-click Management, select SQL Server Error Log and click Configure.
Fig 1. Last Six Error Logs
Each log file is generated upon restart and physically stored in C:\Program Files\ Microsoft SQL Server\MSSQL14.<InstanceID>\MSSQL\Log along with trace files and dump files. SQL Server Error Logs are named ERRORLOG.x (where x is a number) while SQL Agent Logs are named SQLAGENT.x.
During the instance startup, the engine writes key information to the Error log which will include:
SQL Server Version details System Manufacturer
- SQL Server Process ID
- Port Number in use
- Authentication Mode
- Registry Startup Parameters
- Service Account
- Size of CPU and Memory detected
- Status of Key options e.g. Large Pages, Buffer Pool Extension, In-Memory OLTP etc.
This information can be very useful when you just want to quickly find out about any of them from the instance that is introduced to you.
Configuring Error Log Options
As mentioned earlier, certain aspects of the SQL Error Log and Agent log are configurable. In my environment, we set NumErrorLogs to 30 because we want to have the last thirty days of logs preserved on a daily basis. We ensure that each log file contains the data of a day by running sp_cycle_errorlog (and sp_cycle_agent_errorlog) at midnight using an Agent Job. Listing 3 shows the details of this job. These stored procedures can also be quite useful for explicitly cycling the error log when you want to reproduce a problem and watch it unfold (i.e. before escalating to SQL Profiler or Extended Events).
-- Listing 1: Configure NumErrorLogs USE [master] GO EXEC xp_instance_regwrite N'HKEY_LOCAL_MACHINE', N'Software\Microsoft\MSSQLServer\MSSQLServer', N'NumErrorLogs', REG_DWORD, 30 GO
Fig 2. Setting Maximum Number of Error Log Files
The SQL Agent Error Log can be configured to some level of detail by right-clicking SQL Server Agent, and then Error Logs and select Configure from the menu. Listing 2 and Fig. 2 show how to modify the logging level.
-- Listing 2: Setting Error Logging Level USE [msdb] GO EXEC msdb dbo.sp_set_sqlagent_properties @errorlogging_level 7 GO
Fig 3. Setting SQL Agent Error Log Options
-- Listing 3 /* Create Job to Cycle Server and Agent Error Logs */ USE [msdb] GO /****** Object: Job [Cycle Error Logs] Script Date: 01/25/2015 15:40:34 ******/ BEGIN TRANSACTION DECLARE @ReturnCode INT SELECT @ReturnCode = 0 /****** Object: JobCategory [[Uncategorized (Local)]]] Script Date: 01/25/2015 15:40:34 ******/ IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'[Uncategorized (Local)]' AND category_class 1 BEGIN EXEC @ReturnCode = msdb dbo sp_add_category @class N'JOB', @type=N'LOCAL', @name=N'[Uncategorized (Local)]' IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback END DECLARE @jobId BINARY(16 EXEC @ReturnCode = msdb dbo sp_add_job @job_name=N'Custom_Cycle_Error_Logs', @enabled 1 @notify_level_eventlog 0, @notify_level_email 0, @notify_level_netsend 0 @notify_level_page 0 @delete_level 0, @description=N'No description available.', @category_name=N'[Uncategorized (Local)]', @owner_login_name N'COMPANYDOMAIN\kigiri', @job_id = @jobId OUTPUT IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback /****** Object: Step [Step 1] Script Date: 01/25/2015 15:40:35 ******/ EXEC @ReturnCode = msdb dbo sp_add_jobstep @job_id @jobId, @step_name N'Step 1', @step_id 1 @cmdexec_success_code 0 @on_success_action 1 @on_success_step_id 0, @on_fail_action 2 @on_fail_step_id 0, @retry_attempts 0 @retry_interval 0 @os_run_priority 0, @subsystem=N'TSQL', @command=N'EXEC master.sys.sp_cycle_errorlog GO EXEC msdb.dbo.sp_cycle_agent_errorlog GO', @database_name=N'master', @flags 0 IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback EXEC @ReturnCode = msdb dbo sp_update_job @job_id = @jobId @start_step_id = 1 IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback EXEC @ReturnCode = msdb dbo sp_add_jobschedule @job_id @jobId @name N'Daily Schedule', @enabled 1 @freq_type 4, @freq_interval 1 @freq_subday_type 1 @freq_subday_interval 0 @freq_relative_interval 0 @freq_recurrence_factor 0 @active_start_date 20121208, @active_end_date 99991231 @active_start_time 0 @active_end_time 235959 --@schedule_uid=N'f3eb3e85-9875-4cf5-a789-8c558b772d27' IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback EXEC @ReturnCode = msdb dbo sp_add_jobserver @job_id = @jobId @server_name = N'(local)' IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback COMMIT TRANSACTION GOTO EndSave QuitWithRollback: IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION EndSave: GO /* Change Maximum Number of Error Log Files to 30 */ USE [master] GO EXEC xp_instance_regwrite N'HKEY_LOCAL_MACHINE', N'Software\Microsoft\MSSQLServer\MSSQLServer', N'NumErrorLogs', REG_DWORD, 30 GO /* Set jo history length and agent properties */ EXEC msdb dbo.sp_purge_jobhistory @oldest_date '2015-11-28T19:29:00' GO USE [msdb] GO EXEC msdb dbo.sp_set_sqlagent_properties @jobhistory_max_rows 10000, @jobhistory_max_rows_per_job 1000 GO
“Pouring” Through Logs
Familiarizing oneself with the error log is useful for identifying such things as the items listed in the second section or monitoring such events as the recovery of a large database. But when searching for a specific item such as login failures or similar errors hidden in the sea of logs, you may want to use the filter or search capability.
Fig 4. Filter and Search
The search would be useful when looking for a specific error number or text. The search returns the occurrences of the item required by line at a time. When you need to narrow down the error log using specific criteria, use the filter. You can filter by connection, date or general details.
Fig 5. Search Criterion
Fig 6. Filter Criterion
When the Instance Fails
One good question a keen reader would ask is what happens when SQL Server fails to start? How do we examine the logs? Well, it so happens that besides the fact that the error logs are written to flat files the path of which was earlier specified, you can also view the Server and Agent logs in Windows Event Viewer. All you need to do is filter the Event Viewer Application Logs for events with a source MSSQLSERVER. In more advanced environments, SQL Server logs, like any other Windows Logs, can be stored in a third-party tool like SEIM or Splunk. This makes things so much easier. Starting with SQL Server 2012, it is also possible to view offline error logs from another instance entirely, using registered servers as well as using WMI and WQL.
Fig 7. Viewing SQL Server Logs in Event Viewer
The SQL Server Error Log can be considered the starting point for troubleshooting common errors. Other logs, such as trace files generated by SQL Trace or captured by SQL Profiler or Extended Events Session give far more details on what’s happening inside the engine. There are also specific logs like those related to Database Mail or Replication. While these are all very useful for troubleshooting problems in SQL Server, I would recommend SQL Server Error Logs as a great starting point.