This past Friday evening, a simple data import process that had been working for several months stopped working correctly. Because there was no “early warning” notification system on the SQL 2005 server, the SQL Agent job was still running when Monday morning rolled around.
Using Idera’s SQL Check, I quickly noticed that one process held a lock on a table and was causing some MAJOR problems. Four different SQL processes were waiting on this lock. And of course my supervisor was getting calls left and right from end users because their applications were not working.
I had to figure out what was going on. With the sp_who2 command, I was able to get a list of processes and see which one was causing all the problems.
EXEC sp_who2 'active'
However, this returned a “bogus” host name. I am not really sure why. I confirmed with networking that the returned hostname and IP address do not exist on our network, but that is a problem for another day.
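The sp_who2 output can be narrowed to just the blocked sessions by capturing it in a temp table first. This is a minimal sketch, assuming the 13-column result set that sp_who2 returns on SQL Server 2005 (SPID appears twice and REQUESTID comes last). The table name #who2 and the column names SPID2 and RequestID are placeholders for illustration. Every column is stored as text because sp_who2 pads several values with spaces, and it shows a lone dot in BlkBy when a session is not blocked.

```sql
-- Sketch only: the column list assumes the SQL Server 2005 sp_who2 result set.
IF OBJECT_ID('tempdb..#who2') IS NOT NULL
    DROP TABLE #who2;

CREATE TABLE #who2 (
     SPID        VARCHAR(255)
    ,[Status]    VARCHAR(255)
    ,[Login]     VARCHAR(255)
    ,HostName    VARCHAR(255)
    ,BlkBy       VARCHAR(255)
    ,DBName      VARCHAR(255)
    ,Command     VARCHAR(255)
    ,CPUTime     VARCHAR(255)
    ,DiskIO      VARCHAR(255)
    ,LastBatch   VARCHAR(255)
    ,ProgramName VARCHAR(255)
    ,SPID2       VARCHAR(255)  -- placeholder name for the repeated SPID column
    ,RequestID   VARCHAR(255)  -- placeholder name for REQUESTID
);

-- Capture the same output that EXEC sp_who2 'active' prints.
INSERT INTO #who2
EXEC sp_who2 'active';

-- Keep only the rows that are waiting on another session.
SELECT SPID, BlkBy, [Login], HostName, DBName, Command, ProgramName
FROM #who2
WHERE LTRIM(RTRIM(BlkBy)) <> '.'
ORDER BY BlkBy;

DROP TABLE #who2;
```

If the INSERT fails with a column-count error, the result set on that server version differs from the one assumed here and the table definition needs adjusting.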
I needed an accurate way to get all the information I need at the moment a job runs long. This code probably provides way more information than I really need, but sometimes more is better. Even though “host_name” returns a “bogus” host name, the client_net_address does not! This valid IP address is very important for locating the culprit machine.
SELECT
C.session_id
,C.client_net_address
,S.host_name
,S.client_version
,S.client_interface_name
,S.login_name
,S.original_login_name
,db_name(R.database_id) AS [DB Name]
,R.status
,R.command
,R.blocking_session_id
,R.wait_type
,R.percent_complete
,R.wait_resource
,C.net_transport
,C.protocol_type
,C.auth_scheme
,C.last_read
,C.last_write
,S.cpu_time
,S.memory_usage
,(S.total_elapsed_time/1000) AS [Total Sec]
,S.lock_timeout
FROM sys.dm_exec_connections AS C
JOIN sys.dm_exec_requests AS R
ON C.session_id=R.session_id
JOIN sys.dm_exec_sessions AS S
ON C.session_id=S.session_id
ORDER BY C.session_id
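One caveat with the query above: it inner-joins to sys.dm_exec_requests, so a session that holds a lock but is idle (sleeping with an open transaction) has no request row and will not appear. The following sketch, which is an illustration and not part of the original monitoring job, starts from the blocked requests and looks up the blocker in sys.dm_exec_sessions and sys.dm_exec_connections instead. The column aliases are arbitrary.

```sql
-- Sketch: list only blocked requests, with the network address of the blocker.
SELECT
     R.session_id           AS [Blocked SPID]
    ,R.blocking_session_id  AS [Blocking SPID]
    ,R.wait_type
    ,R.wait_time            AS [Wait ms]
    ,R.wait_resource
    ,BC.client_net_address  AS [Blocker IP]
    ,BS.host_name           AS [Blocker Host]
    ,BS.login_name          AS [Blocker Login]
    ,BS.program_name        AS [Blocker Program]
FROM sys.dm_exec_requests AS R
-- The blocker is looked up as a session, since it may have no active request.
JOIN sys.dm_exec_sessions AS BS
    ON BS.session_id = R.blocking_session_id
-- LEFT JOIN so the row survives even if no connection row is found.
LEFT JOIN sys.dm_exec_connections AS BC
    ON BC.session_id = R.blocking_session_id
   AND BC.parent_connection_id IS NULL  -- skip extra rows created by MARS
WHERE R.blocking_session_id > 0
ORDER BY R.blocking_session_id, R.session_id;
```

When several sessions share the same [Blocking SPID], that session is the likely head of the blocking chain, and [Blocker IP] points at the machine to investigate.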
Using Thomas LaRock’s HOW TO: Find Currently Running Long SQL Agent Jobs, I set up a simple two-step SQL Agent job to check for 1) long-running jobs and 2) session information (using a query similar to the code above). A simple “save results to txt file” setting on the second step lets me export the results and study them later.
USE [msdb]
GO
/****** Object: Job [job_WH_MonitorDatabase] Script Date: 10/30/2012 13:57:18 ******/
BEGIN TRANSACTION
DECLARE @ReturnCode INT
SELECT @ReturnCode = 0
/****** Object: JobCategory [xx] Script Date: 10/30/2012 13:57:18 ******/
IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'xxx' AND category_class=1)
BEGIN
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'xxx'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
END
DECLARE @jobId BINARY(16)
EXEC @ReturnCode = msdb.dbo.sp_add_job @job_name=N'job_MonitorDatabase',
@enabled=1,
@notify_level_eventlog=0,
@notify_level_email=2,
@notify_level_netsend=0,
@notify_level_page=0,
@delete_level=0,
@description=N'',
@category_name=N'xxx',
@owner_login_name=N'INFOSYS\xx',
@notify_email_operator_name=N'xx', @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
/****** Object: Step [Monitor] Script Date: 10/30/2012 13:57:18 ******/
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Monitor',
@step_id=1,
@cmdexec_success_code=0,
@on_success_action=3,
@on_success_step_id=0,
@on_fail_action=2,
@on_fail_step_id=0,
@retry_attempts=0,
@retry_interval=0,
@os_run_priority=0, @subsystem=N'TSQL',
@command=N'exec usp_LongRunningJobs',
@database_name=N'master',
@flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
/****** Object: Step [RecordProcesses] Script Date: 10/30/2012 13:57:18 ******/
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'RecordProcesses',
@step_id=2,
@cmdexec_success_code=0,
@on_success_action=1,
@on_success_step_id=0,
@on_fail_action=2,
@on_fail_step_id=0,
@retry_attempts=0,
@retry_interval=0,
@os_run_priority=0, @subsystem=N'TSQL',
@command=N'select
sysp.spid AS [SPID]
,sysp.status AS [Status]
,sysp.hostname AS [Host Name]
,cn.client_net_address AS [Client IP]
,db_name(sysp.dbid) AS [DB Name]
,convert(sysname, rtrim(sysp.loginame)) as [Login]
,sysp.program_name AS [Program]
,sysp.cmd AS [CMD Status]
,sysp.cpu AS [CPU usage]
,sysp.physical_io AS [IO]
,sysp.blocked AS [Blocked By]
,sysp.waitresource AS [Wait Resource]
from master.dbo.sysprocesses AS sysp
INNER JOIN sys.dm_exec_connections AS cn
ON sysp.spid=cn.session_id
WHERE sysp.dbid > 4 AND sysp.cmd <> ''AWAITING COMMAND''
ORDER BY sysp.hostname',
@database_name=N'master',
@output_file_name=N'F:\SQLAgent_Logs\RunningProcesses.txt',
@flags=2
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id=@jobId, @name=N'Daily',
@enabled=1,
@freq_type=4,
@freq_interval=1,
@freq_subday_type=4,
@freq_subday_interval=10,
@freq_relative_interval=0,
@freq_recurrence_factor=0,
@active_start_date=20121030,
@active_end_date=99991231,
@active_start_time=0,
@active_end_time=235959
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:
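Step 1 of the job calls usp_LongRunningJobs, which is not shown in this post. For readers who want something to start from, here is a minimal sketch of what such a procedure might look like. It is not Thomas LaRock's code: the procedure name usp_LongRunningJobs_Sketch, the @ThresholdMinutes parameter, and its 60-minute default are all assumptions made for illustration. It reads msdb.dbo.sysjobactivity for jobs that have started but not stopped in the current SQL Agent session.

```sql
USE [master]
GO
-- Hypothetical name, chosen so it does not collide with an existing procedure.
CREATE PROCEDURE dbo.usp_LongRunningJobs_Sketch
    @ThresholdMinutes INT = 60  -- assumed default; tune to your environment
AS
BEGIN
    SET NOCOUNT ON;

    SELECT
         j.name                   AS [Job Name]
        ,ja.start_execution_date  AS [Started]
        ,DATEDIFF(MINUTE, ja.start_execution_date, GETDATE()) AS [Minutes Running]
    FROM msdb.dbo.sysjobactivity AS ja
    JOIN msdb.dbo.sysjobs AS j
        ON j.job_id = ja.job_id
    -- Only look at activity recorded since the Agent service last started.
    WHERE ja.session_id = (SELECT MAX(session_id) FROM msdb.dbo.syssessions)
      AND ja.start_execution_date IS NOT NULL
      AND ja.stop_execution_date IS NULL
      AND DATEDIFF(MINUTE, ja.start_execution_date, GETDATE()) >= @ThresholdMinutes
    ORDER BY ja.start_execution_date;
END
GO
```

Calling it with EXEC dbo.usp_LongRunningJobs_Sketch @ThresholdMinutes = 120 would list jobs that have been running for two hours or more. The monitoring job itself shows up in the list only if it runs past the threshold.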
Thanks for your time! I will let you know how it turns out.