I/O, Input/Output, Reads/Writes what ever you want to call it. It is one of the most aggravating aspects of SQL Server. The physical spinning hard drives of your SAN can bring your entire database server to its knees, no matter how powerful a beast of a server you have.
When you as the DBA are at the mercy of the SAN Administrator (or better yet 3rd party SAN administrator), your hands are somewhat tied on configuration changes. So what do you do? Bang your head against a wall and shout, “It’s not my fault, it’s not my fault!” Then the big burly guys in white shirts will come visit your cubicle, ask you some questions and then take you on a little “vacation”.
That’s not the path I want to go down.
Not having any type of 3rd party monitoring solution, I needed a way to prove which drive was having trouble and just how much work it had to do!
Now I have done my due diligence and have read about I/O stalls, I/O Latency, other I/O measurements and such; but most everything I found, because the way SQL Server works, were cumulative information. Glenn Berry’s DMV Diagnostic Information Queries is great place to learn where to get this information. But again, it is based on SQL Server cumulative information. I hope my Read Latency is not 150+ ms!
I needed a way to say, “for this sample period” this was the read latency of a drive?
Then I found Jon Gurgul b|t, his post titled Delta and Cumulative IO Stats was a great read; however it wasn’t exactly what I needed. But it did get me thinking about his method of capturing two result sets with a time delay and finding the difference. So why couldn’t I do this for Glenn Berry’s Read/Write Latency script.
Below is what I came up with. It probably is not perfect, but it works for me. I have setup a SQL Agent job to run this code every 15 minutes with a 5 min delay between result sets. Then the results are stored in a Diagnostic database so I can chart and report on trends.
I am eager to see the trending over time and truly see what my Latency time for my SAN drives are!
1:
2: IF OBJECT_ID(N'tempdb..#Sample1') IS NOT NULL BEGIN DROP TABLE #Sample1 END;
3: IF OBJECT_ID(N'tempdb..#Sample2') IS NOT NULL BEGIN DROP TABLE #Sample2 END;
4: SELECT
5: LEFT(UPPER(mf.physical_name), 2) AS Drive
6: ,SUM(num_of_reads) AS num_of_reads
7: ,SUM(io_stall_read_ms) AS io_stall_read_ms
8: ,SUM(num_of_writes) AS num_of_writes
9: ,SUM(io_stall_write_ms) AS io_stall_write_ms
10: ,SUM(num_of_bytes_read) AS num_of_bytes_read
11: ,SUM(num_of_bytes_written) AS num_of_bytes_written
12: ,SUM(io_stall) AS io_stall
13: INTO #Sample1
14: FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
15: INNER JOIN sys.master_files AS mf WITH (NOLOCK)
16: ON vfs.database_id = mf.database_id
17: AND vfs.file_id = mf.file_id
18: GROUP BY LEFT(UPPER(mf.physical_name), 2);
19:
20: WAITFOR DELAY '00:05:00'
21:
22: SELECT
23: LEFT(UPPER(mf.physical_name), 2) AS Drive
24: ,SUM(num_of_reads) AS num_of_reads
25: ,SUM(io_stall_read_ms) AS io_stall_read_ms
26: ,SUM(num_of_writes) AS num_of_writes
27: ,SUM(io_stall_write_ms) AS io_stall_write_ms
28: ,SUM(num_of_bytes_read) AS num_of_bytes_read
29: ,SUM(num_of_bytes_written) AS num_of_bytes_written
30: ,SUM(io_stall) AS io_stall
31: INTO #Sample2
32: FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
33: INNER JOIN sys.master_files AS mf WITH (NOLOCK)
34: ON vfs.database_id = mf.database_id
35: AND vfs.file_id = mf.file_id
36: GROUP BY LEFT(UPPER(mf.physical_name), 2);
37:
38: INSERT INTO Diagnostic_WH.dbo.[IO-Stats]
39: (Drive
40: ,DriveType
41: ,num_of_reads
42: ,io_stall_reads_ms
43: ,num_bytes_read
44: ,read_latency
45: ,num_of_writes
46: ,io_stall_write_ms
47: ,num_bytes_written
48: ,write_latency
49: ,io_stall
50: ,RowDateTime)
51: SELECT
52: t.Drive
53: ,CASE
54: WHEN t.Drive='C:' THEN 'SSISDB/System'
55: WHEN t.Drive='H:' THEN 'mdf Data'
56: WHEN t.Drive='L:' THEN 'ldf LOG'
57: WHEN t.Drive='E:' THEN 'tempdb'
58: WHEN t.Drive='T:' THEN 'Indexes'
59: END AS [DriveType]
60: ,t.num_of_reads-s.num_of_reads [num_of_reads]
61: ,t.io_stall_read_ms-s.io_stall_read_ms [io_stall_read_ms]
62: ,(t.num_of_bytes_read-s.num_of_bytes_read) [num_bytes_read]
63: ,CASE
64: WHEN (t.num_of_reads-s.num_of_reads)=0 THEN 0
65: ELSE (t.io_stall_read_ms-s.io_stall_read_ms)
66: /(t.num_of_reads-s.num_of_reads)
67: END AS [read_latency]
68: ,t.num_of_writes-s.num_of_writes [num_of_writes]
69: ,t.io_stall_write_ms-s.io_stall_write_ms [io_stall_write_ms]
70: ,t.num_of_bytes_written-s.num_of_bytes_written [num_bytes_written]
71: ,CASE
72: WHEN (t.num_of_writes-s.num_of_writes)=0 THEN 0
73: ELSE (t.io_stall_write_ms-s.io_stall_write_ms)
74: /(t.num_of_writes-s.num_of_writes)
75: END AS [write_latency]
76: ,t.io_stall-s.io_stall [io_stall]
77: ,GETDATE()
78: FROM #Sample1 s
79: INNER JOIN #Sample2 t
80: ON s.Drive=t.Drive
81: ORDER BY t.Drive