Competition: What’s your favorite Redgate tool? Enter now.

ms/s Waits or sec/s Waits?

ClearViewClearView DFWPosts: 12 Bronze 1
Hi All,
On the initial "Overview" page, I see a "tile" for each monitored instance.  In the lower left of each "Tile" there is a small grey ?chip? which is labeled "Waits" and contains entries such as 206ms/s or 2s/s.
How are these values to be interpreted?
What does it mean if a 4 core VM routinely shows 400ms/s or a 12 core physical server is 2s/s? 
When there are multiple instances how do the values relate?

Tagged:

Answers

  • The tile shows average /s waits over the last two minutes. So at a basic level if it’s 0 20 40 75, each at 2 mins, latest will be 35. Hovering over shows you the last 15 minutes.

    https://docs.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-os-wait-stats-transact-sql?view=sql-server-2017

    In the case of multiple cores, each core can record its own stats, so it can add up to more than the timeframe indicates it should allow: https://stackoverflow.com/questions/30389242/microsoft-sql-sys-dm-os-wait-stats-add-up-to-more-milliseconds-than-in-the-time.

    It's just meant to be an indicator that something might be wrong; you'd then be expected to dig into it further and find out. If you start with "lower is better" and go from there it would cover most basic problems.
    Have you visited our Help Centre?
  • ClearViewClearView DFWPosts: 12 Bronze 1
    Hi Russell,
    I appreciate the reply and the information.  However, the reply doesn't really answer my question.  I understand that the value is intended to be an indicator but I don't understand how to interpret the values.  If I tell you the temperature is 25 and you should dress accordingly then what are you going to wear?
    I have a feel for what the raw Wait time statistics mean and some idea of how to monitor and respond.  Of course I want the numerator to be "small" but when does it stop being "small". I just don't have a feel for this s/s measure.  What is a threshold? How do I determine when it indicates a problem?
  • I would interpret your telling me that the temperature is 25 degrees as time to wear shorts and a tshirt but have a jumper handy because the aircon in our office is always too cold :).

    What I mean of course is that only you, or someone experienced with your system can really tell you what threshold represents a problem for your system. If wait times of a certain threshold are causing you problems then set the threshold slightly lower than that, and experiment. 

    On one level you could simply ignore the information, for example "here is a 4 core server with 2s/s of waits". Well - so what? But on another level you're supposed to look at the indicator and think "thats too much waiting for that instance" or "that seems ok" based on your experience and knowledge of your environment.

    Other that if you're not an experienced DBA I'm afraid a lot of research is involved - https://www.brentozar.com/sql/wait-stats/ makes for great reading on this, and the Microsoft documentation is pretty useful as well: https://docs.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-os-wait-stats-transact-sql?view=sql-server-2017.

    There is also help provided in the tool, which might just mean the discoverability of it needs work if you haven't yet found it. For example if you click on a server and go through to the Top Waits table:

    it contains information on the wait and gives information on how to troubleshoot it. Yes this isn't that helpful strictly speaking for the Global overview, but there really is only so much you can show on a top level dashboard.

    This is all a very long winded way of saying I can't really give you a perfect answer unfortunately; the simple answer is "lower is better" but this doesn't take into consideration the myriad complications that arise from running SQL Server on any number of different environments.
    Have you visited our Help Centre?
  • ClearViewClearView DFWPosts: 12 Bronze 1
    I will be out for a couple of weeks but I am still working on understanding this metric.  I am sure there must be a way to make the numbers more meaningful.  When you hover over the Wait chip it shows Signal Wait vs Resource Wait which is definitely an immediately meaningful number.  I think High Signal to Resource clearly indicates a CPU issue.
    Perhaps Brent Ozar's Waits/Sec/Core would be more useful as the summary value shown on the Chip
Sign In or Register to comment.