Monitoring error - too sensitive?

kevriley Posts: 129 Gold 1
edited March 8, 2011 8:33AM in SQL Monitor Previous Versions
I know I'm partly to blame for this: http://www.red-gate.com/MessageBoard/viewtopic.php?t=12538

But I seem to be getting either lots of false positives, or the alert is too sensitive. There seems to be no configuration for this alert; it is either on or off.

I recently had a large number of these where the time difference between failure and resume was 0 seconds:

SQL Server data collection failed at: 7 Mar 2011 2:57:15 AM
SQL Server data collection resumed at: 7 Mar 2011 2:57:15 AM

Comments

  • Hi Kevan,

    Thanks for your post.

    We haven't seen this alert be wrong. If you are quick to click "Show log", you should be able to see the collection that failed. Having said that, I agree that these alerts may be too sensitive at the moment. We will try to improve this. I am tracking this as SRP-3645.

    Thanks,
    Priya
    Priya Sinha
    Project Manager
    Red Gate Software
  • kevriley Posts: 129 Gold 1
    Thanks Priya.

    The server that seems to suffer the most from this is my hosted server (i.e. not on the internal network), so these could be caused by intermittent issues on our internet line. Being able to define a small time-based tolerance would be great.

    Alternatively, I know I've talked to you guys in the past about having automatically clearing alerts: the kind of alerts that clear themselves once the alert conditions have returned to 'normal'. This would be another one of those cases.

    Kev
  • Thanks Kev. Yes, a time-based configuration makes sense. I have updated the issue with your suggestion.

    Regards,
    Priya
    Priya Sinha
    Project Manager
    Red Gate Software
  • I'll add my issue to this one as well, even though there is another ticket for it (reference SRP-3585).

    We are constantly seeing the monitoring stopped error, and it is always an insufficient privileges error on the HKEY_PERFORMANCE_DATA hive. We have 2 instances that consistently give this error, though all of our instances report it from time to time with no actual outage. One of the trouble instances is an active/passive cluster. The error is never reported for both nodes at the same time, and it generally occurs on the passive node. For the most part the errors correct themselves and then recur.
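
To illustrate the time-based tolerance suggested earlier in the thread, here is a minimal Python sketch of how such a filter might behave. It is not SQL Monitor's implementation or API: the function names, the timestamp format string, and the 30-second default are all assumptions made for illustration. It only shows the idea of suppressing an alert when data collection resumes within the tolerance window.

```python
from datetime import datetime, timedelta

# Assumed format, matching the timestamps quoted in the original post,
# e.g. "7 Mar 2011 2:57:15 AM". Month names are parsed per the current locale.
TIMESTAMP_FORMAT = "%d %b %Y %I:%M:%S %p"


def parse_timestamp(text: str) -> datetime:
    """Parse a timestamp written like the ones in the alert text."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)


def should_raise_alert(
    failed_at: str,
    resumed_at: str,
    tolerance: timedelta = timedelta(seconds=30),  # hypothetical default
) -> bool:
    """Return True only if the outage lasted longer than the tolerance."""
    outage = parse_timestamp(resumed_at) - parse_timestamp(failed_at)
    return outage > tolerance


# The case from the original post: failure and resume in the same second.
suppressed = should_raise_alert("7 Mar 2011 2:57:15 AM", "7 Mar 2011 2:57:15 AM")

# A longer, hypothetical outage of five minutes.
raised = should_raise_alert("7 Mar 2011 2:57:15 AM", "7 Mar 2011 3:02:15 AM")
```

With a tolerance like this, the zero-second failure quoted at the top of the thread would not raise an alert, while a five-minute gap still would.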