Monitoring error - too sensitive?
kevriley
Posts: 129 Gold 1
I know I'm partly to blame for this: http://www.red-gate.com/MessageBoard/viewtopic.php?t=12538
But I seem to be getting either lots of false positives, or the alert is too sensitive. There seems to be no configuration on this alert; it is either on or off.
Recently I had a large number of these where the time difference between failure and resume was 0 seconds:
SQL Server data collection failed at: 7 Mar 2011 2:57:15 AM
SQL Server data collection resumed at: 7 Mar 2011 2:57:15 AM
Comments
Thanks for your post.
We haven't seen this alert be wrong. If you are quick to click "Show log", you should be able to see the collection that failed. Having said that, I agree that these alerts may be too sensitive at the moment. We will try to improve this. I am tracking this as SRP-3645.
Thanks,
Priya
Project Manager
Red Gate Software
The server that seems to suffer the most from this is my hosted server (i.e. not on the internal network), so these could be caused by intermittent issues on our internet line - being able to define a small tolerance (time based) would be great.
Alternatively, I know I've talked to you guys in the past about having automatically clearing alerts: the kind of alerts that clear themselves once the alert conditions have returned to 'normal'. This would be another one of those cases.
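To make the tolerance idea concrete, here is a rough sketch in Python of the check I have in mind. It is not how SQL Monitor works internally; the function names, the 30-second default, and the timestamp format (copied from the alert text quoted above) are all my own assumptions for illustration.

```python
from datetime import datetime, timedelta

# Assumed format, matching the timestamps quoted in the alert text,
# e.g. "7 Mar 2011 2:57:15 AM". Relies on an English locale for %b and %p.
TIMESTAMP_FORMAT = "%d %b %Y %I:%M:%S %p"


def parse_timestamp(text):
    """Parse an alert timestamp string into a datetime (hypothetical helper)."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)


def outage_exceeds_tolerance(failed_at, resumed_at, tolerance=timedelta(seconds=30)):
    """Return True only if collection stayed down for longer than the tolerance.

    A failure that resumes within the tolerance (including the 0-second case)
    would not raise an alert, or could be cleared automatically.
    """
    return (resumed_at - failed_at) > tolerance


failed = parse_timestamp("7 Mar 2011 2:57:15 AM")
resumed = parse_timestamp("7 Mar 2011 2:57:15 AM")
print(outage_exceeds_tolerance(failed, resumed))
```

With a check like this, the zero-second failures above would be suppressed, while a genuine outage lasting longer than the tolerance would still alert.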
Kev
Riley Waterhouse Limited
Twitter: @kevriley
Regards,
Priya
Project Manager
Red Gate Software
We are constantly seeing the monitoring stopped error, and it is always an insufficient privileges error on the HKEY_PERFORMANCE_DATA hive. We have 2 instances that consistently give this error, though all of our instances report it from time to time with no actual outage. One of the trouble instances is an active/passive cluster. The error is never reported for both nodes at the same time, and it generally comes from the passive node. For the most part the errors correct themselves and then recur.