Lots of Monitoring Errors on a Clustered Service
JohnStaffordDBA
Posts: 5
Hi there,
I seem to get an awful lot of "Monitoring error (host machine data collection)" alerts from SQL Monitor, relating to our clustered SQL instance. When I look, they seem to relate to the passive cluster node, so I'm wondering if these are false alarms?
Has anyone else noticed this?
Thanks, John
Comments
When you get this alert, and while it is still active, go to the Configuration > Monitored servers page. Click "Show log" for the machine or SQL Server that raised the alert to see which data collection failed.
Thanks,
Priya
Project Manager
Red Gate Software
It does not occur with that frequency on our other SQL Servers (clustered and non-clustered). Once in a while I see it there, which I put down to a small network glitch or the server being busy with other things. Nevertheless, it seems to be a very sensitive error...
Restarting the WMI service on both nodes seemed to help. I'm still monitoring, though: I restarted the services this afternoon and will see what happens tomorrow once it's busy again.
My cluster is Windows 2008 R2 with SQL Server 2005.
http://support.microsoft.com/kb/981314
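For anyone wanting to script the WMI restart on both nodes, here is a rough sketch of one way to do it. It only builds (and by default just prints) the standard `sc.exe` stop/start commands for the `Winmgmt` service. The node names are made up, and this is not something SQL Monitor itself provides.

```python
import subprocess

# Windows service name for Windows Management Instrumentation
WMI_SERVICE = "Winmgmt"


def restart_commands(node, service=WMI_SERVICE):
    """Return the sc.exe stop/start command lines for one remote node."""
    target = "\\\\" + node  # sc.exe expects a UNC-style \\hostname
    return [
        ["sc.exe", target, "stop", service],
        ["sc.exe", target, "start", service],
    ]


def restart_on_nodes(nodes, service=WMI_SERVICE, dry_run=True):
    """Print (dry run) or execute the restart commands for each node."""
    plan = [cmd for node in nodes for cmd in restart_commands(node, service)]
    for cmd in plan:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return plan


if __name__ == "__main__":
    # SQLNODE1 / SQLNODE2 are hypothetical names for the two cluster nodes
    restart_on_nodes(["SQLNODE1", "SQLNODE2"])
```

Note that `sc stop` returns before the service has fully stopped and will not stop dependent services for you, so a real script would probably need to wait and handle those cases. Leave `dry_run=True` until you are happy with the printed commands.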
I updated from 2.1 to 2.2 today, and since then I also get a lot (1-2 a minute!) of these "Monitoring error (SQL Server data collection)" errors!
What's causing this? I simply updated the SQL Monitor software, nothing else.
Thanks
Kristofer
The monitoring error alerts were added in v2.2, so in previous versions the errors were most likely still happening, but you weren't being told about them.
We're currently investigating improving the alerts in v2.3 to make them less sensitive or maybe allow the user to configure the sensitivity in some manner.
If you wish to know what is causing the errors, you'll need to follow the "Show Log" link for the relevant server on the Monitored Servers page. This log only shows errors from the last 5 minutes, so you'll have to be quick, sorry.
If you find the alerts annoying and aren't seeing any benefit then it's possible to disable them in Configuration > Alert Settings.
Hope this helps
Chris
Test Engineer
Red Gate
Thanks for your reply.
I disabled the alert for now.
FYI, I get the following errors in the log:
But the error only occurs approx. 1-2 times a minute. The rest of the time, the log entries with "GetBinaryValue" have no errors.
Thanks
Kristofer
Have you any other monitoring software that could possibly be attempting to connect to this server? I'm thinking of something like Microsoft's SQLH2 which collects data using remote registry in the same way we do.
Also is it just one specific server or do all your servers have the issue?
Thanks
Chris
Test Engineer
Red Gate
Sorry for my late reply. I was ill last week and not in the office.
Yes, we have other monitoring software running. We have Nagios and Dell OpenManage monitoring this server.
I disabled the Nagios service for half an hour, but that didn't help.
So maybe it is the Dell software. I will ask our system admin whether he can disable that service for a few minutes.
At the moment we only have one SQL Server that is monitored by SQL Monitor.
Thanks
Kristofer
I think the chances of it being the Dell software are very low, although it's definitely worth trying.
It's probably worth restarting the remote registry service on the server to see if this helps.
We are adjusting these alerts in v2.3 so it might be just a case of disabling these alerts for problematic servers until we can get the new version out.
Regards
Chris
Test Engineer
Red Gate
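If it helps to check whether remote registry access to the server is itself intermittent, independently of SQL Monitor, something like the sketch below could be run from the monitoring machine before and after restarting the service. It repeatedly runs a standard `reg query` against a well-known HKLM key on the remote host and counts the failures. The host name is hypothetical, and which keys SQL Monitor actually reads is not stated in this thread, so treat this only as a rough connectivity probe.

```python
import subprocess
import time


def probe_remote_registry(host, attempts=5, delay=1.0, run=subprocess.run):
    """Query a well-known HKLM key on `host` several times.

    Returns the number of attempts that failed (non-zero exit code).
    `run` is injectable so the function can be exercised without Windows.
    """
    key = "\\\\" + host + "\\HKLM\\SOFTWARE\\Microsoft\\Windows NT\\CurrentVersion"
    cmd = ["reg", "query", key, "/v", "ProductName"]
    failures = 0
    for i in range(attempts):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures += 1
        if i < attempts - 1:
            time.sleep(delay)  # pause between attempts
    return failures


if __name__ == "__main__":
    # SQLNODE1 is a hypothetical server name
    failed = probe_remote_registry("SQLNODE1", attempts=30, delay=2.0)
    print("failed attempts:", failed)
```

A non-zero count here would suggest the problem sits with remote registry access in general, while a clean run would point back at something more specific to the collection itself.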