Lots of Monitoring Errors on a Clustered Service

JohnStaffordDBA Posts: 5
edited March 28, 2011 11:05AM in SQL Monitor Previous Versions
Hi there,
I seem to get an awful lot of "Monitoring error (host machine data collection)" alerts from SQL Monitor relating to our clustered SQL instance. When I look, they seem to relate to the passive cluster node, so I'm wondering if these are false alarms?
Has anyone else noticed this?
Thanks, John

Comments

  • Hi John,

    When you get this alert, and while it is still active, go to the Configuration > Monitored servers page. Click "Show log" for the machine or SQL Server that raised the alert to see which data collection failed.

    Thanks,
    Priya
    Priya Sinha
    Project Manager
    Red Gate Software
  • oderks Posts: 67 Bronze 2
    I raised a similar issue with support (although mine affected the active node) for a Windows 2008 R2 cluster running SQL 2008 R2.

    It did not occur with that frequency on our other SQL Servers (clustered and non-clustered). Once in a while I see it when there is a small network glitch or the server is busy doing other things. Nevertheless, the error seems to be very sensitive...

    Restarting the WMI service on both nodes seemed to help. I'm still monitoring, though: I restarted the services this afternoon and will see what happens tomorrow once it's busy again.
  • I see this as well and reported it earlier (reference SRP-3585).
    My cluster is Windows 2008 R2 with SQL Server 2005.
  • I had this problem on our Windows 2008 R2 instances. This patch solved it:

    http://support.microsoft.com/kb/981314
  • We saw this and it has not helped us.
  • KristoferW Posts: 10 Bronze 2
    Hi,

    I updated from 2.1 to 2.2 today, and since then I have also been getting a lot (1-2 a minute!) of these "Monitoring error (SQL Server data collection)" errors!

    What's causing this? I simply updated the SQL Monitor Software, nothing else..

    Thanks
    Kristofer
  • KristoferW wrote:
    What's causing this? I simply updated the SQL Monitor Software, nothing else..

    The monitoring error alerts were added in v2.2, so in previous versions the errors were probably still happening but you weren't being told about them.

    We're currently investigating improving the alerts in v2.3 to make them less sensitive or maybe allow the user to configure the sensitivity in some manner.

    If you want to know what is causing the errors, follow the "Show Log" link for the relevant server on the Monitored Servers page. This log only shows errors from the last 5 minutes, so you have to be quick, sorry.

    If you find the alerts annoying and aren't seeing any benefit then it's possible to disable them in Configuration > Alert Settings.

    Hope this helps
    Chris
    Chris Spencer
    Test Engineer
    Red Gate
  • KristoferW Posts: 10 Bronze 2
    Hi Chris,

    Thanks for your reply.
    I have disabled the alert for now.

    FYI, I get the following errors in the log:
    Mar 2011 4:34 PM 	Registry 	GetBinaryValue: \\XXXXXX.YYYYYY.de\HKEY_PERFORMANCE_DATA\10960 10740 10870 10808 10944 10852 11108 11048 	Cannot connect 	Win32Exception 	Die angeforderte Ressource wird bereits verwendet
    

    But the error only occurs approx. 1-2 times a minute. The rest of the time, the log entries with "GetBinaryValue" have no errors.

    Thanks
    Kristofer
  • That translates to "The requested resource is in use" which suggests that some other application (or SQL Monitor perhaps) is using the registry key we're trying to access.

    Have you any other monitoring software that could possibly be attempting to connect to this server? I'm thinking of something like Microsoft's SQLH2 which collects data using remote registry in the same way we do.

    Also is it just one specific server or do all your servers have the issue?

    Thanks
    Chris
    Chris Spencer
    Test Engineer
    Red Gate
  • KristoferW Posts: 10 Bronze 2
    Hi Chris,

    Sorry for my late reply. I was ill last week and not in the office.
    "Have you any other monitoring software that could possibly be attempting to connect to this server? I'm thinking of something like Microsoft's SQLH2 which collects data using remote registry in the same way we do."
    Yes, we have other monitoring software running. We have Nagios and DELL OpenManage software monitoring this server.

    I disabled the Nagios service for half an hour, but that didn't help.
    So maybe it is the DELL software. I will ask our system admin whether he can disable that service for a few minutes.
    "Also is it just one specific server or do all your servers have the issue?"
    At the moment we only have one SQL Server that is monitored by SQL Monitor.

    Thanks
    Kristofer
  • Hi Kristofer

    I think that the chances of it being the DELL software are very low, although it's definitely worth trying.

    It's probably worth restarting the remote registry service on the server to see if this helps.

    We are adjusting these alerts in v2.3 so it might be just a case of disabling these alerts for problematic servers until we can get the new version out.

    Regards
    Chris
    Chris Spencer
    Test Engineer
    Red Gate
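
Two of the workarounds suggested in this thread are service restarts: the WMI service on both cluster nodes, and the Remote Registry service on the monitored server. The sketch below is one possible way to script that with the standard Windows `sc` command; it is not something Red Gate provides. The host names `NODE1` and `NODE2` and the helper functions are hypothetical, and the script only prints the commands unless `dry_run` is turned off.

```python
import subprocess
from typing import List, Sequence

# Short service names as sc.exe expects them.
WMI_SERVICE = "Winmgmt"                      # Windows Management Instrumentation
REMOTE_REGISTRY_SERVICE = "RemoteRegistry"   # Remote Registry

ALLOWED_ACTIONS = ("stop", "start", "query")


def sc_command(host: str, action: str, service: str) -> List[str]:
    """Build one sc.exe command for a remote host as an argument list."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    # sc takes the remote machine in UNC form, i.e. prefixed by two backslashes.
    return ["sc", "\\\\" + host, action, service]


def restart_plan(hosts: Sequence[str], service: str) -> List[List[str]]:
    """Return stop/start commands for the service on every host, in order."""
    plan: List[List[str]] = []
    for host in hosts:
        plan.append(sc_command(host, "stop", service))
        plan.append(sc_command(host, "start", service))
    return plan


def run_plan(plan: Sequence[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False (Windows only)."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # NODE1 / NODE2 are placeholder names for the two cluster nodes.
    run_plan(restart_plan(["NODE1", "NODE2"], WMI_SERVICE))
```

Note that `sc stop` returns before the service has necessarily finished stopping, and stopping WMI may be refused while dependent services are running, so a real script would probably need to poll `sc query` between the stop and the start.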