Random 'Monitoring stopped' on one server

RichyD Posts: 8
edited November 7, 2013 6:16AM in SQL Monitor Previous Versions
I'm using SQL Monitor 3.5 to support about 15 servers, and am used to the occasional connection issue. One of my servers, however, is suffering from intermittent 'Monitoring stopped (SQL Server credentials)' errors throughout the day - on average about six times a day, at random times. Each time, the alert ends within a few seconds as SQL Monitor successfully reconnects.
This server is monitored with the same Windows account as all the others, and no other server is showing a similar problem...
I've checked the target SQL and Windows event logs, the monitoring server logs, and can't find anything at all to indicate a problem.
Has anyone else experienced this kind of thing, and/or have pointers on where to investigate next?

Cheers,
Rich

Comments

  • Btw, I've also checked the SQL Monitor 'Monitored Servers' machine log, but that only goes back about 3 minutes.
    The full SQL Monitor log shows all sorts of random exceptions, but none with times that correspond with the monitoring failures...
  • Brian Donahue Posts: 6,590 Bronze 1
    If you are having login failures to SQL Server, the failure reason should be in the SQL Server error log. It is possible that you aren't auditing login failures, so you can check the settings and make sure: http://www.mssqltips.com/sqlservertip/1 ... ql-server/
  • Thanks for the suggestion Brian, but login failure auditing was already on and no SQL Monitor related events are in the SQL Server log. I have seen other user login failures, so it is definitely logging...

    This is one of the odd things about the problem - SQL Monitor saying that it has had SQL credential problems, but the SQL Server itself denies all knowledge. Most peculiar.
  • Brian Donahue Posts: 6,590 Bronze 1
    It's probably a failure to connect to the Windows components then - check the server in "monitored servers" and next to the server, click "show log".
  • I checked the 'Show log', but never managed to get to it in time to see anything helpful - that log appears to only retain a few minutes of info...

    As an experiment, I moved the Base Monitor to a different server, and I haven't had a connect error since the move. It's only been two hours, but I'll keep my eye on it and hope that it was a problem with the monitor host.

    Things couldn't be totally fixed, of course - I've now got 100% CPU usage on the new host :( I'll raise that in a new thread if it doesn't settle down this afternoon...
  • Hi,

    Did moving the base monitor to a new server resolve this issue? I am having exactly the same symptoms, with one server always showing 'Monitoring stopped' errors and no login failures on the server. The server is working fine and the error ends in a few seconds.

    Thanks.
  • The best idea I've come up with so far is a pair of memory leaks in Windows - particularly one related to WMI. After a while, the memory allocated to the wmiprvse.exe service will reach 512MB, which is a cap - at this point any remote WMI calls will fail. After a few seconds, some garbage collection will occur to free some memory, and SQL Monitor will connect in again.
    I've scheduled a hotfix to be applied, but my OS team is slow to roll these things out, so I can't say yet whether this will definitely solve the problem...

    For ref, the Windows 2008r2 hotfix is here: http://support.microsoft.com/kb/2832248 and the vanilla 2008 one is here: http://support.microsoft.com/kb/958124

    If that sorts out your issues, please let me know :)

    Rich
  • Perfect. Thanks for the quick reply, that's definitely something to keep an eye on. I'll see if this resolves the issue.

    Thanks again.
  • I managed to get the hotfix rolled out on one server yesterday morning, and haven't had a connection failure since. That's a success in my book :)
    I'll be rolling that hotfix out to all Windows 2008/r2 servers over the next month to eliminate the rest of the connection failures I get.
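
For anyone who wants to check whether wmiprvse.exe is creeping towards the 512MB cap mentioned above before the hotfix is rolled out, here is a rough Python sketch. It is only an illustration under some assumptions: it reads the "Mem Usage" column from the standard Windows `tasklist` command, which is the working set and only a rough proxy for whatever figure the cap is actually measured against; the 90% warning margin is arbitrary; and the function names are made up for this example rather than taken from SQL Monitor or the hotfix articles.

```python
import csv
import io
import subprocess

# Assumption: 512 MB is the cap described in the thread; the 90% margin is arbitrary.
CAP_KB = 512 * 1024
WARN_FRACTION = 0.9


def parse_tasklist_csv(text):
    """Return (pid, mem_kb) pairs parsed from `tasklist /FO CSV /NH` output."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) < 5:
            continue  # blank lines and "INFO: No tasks are running..." messages
        # Mem Usage looks like "12,345 K"; keep digits only so the
        # thousands separator used by the locale does not matter.
        digits = "".join(ch for ch in fields[4] if ch.isdigit())
        if not fields[1].isdigit() or not digits:
            continue
        rows.append((int(fields[1]), int(digits)))
    return rows


def near_cap(rows, cap_kb=CAP_KB, fraction=WARN_FRACTION):
    """Return the (pid, mem_kb) pairs at or above fraction * cap_kb."""
    return [(pid, kb) for pid, kb in rows if kb >= cap_kb * fraction]


def check_local_wmiprvse():
    """Run tasklist on the local Windows machine and flag large WMI hosts."""
    out = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq wmiprvse.exe", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return near_cap(parse_tasklist_csv(out))
```

Calling `check_local_wmiprvse()` from a scheduled task on the monitored server and logging the result with a timestamp would let you compare any flagged readings against the times of the 'Monitoring stopped' alerts.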