SQL Monitor Alert for when SQL Monitor server goes down...?

robinw Posts: 17 New member
Hi all.
I have been trialling a scenario where the server running our SQL Monitor software goes down, hoping an alert would pop up. Alas, there is no alert in SQL Monitor after a forced restart.
Is there any way I can get SQL Monitor to alert either:
1. just as it goes down, or
2. when the server has started back up again after going down?

I have attempted to alter the parameters in the 4 monitoring errors collection alerts but this does not seem to have an effect.
Using v9.0.10.

Thanks in advance

Answers

  • Russell D Posts: 1,324 Diamond 5
    Well, I'm afraid not. One of the requirements for alerting is writing to the database, so if the server goes down, we can't write to the database and can't raise the alert.

    You'd have to set up a secondary monitoring solution to monitor the SQL Monitor server.
    Have you visited our Help Centre?
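
For anyone wanting a stopgap, the secondary check suggested above can be as small as a script scheduled on a different machine. What follows is only a minimal sketch under stated assumptions, not a Redgate-provided tool: the function names are made up for illustration, and you supply your own host names and ports. It tests TCP reachability only, so it would not notice a repository database that is offline while its SQL Server instance is still listening.

```python
import socket
from typing import Dict, List, Tuple


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def unreachable_targets(
    targets: Dict[str, Tuple[str, int]], timeout: float = 3.0
) -> List[str]:
    """Return the names of all targets that could not be reached."""
    return [
        name
        for name, (host, port) in targets.items()
        if not is_reachable(host, port, timeout)
    ]
```

You would call `unreachable_targets` with a dictionary mapping a label to a `(host, port)` pair for the SQL Monitor web front end and for the SQL Server instance holding the repository, then send an email or page through whatever channel you already trust if the returned list is not empty. Run it from a machine other than the SQL Monitor server, otherwise it goes down with the thing it is watching.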
  • tomdpe Posts: 10 Bronze 1
    edited July 30, 2019 7:23PM
    I am dealing with the same issue. My suggestion: when the SQL Monitor database is unavailable, SQL Monitor could log an alert to a local data file, similar to the one used by the SSMS Tab History, and send out that alert saying the main SQL Monitor database is unavailable before suspending all other alerting. That would be a great help. Once the DB is back online, that entry could be synced up as the first step of bringing everything online. This would cover the need to log all alerts and give us better assurance that we are covered. Remember that your product is only as strong as its weakest link, and this has caused issues for us on a couple of occasions.
  • DonFerguson Posts: 202 Silver 5
    Perhaps a "monitor the monitor" feature could be leveraged in a multi-base configuration.  Just an idea, but if anyone is interested, I'm sure we need to add it as a new feature suggestion. 
  • tomdpe Posts: 10 Bronze 1
    edited August 1, 2019 6:04PM
    Heck, a local log entry indicating when the SQL Monitor database goes offline and online, along with an email sent and an internal process that syncs such log entries back to the db when it comes back online, would be great. This could be done as a separate alert type and even use a completely separate table in the db.

    Setting up other monitoring to watch the base monitor service won't catch this, since the service stays online. This means that without this type of check within SQL Monitor on its own db, and hence its own SQL Server, I am required to put a second, different monitoring solution in place in order to properly monitor my SQL Servers. When things are quiet, I am left to question whether my SQL Servers are even being monitored or whether SQL Monitor is actually doing nothing. That could leave me, as the person in charge of making sure our DBs are up and working, wondering if things are OK. I have had this actually happen on more than 2 occasions. This really should be a priority fix.

    Furthermore, the above suggestion of employing a multi-base configuration is not a real solution to this problem, since you still have the issue when not running a multi-base solution. Also of note: I run TBs of DBs supporting large call centers across several states, servicing tens of millions of customers. I need to know those servers are up and working, and I need to be able to identify issues before they affect my users, or else we lose thousands of dollars a minute in revenue. So I need a solution to this in pretty short order or I will have to find a different product. As it is now, I have to find another way to monitor not just this server but the DB on it, since either one going down leaves me with no SQL monitoring on any of my servers, and at this point SQL Monitor leaves me hanging without a clue.

    One other thing of note: when the db does come back online, it registers a CPU spike across every one of my servers and causes alerts to be sent out. This false alert is what clued us in to the above problem. I have seen the spike after doing updates to the SQL Server. It just didn't register at the time that I had a much bigger problem.
  • tomdpe Posts: 10 Bronze 1
    Let's see if I can make clear how particularly bad this issue is.  
    In a nutshell: unless I am actually in SQL Monitor looking at it, at no time do I know with any certainty that my SQL Servers and DBs are being monitored, unless I have another solution monitoring the SQL Server that houses my SQL Monitor DB and monitoring that individual db as well. It is when that DB goes offline that SQL Monitor stops monitoring, without any indication that there is a problem. This means my entire environment could be having issues and SQL Monitor could be in a compromised state, and no one would be the wiser. In essence, until some alert comes back from SQL Monitor that its database is offline, it cannot be trusted.

    Do any of your customers know this? Is it pointed out anywhere that if you are going to use SQL Monitor to monitor your SQL Server environment, you need a different SQL Server monitoring system in place to make sure its db is up and that it is able to monitor? I run servers with databases backing millions of customers and call centers with thousands of users, and I am now finding out that SQL Monitor cannot be trusted to alert me when its own database goes offline, and that Red-Gate is passing it off as "since it cannot be logged to a db, we can't alert on it". SQL Monitor SHOULD BE SCREAMING WHEN THIS HAPPENS. And Red-Gate should be issuing a notice with an intent to correct, not an excuse and a non-response.
  • Russell D Posts: 1,324 Diamond 5
    edited August 2, 2019 12:42PM
    Realistically speaking, a day or two isn't much time to respond. I think it's more complicated than simply issuing a notice to correct. We're discussing this internally and will come back to you.
  • Russell D Posts: 1,324 Diamond 5
    edited August 2, 2019 12:49PM
    So there are two issues here, the service outage and the database outage. We've logged two internal issues to investigate this because they are fundamentally two different things, and have their own engineering challenges.
    We're not entirely convinced that the service outage is something we can do much about, given the scope or context, but the database outage is definitely something we want to look into. At this stage we've not really got an idea of the complexity or amount of work required, so we can't promise any ETAs. I certainly wouldn't suggest that this will be a quick fix.
  • Russell D Posts: 1,324 Diamond 5
    edited August 22, 2019 9:02AM
    Notification emails for the repository and basemonitor being unavailable have been added in 9.1.2:


    This will also send an email if the basemonitor Service goes down. If the web service goes down though I'm afraid it's not possible to send a notification.
  • DonFerguson Posts: 202 Silver 5
    edited August 22, 2019 4:24PM
    This is a nice new feature. My one piece of feedback is that it would be nice if I could send this specific alert to a custom email address. The default address doesn't really work for me without major reconfiguration of all the other alerts. I guess it's time for me to learn how to use the new PowerShell API to make that reconfiguration of the email address for all alerts easier. ;^)
  • Thanks for the feedback Don - we talked about this but wanted to get something useful out early on. I'll raise this at standup later this morning.
  • robinw Posts: 17 New member
    Hello everyone.
    I must admit, I didn't know that this request would gain so much traction in such a short space of time. It's been interesting to read everyone's opinion and feedback on this issue.

    Thank you to everyone for throwing their support behind this and special thanks to the people at Redgate for listening and for the quick turnaround time in releasing the new functionality.