Did Not Receive Job Duration Alert

cehottle Posts: 38
edited March 31, 2011 10:48AM in SQL Monitor Previous Versions
I discovered this morning that a job on one of my monitored servers had been running for 4 days. I did not receive an alert for it, although I have received job duration alerts from other servers in this monitored group. The threshold is set to 60% above baseline for jobs that run for more than 60 seconds.
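
To make the threshold above concrete, here is a minimal sketch of how such a check could be evaluated for a run that has finished. It assumes the baseline is the mean of earlier run durations and that the 60-second setting is a floor below which runs are ignored; the function name and parameters are hypothetical, and this thread does not confirm how SQL Monitor actually computes its baseline.

```python
from statistics import mean


def is_duration_unusual(duration_s, previous_durations_s,
                        pct_above=60.0, min_duration_s=60.0):
    """Return True if a finished run breaches the duration threshold.

    Assumptions (not confirmed by the product): the baseline is the mean
    of earlier runs, and runs at or below min_duration_s never alert.
    """
    if not previous_durations_s:
        return False  # no history yet, so no baseline to compare against
    if duration_s <= min_duration_s:
        return False  # too short to be considered at all
    baseline = mean(previous_durations_s)
    # Alert only when the run is more than pct_above percent over baseline.
    return duration_s > baseline * (1 + pct_above / 100.0)
```

With these assumptions, a job whose history is a few seconds per run and whose latest run is recorded as 4 days long would breach the threshold, but only once that run has a recorded duration.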

Comments

  • Has this job been running okay in the past?

    I was just wondering if SQL Monitor would have managed to get enough data from previous runs in order to calculate a baseline.

    Regards
    Chris
    Chris Spencer
    Test Engineer
    Red Gate
  • This server has been monitored by SQL Monitor since November, and the job, which coincidentally happens to be a t-log backup using SQL Backup, runs every 15 minutes, usually in less than 10 seconds.
  • Did you get an alert after manually ending the job?

    I suspect that the reason an alert was not raised is that job alerts in SQL Monitor 2 are based purely on the job history data from SQL Server. Until the job history shows that the job has ended, we cannot calculate how long the job has been running, so we cannot send an alert.

    Regards
    Chris
    Chris Spencer
    Test Engineer
    Red Gate
  • No, I did not get an alert when I killed the process/job. I really thought that this was a duration check independent of completion. SQL Sentry, which I replaced with SQL Response over a year ago, has a duration check that is independent of job completion. I mentioned that during the feature interview that I had for SQL Monitor quite some time ago. A duration check for a job that is still running would be very useful to have.
  • I agree that this would be a very useful alert and have raised an enhancement request (ref: SRP-3356).
    Chris Spencer
    Test Engineer
    Red Gate
  • In SQL Response there was an alert labeled "Job did not start". I do not see that alert in SQL Monitor and wonder what happened to it. It seems that it was a good complement to the "Job duration unusual" alert, especially in cases like the one posted above. We had a similar issue where a job hung and we didn't notice it for over a day. In the past with SQL Response, I thought we would've gotten an alert that a scheduled job didn't start.

    Or maybe that wouldn't be true if you can only determine that after the previous instance of the job finishes, which in our case didn't happen until we killed it.

    Can you explain why the "did not start" alert is no longer part of the product?
  • Although many people found the "Job did not start" alert useful, we had many customers who received a large number of false positive alerts.

    After researching the cause of these issues and talking to Microsoft about the SQL Server Agent scheduler, we were not confident that we could correctly predict when jobs would run. Specifically, some recurring jobs did not appear to run in a deterministically scheduled way.

    We did not feel that the alert met the minimum functional quality for us to include it in SQL Monitor, so we unfortunately had to withdraw it.

    --
    Daniel
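
For anyone wanting the completion-independent duration check requested in this thread, a minimal sketch of the idea follows. It compares the elapsed time of a job that is still running against a limit derived from a baseline, instead of waiting for the run to appear in the job history. The function name, parameters, and the way the limit is derived are assumptions for illustration, not SQL Monitor behaviour. The start time would have to come from somewhere that records in-progress runs; in SQL Server, msdb.dbo.sysjobactivity rows with a start_execution_date and no stop_execution_date are one possible source.

```python
from datetime import datetime, timedelta


def running_job_overdue(started_at, now, baseline_s,
                        pct_above=60.0, min_duration_s=60.0):
    """Return True if a job that has NOT finished is already over its limit.

    Hypothetical helper: the limit is the larger of the minimum duration
    and the baseline plus pct_above percent, so short jobs are not
    flagged until they have run for at least min_duration_s.
    """
    elapsed_s = (now - started_at).total_seconds()
    limit_s = max(min_duration_s, baseline_s * (1 + pct_above / 100.0))
    return elapsed_s > limit_s


# Example: a backup that normally takes about 10 seconds, still running
# 4 days after it started.
started = datetime(2011, 3, 27, 8, 0, 0)
overdue = running_job_overdue(started, started + timedelta(days=4), baseline_s=10)
```

Because the check depends only on the start time and the current time, it can be polled on a schedule and would flag a hung job without the job ever completing.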