'Job did not start' alerts for 0 duration jobs

kevriley
I have been getting 'job did not start' alerts for some of the jobs on my production server, but when I check the jobs in SSMS, they have started and successfully completed, with a duration of 00:00:00.

I guess this 0 duration is somehow causing the alert to fire.

Can anyone confirm this to be the case?


Kev

SQL Response 1.2.0.219

Comments

  • Brian Donahue
    Hi Kevan,

    SQL Response checks that a scheduled job has run by examining its "next run date" and "last run date" fields in the MSDB database. It should not discriminate against jobs that have no duration, because it does not look at duration except when checking for long-running jobs. There are other reasons why you may get a job not run alert:
    Possible causes:
    Previous execution of this job was overrunning
    SQL Server Agent was not running (a SQL Server Agent not running alert is raised when SQL Response first detects that SQL Server Agent is not running)
    No target servers defined

    Could it be one of these?
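
The check described above can be sketched roughly as follows. This is an illustration only, not SQL Response's actual code: the function name `job_missed_run`, the `grace` parameter, and the comparison of the last run time against the expected start are all assumptions. Duration never enters the comparison, which is consistent with a zero-duration job not being treated differently.

```python
from datetime import datetime, timedelta
from typing import Optional


def job_missed_run(
    expected_run: datetime,
    last_run: Optional[datetime],
    now: datetime,
    grace: timedelta = timedelta(minutes=1),  # hypothetical tolerance
) -> bool:
    """Return True if the run expected at `expected_run` appears not to have started.

    `expected_run` stands in for the "next run date" captured before the run
    was due, and `last_run` for the "last run date" read afterwards.
    """
    if now < expected_run + grace:
        return False  # too early to judge
    if last_run is None:
        return True  # the job has never run
    # The job counts as started if its last run is at or after the expected start.
    return last_run < expected_run


expected = datetime(2009, 1, 1, 0, 10)
now = datetime(2009, 1, 1, 0, 12)

# A job that ran on time (even with zero duration) is not flagged.
on_time = job_missed_run(expected, datetime(2009, 1, 1, 0, 10), now)

# A job whose last run predates the expected start is flagged.
stale = job_missed_run(expected, datetime(2008, 12, 31, 23, 34), now)
```

Under these assumptions, a false positive would have to come from the values being compared (for example, a stale "last run date"), not from the job's duration.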
  • kevriley
    Brian,

    thanks for the response. I have rechecked one of the alerts against each possible cause:

    Previous execution of this job was overrunning - No - in fact the previous run had also finished with 0 duration, but that hadn't fired an alert

    SQL Server Agent was not running (a SQL Server Agent not running alert is raised when SQL Response first detects that SQL Server Agent is not running) - No - agent was certainly running

    No target servers defined - No, the target is set to 'local server'


    It is strange that other runs of the same job that have 0 duration are not triggering the alert - so I guess it isn't strictly a 0 duration issue. Could it be that SQL Response is doing a dirty read on the MSDB database?


    Kev
  • Hi Kev,

    We've had some issues in the past with false positives with the job did not run alert.

    What is the schedule of the job?

    Cheers,
    --
    Daniel
  • kevriley
    Daniel,

    schedule is "Occurs every day every 12 minute(s) between 00:10:00 and 23:40:00"

    Kev
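
For anyone correlating alerts against job history, the run times implied by the schedule quoted above can be listed with a short script. This is a minimal sketch: the helper name `expected_runs` is hypothetical, and it assumes that runs fall on a fixed 12-minute grid starting at 00:10:00 and that the end time is inclusive.

```python
from datetime import date, datetime, time, timedelta
from typing import List


def expected_runs(
    day: date,
    start: time = time(0, 10),
    end: time = time(23, 40),
    every: timedelta = timedelta(minutes=12),
) -> List[datetime]:
    """List the start times a fixed-interval schedule implies for one day."""
    current = datetime.combine(day, start)
    stop = datetime.combine(day, end)
    runs = []
    while current <= stop:  # end of window treated as inclusive (assumption)
        runs.append(current)
        current += every
    return runs


runs = expected_runs(date(2009, 1, 1))
first_run = runs[0].time()
last_run = runs[-1].time()
run_count = len(runs)
```

Comparing this list with the exported job history shows which expected start times, if any, have no matching history row.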
  • Hi Kev,

    Does there seem to be any pattern to the false positives?

    I'm going to try and replicate the issue on a test machine.

    Cheers,
    --
    Daniel
  • kevriley
    Daniel,

    there seems to be no pattern. It is affecting different jobs at different times on different servers.

    Kev
  • Hi Kev,

    I've not had any joy reproducing the issue.

    What reason does SQL Response give for the job not running?
    Are the alert repository and the monitored server in the same time zone?
  • kevriley
    Daniel,

    Reason for not starting: Reasons unknown

    All servers are in the same time zone.

    Kev
  • Hi Kev,

    Could you email me a copy of the job history and the details of some of the false positive alerts that you've seen, so we can try to correlate them?

    Cheers,
    --
    Daniel
  • kevriley
    Daniel,

    will do. Should I use support@red-gate.com?

    How do you want the data? Job history I could export to Excel.
    Alerts - how do you want these?


    Kev
  • Hi Kev,

    yes, support@red-gate.com will be fine.

    Exported to Excel or CSV would be ideal.

    A few screenshots of the alerts and a list of the most recent occurrences would be great.

    Cheers,
    --
    Daniel
  • kevriley
    Daniel,

    have just emailed you.

    Kev
  • Just to close off this thread:

    We have managed to reproduce the problem and have logged it in our bug-tracking system.

    Unfortunately we haven't come up with a workaround, but we are hoping to address this in an upcoming release.

    --
    Daniel