2.2.0.260 - misleading or bogus "ended" Job Duration Alert

PDinCA Posts: 642 Silver 1
edited April 20, 2011 9:50AM in SQL Monitor Previous Versions
Job runs hourly and has a typical duration of under 20 seconds. Having just rewritten the underlying SP and added additional data points to evaluate, the job now runs between 1:35 and 2:26. The latest exec ran in 2 minutes 31, which gave rise to a "Job duration unusual" alert based on the prior 10 execs. No problem so far. "Time raised: 15 Apr 2011 7:27 PM (UTC-04)"

An hour later, an Alert arrived informing me that the Alert's "Time ended" was 15 Apr 2011 8:27 PM (UTC-04), i.e., a full hour later for a job lasting 2 min 31 s. That looks confusing.

Is this the intended way the "Ended" Alert should work? My actual Alert shows:
Job name: _HourlyMeterSampling 
User: sa 
Job started at: 15 Apr 2011 7:25 PM 
Job ended at: 15 Apr 2011 7:27 PM 
Job outcome: Succeeded 
 
Duration: 00:02:31 
 
Baseline duration (median of last 10 runs): 00:01:40 
Deviation from baseline: 50% 
 
Job next scheduled to run at: 15 Apr 2011 9:25 PM 
As you can see, the Job started and ended times are correct, along with the duration. As the "Job next scheduled to run at:" value is TWO HOURS after the alerted Job, there appears to be a problem with the Alert in that the 8:25 PM execution is detected as the "Ended" trigger for the original Alert.
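For reference, the "Deviation from baseline" figure can be checked by hand. This is a minimal sketch, assuming the deviation is simply the percentage difference between the run's duration and the median baseline; the function names are hypothetical and this is not taken from SQL Monitor itself.

```python
def parse_duration(text):
    """Convert an 'HH:MM:SS' string, as shown in the alert, to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def deviation_percent(duration_s, baseline_s):
    """Percentage by which a run differs from the baseline (assumed formula)."""
    return 100 * (duration_s - baseline_s) / baseline_s


duration = parse_duration("00:02:31")  # 151 seconds
baseline = parse_duration("00:01:40")  # 100 seconds (median of last 10 runs)
print(deviation_percent(duration, baseline))  # 51.0
```

With the displayed values this simple formula gives 51%, close to the 50% in the alert, so the product may round differently or work from more precise durations than the ones displayed.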

I've also had an instance where I received an "Ended" Alert for a job that purported to run for a solid week! It didn't. It just happened to run for an unusual duration and then 7 days later ran again, according to its normal schedule.

I feel misled into investigating something that isn't an issue in reality.

Would you clarify, please, or is this a bug?

Thanks.
Jesus Christ: Lunatic, liar or Lord?
Decide wisely...

Comments

  • Hi PDinCA,

    This is not a bug; it is the designed behaviour.

    SQL Monitor has two types of alert: event alerts and continuous alerts. "Job duration unusual" is classified as a continuous alert.

    The designed behaviour is this: if the current job run deviates from the baseline by more than the configured threshold, a "Job duration unusual" alert is raised. That alert then remains active until a subsequent job run is back within the configured threshold.

    In other words, the alert remains active until SQL Monitor has verified that subsequent job runs no longer deviate from the baseline.

    Hope this explains it.

    Thanks,
    Priya
    Priya Sinha
    Project Manager
    Red Gate Software
  • PDinCA Posts: 642 Silver 1
    It explains things, but in simple terms it means I will completely ignore this Alert: as far as I'm concerned, it confuses rather than points to a problem I need to address.

    What happens when the job runs for the 2nd time and the alert is "active", do I get another Job Duration Unusual Alert? That's all I'd want to see - that each incident is reported to me. I can look at the occurrence history to determine if I have an ongoing issue...

    If there is a way to change the Alert, perhaps in a future version, so that the User can choose between a "Continuous Alert" and "Event Alert", I for one would be less confused but still able to look into a potential issue given a succession of the same Job Duration Unusual alerts. Your thoughts?
  • PDinCA wrote:
    What happens when the job runs for the 2nd time and the alert is "active", do I get another Job Duration Unusual Alert? That's all I'd want to see - that each incident is reported to me. I can look at the occurrence history to determine if I have an ongoing issue...

    If the job runs a 2nd time while the alert is still "active", the "Job duration unusual" alert simply remains active. Now suppose there is a 3rd run, and this time the deviation is within range (no longer above the configured threshold): the "Job duration unusual" alert is then marked as "ended". If a 4th run deviates again, you get a new "Job duration unusual" alert.

    In other words, the alert will remain active till it has verified that the problem no longer exists. If the problem comes back again then a new instance of this alert will be created.
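The four-run sequence described above can be traced with a small sketch. It only models the behaviour as explained in this thread; the function name and the list-of-booleans input are hypothetical, not part of SQL Monitor.

```python
def continuous_alert_events(deviating_runs):
    """Return (run_number, event) pairs for a continuous-style alert.

    deviating_runs: iterable of booleans, True when that run's duration
    is outside the configured threshold (hypothetical input format).
    """
    events = []
    active = False
    for run_number, deviates in enumerate(deviating_runs, start=1):
        if deviates and not active:
            # First deviating run: a new alert instance is raised.
            active = True
            events.append((run_number, "raised"))
        elif not deviates and active:
            # First run back within the threshold: the alert is marked ended.
            active = False
            events.append((run_number, "ended"))
        # A deviating run while the alert is already active emits nothing.
    return events


# Runs 1 and 2 deviate, run 3 is back in range, run 4 deviates again.
print(continuous_alert_events([True, True, False, True]))
# [(1, 'raised'), (3, 'ended'), (4, 'raised')]
```

Note that the 2nd run produces no event at all, which is why the "ended" time reflects a later run and not the end of the job that triggered the alert.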
    PDinCA wrote:
    If there is a way to change the Alert, perhaps in a future version, so that the User can choose between a "Continuous Alert" and "Event Alert", I for one would be less confused but still able to look into a potential issue given a succession of the same Job Duration Unusual alerts. Your thoughts?

    Thanks for your feedback on this. We will definitely consider and review the behaviour of this alert.

    Regards,
    Priya
    Priya Sinha
    Project Manager
    Red Gate Software
  • PDinCA Posts: 642 Silver 1
    Thanks for your explanation, Priya. The fact that the Alert stays in effect through succeeding jobs that may also exceed the threshold confirms that, at least for me, an of-necessity DBA, an Event-based Alert would serve better, because it makes me aware of each occurrence. By contrast, a continuous Alert raised on a Sunday stays in effect throughout the week, and when the following Sunday's job also exceeds the norm, it stays in effect with no further "Job Duration Unusual" Alert to prompt me. The original issue I had was with a daily job and another with a weekly job. A continuous Alert serves me poorly: I have plenty of other Alerts and tasks to deal with, without having to remember that, potentially over a week ago, I had an issue with the Sunday job.

    I hope this kind of scenario, not uncommon I would venture, can be used as collateral when considering providing the option to alter this to an Event Alert.
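To make the weekly-job scenario concrete, here is a rough sketch that counts how many alerts would be raised under each style. The "event" style is the requested option, not an existing SQL Monitor setting, and the function and parameter names are hypothetical.

```python
def notifications(deviating_runs, style):
    """Count alerts raised for a series of runs.

    style: 'continuous' (one alert per unbroken streak of deviating runs)
           or 'event' (one alert per deviating run; the proposed option).
    """
    raised = 0
    active = False
    for deviates in deviating_runs:
        if style == "event":
            if deviates:
                raised += 1
        elif deviates and not active:
            raised += 1
        # Track whether a continuous alert would currently be active.
        active = deviates
    return raised


# Two consecutive Sunday runs that both exceed the norm.
sundays = [True, True]
print(notifications(sundays, "continuous"))  # 1
print(notifications(sundays, "event"))       # 2
```

Under the continuous style the second Sunday's overrun produces no new alert, which is the gap described above.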

    1 vote hereby cast for this enhancement.

    Thanks.
  • Thanks PDinCA. I get your point.

    Regards,
    Priya
    Priya Sinha
    Project Manager
    Red Gate Software