Should Blocking Process alert thresholds be based on duration, not cumulative time?
kevriley
Posts: 129 Gold 1
Blocking Process alerts fire when a time threshold is passed, but I think the calculation behind that time needs some tweaking.
From what I've seen, if process_A blocks for 30 seconds but blocks 10 other processes, the 'blocking time' is given as 300 seconds. The alert then says that process_A has been blocking for 5 minutes, which confuses the user, who only ran the query for 30 seconds.
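To make the arithmetic concrete, here is a rough Python sketch of the two calculations as I understand them. It is not the tool's actual code: `BlockedWait`, `cumulative_blocking_time` and `blocking_duration` are names I made up for illustration, and treating the longest single wait as the "duration" is my own assumption about what a duration-based metric could look like.

```python
from dataclasses import dataclass


@dataclass
class BlockedWait:
    """One blocked process and how long it has waited (hypothetical shape)."""
    blocker: str
    blocked: str
    wait_seconds: float


def cumulative_blocking_time(waits, blocker):
    # Sum of the wait times of every process blocked by `blocker`.
    return sum(w.wait_seconds for w in waits if w.blocker == blocker)


def blocking_duration(waits, blocker):
    # Assumption: the longest single wait approximates how long the
    # blocker has actually been blocking in wall-clock terms.
    return max(
        (w.wait_seconds for w in waits if w.blocker == blocker),
        default=0.0,
    )


# process_A blocks 10 other processes, each for 30 seconds.
waits = [BlockedWait("process_A", f"process_{i}", 30.0) for i in range(1, 11)]

print(cumulative_blocking_time(waits, "process_A"))  # 300.0
print(blocking_duration(waits, "process_A"))         # 30.0
```

With a threshold of, say, 60 seconds, the cumulative figure would trip the alert while the duration figure would not.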
So whilst I think the cumulative total is a useful metric to see, I'm not sure it should be the metric that is measured against the thresholds set in the alert configuration.
A process that blocks for a very short amount of time, but affects many other processes, is not *really* a problem.
Thoughts?
Comments