Commit c32336a6 authored by ale

Conflate "deferred" and "active" Postfix queues for alerting

parent 1e7ed15b
@@ -11,7 +11,7 @@ groups:
 # The postfix-out instances should be allowed to have a large
 # deferred queue for outbound messages.
 - alert: PostfixQueueTooLarge
-  expr: postfix_queue_length{postfix_instance="postfix-out",queue="deferred"} > 5000
+  expr: postfix_queue_length{postfix_instance="postfix-out",queue=~"(deferred|active)"} > 5000
   for: 10m
   labels:
     severity: warn
@@ -25,7 +25,7 @@ groups:
       deliveries.
     runbook: '[[ alert_runbook_fmt | format("PostfixQueueTooLarge") ]]'
 - alert: PostfixQueueTooLarge
-  expr: sum(postfix_queue_length{postfix_instance="postfix-out",queue="deferred"}) > 10000
+  expr: sum(postfix_queue_length{postfix_instance="postfix-out",queue=~"(deferred|active)"}) > 10000
   for: 10m
   labels:
     severity: page
@@ -71,7 +71,7 @@ groups:
 # all. Note the longer timeout: it is fine for queues like
 # 'active' or 'incoming' to accomodate temporary spikes.
 - alert: PostfixUnexpectedQueueTooLarge
-  expr: postfix_queue_length{queue!="deferred"} > 50
+  expr: postfix_queue_length{queue!~"(deferred|active)"} > 50
   for: 1h
   labels:
     severity: page
@@ -85,7 +85,7 @@ groups:
       service malfunctioning, or having capacity issues.
     runbook: '[[ alert_runbook_fmt | format("PostfixQueueTooLarge") ]]'
 - alert: PostfixUnexpectedQueueTooLarge
-  expr: sum(postfix_queue_length{queue!="deferred"}) by (postfix_instance, queue) > 100
+  expr: sum(postfix_queue_length{queue!~"(deferred|active)"}) by (postfix_instance, queue) > 100
   for: 1h
   labels:
     severity: page
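
A possible way to sanity-check the new matchers is a small `promtool test rules` file. This is only a sketch: the file name `postfix_queue_test.yml`, the input series, and their values are hypothetical and not part of this commit. Only the two expressions are copied from the diff. It relies on Prometheus regex matchers being fully anchored, so `queue=~"(deferred|active)"` should select exactly those two label values and `queue!~"(deferred|active)"` everything else.

```yaml
# postfix_queue_test.yml (hypothetical file name)
# Run with: promtool test rules postfix_queue_test.yml
evaluation_interval: 1m

tests:
  - interval: 1m
    # Made-up samples: a large 'active' queue and a small 'incoming' queue.
    input_series:
      - series: 'postfix_queue_length{postfix_instance="postfix-out",queue="active"}'
        values: '6000+0x20'
      - series: 'postfix_queue_length{postfix_instance="postfix-out",queue="incoming"}'
        values: '75+0x20'

    promql_expr_test:
      # The 'active' queue is now covered by the PostfixQueueTooLarge expression.
      - expr: postfix_queue_length{postfix_instance="postfix-out",queue=~"(deferred|active)"} > 5000
        eval_time: 10m
        exp_samples:
          - labels: 'postfix_queue_length{postfix_instance="postfix-out",queue="active"}'
            value: 6000

      # The 'active' queue is no longer seen by the PostfixUnexpectedQueueTooLarge
      # expression, while other queues such as 'incoming' still are.
      - expr: postfix_queue_length{queue!~"(deferred|active)"} > 50
        eval_time: 10m
        exp_samples:
          - labels: 'postfix_queue_length{postfix_instance="postfix-out",queue="incoming"}'
            value: 75
```

The sketch tests the raw expressions with `promql_expr_test` rather than the alerts themselves, because the annotations of the real rules are not visible in this diff and `alert_rule_test` would need them verbatim.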