Commit 57489c55 authored by ale

Add alert for broken end-to-end log collection probes

parent 29d93176
@@ -29,3 +29,13 @@ groups:
       summary: Elasticsearch on {{$labels.host}} is about to run out of disk space
       description: "The Elasticsearch instance on {{$labels.host}} has > 90% disk utilization on its data volume. When the utilization reaches 95%, ES will switch indexes to read-only mode and we'll start discarding logs. Try to free some space."
+  - alert: LogCollectionBroken
+    expr: log_collection_e2e:success:ratio < 0.5
+    for: 1h
+    labels:
+      severity: page
+      scope: global
+    annotations:
+      summary: Logs are not being indexed
+      description: "The end-to-end log testing system has detected that logs are not reaching the Elasticsearch index. Something must be broken either with Elasticsearch itself, or with the log-collector service (rsyslog)."
@@ -5,3 +5,8 @@ groups:
     expr: 100 * (elasticsearch_filesystem_data_size_bytes - elasticsearch_filesystem_data_free_bytes) / elasticsearch_filesystem_data_size_bytes
   - record: elasticsearch_filesystem_data_free_percent
     expr: 100 - elasticsearch_filesystem_data_used_percent
+  # Metrics for the end-to-end probers.
+  - record: log_collection_e2e:success:ratio
+    expr: sum(log_collection_e2e_success) / count(log_collection_e2e_success)
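
The recording rule and the alert added above could be exercised together with a promtool unit test. The following is a minimal sketch, not part of this commit. It assumes that `log_collection_e2e_success` is a 0/1 gauge exported once per prober, that the probers can be told apart by a `prober` label, and that the two rule files are named `rules.yml` and `alerts.yml`. None of these details are visible in the diff, so treat all of them as placeholders.

```yaml
# Hypothetical unit-test file (for example tests.yml), run with:
#   promtool test rules tests.yml
# The rule file names are placeholders; the diff does not show the real paths.
rule_files:
  - rules.yml    # assumed to contain the log_collection_e2e:success:ratio recording rule
  - alerts.yml   # assumed to contain the LogCollectionBroken alert

evaluation_interval: 1m

tests:
  # Case 1: only one of three probers succeeds, so the ratio is 1/3 (< 0.5)
  # for longer than the 1h "for" clause, and the alert should be firing.
  - interval: 1m
    input_series:
      # The "prober" label and the 0/1 value encoding are assumptions.
      - series: 'log_collection_e2e_success{prober="a"}'
        values: '1+0x120'   # constant 1, one sample per minute for 2h
      - series: 'log_collection_e2e_success{prober="b"}'
        values: '0+0x120'   # constant 0
      - series: 'log_collection_e2e_success{prober="c"}'
        values: '0+0x120'   # constant 0
    alert_rule_test:
      - eval_time: 90m
        alertname: LogCollectionBroken
        exp_alerts:
          - exp_labels:
              severity: page
              scope: global
            exp_annotations:
              summary: Logs are not being indexed
              description: "The end-to-end log testing system has detected that logs are not reaching the Elasticsearch index. Something must be broken either with Elasticsearch itself, or with the log-collector service (rsyslog)."

  # Case 2: two of three probers succeed, so the ratio is 2/3 (>= 0.5)
  # and no alert is expected.
  - interval: 1m
    input_series:
      - series: 'log_collection_e2e_success{prober="a"}'
        values: '1+0x120'
      - series: 'log_collection_e2e_success{prober="b"}'
        values: '1+0x120'
      - series: 'log_collection_e2e_success{prober="c"}'
        values: '0+0x120'
    alert_rule_test:
      - eval_time: 90m
        alertname: LogCollectionBroken
        exp_alerts: []
```

Because the recording rule aggregates with `sum` and `count` and has no `by` clause, the resulting series carries no labels. The expected alert therefore has only the `severity` and `scope` labels set in the rule (promtool adds `alertname` itself).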