mirror of https://github.com/monitoring-mixins/website.git synced 2024-12-14 11:37:31 +00:00
monitoring-mixins-website/site/content/velero/_index.md
Vitaly Zhuravlev b3b400137a Add jsonnet-libs mixins
Add blackbox exporter
Add mysql exporter
2024-05-04 12:01:41 +00:00

title: velero

Overview

{{< panel style="danger" >}} Jsonnet source code is available at github.com/grafana/jsonnet-libs {{< /panel >}}

Alerts

{{< panel style="warning" >}} A complete list of pregenerated alerts is available here. {{< /panel >}}

velero

VeleroBackupFailure

{{< code lang="yaml" >}}
alert: VeleroBackupFailure
annotations:
  description: |
    Backup failures detected on {{ $labels.instance }}. This could lead to data loss or inability to recover in case of a disaster.
  summary: Velero backup failures detected.
expr: |
  increase(velero_backup_failure_total{job=~"integrations/velero"}[5m]) > 0
for: 5m
labels:
  severity: critical
{{< /code >}}
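The `job=~"integrations/velero"` selector in this rule only matches if Velero's metrics are scraped under that job name. A minimal sketch of a Prometheus scrape configuration that would satisfy the selector follows. The target address is a hypothetical placeholder, and port 8085 is an assumption about where your Velero deployment exposes metrics.

{{< code lang="yaml" >}}
scrape_configs:
  # The job name must match the selector used by the alert expressions.
  - job_name: integrations/velero
    static_configs:
      # Hypothetical target; replace with the address of your Velero metrics endpoint.
      - targets: ['velero.velero.svc:8085']
{{< /code >}}

If the job is named differently, the alert expressions will not match any series and will never fire.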

VeleroHighBackupDuration

{{< code lang="yaml" >}}
alert: VeleroHighBackupDuration
annotations:
  description: |
    Backup duration on {{ $labels.instance }} is higher than the average duration over the past 48 hours. This could indicate performance issues or network congestion. The current value is {{ $value | printf "%.2f" }} seconds.
  summary: Velero backups taking longer than usual.
expr: |
  histogram_quantile(0.5, sum(rate(velero_backup_duration_seconds_bucket{job="integrations/velero"}[5m])) by (le, schedule)) > 1.2 * 1.2 * avg_over_time(histogram_quantile(0.5, sum(rate(velero_backup_duration_seconds_bucket{job="integrations/velero"}[48h])) by (le, schedule))[5m:])
for: 5m
labels:
  severity: warning
{{< /code >}}
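The `VeleroHighBackupDuration` expression is dense, so here is the same expression spread over several lines with PromQL `#` comments. This is one reading of the rule, not an authoritative description of the mixin's intent. The logic is unchanged; only whitespace and comments are added.

{{< code lang="yaml" >}}
expr: |
  # Left side: median (0.5 quantile) backup duration per schedule,
  # estimated from bucket rates over the last 5 minutes.
  histogram_quantile(0.5,
    sum(rate(velero_backup_duration_seconds_bucket{job="integrations/velero"}[5m])) by (le, schedule)
  )
  >
  # Right side: 1.2 * 1.2 = 1.44 times the baseline. The baseline is the
  # median computed from a 48-hour rate window, averaged over a 5-minute subquery.
  1.2 * 1.2 * avg_over_time(
    histogram_quantile(0.5,
      sum(rate(velero_backup_duration_seconds_bucket{job="integrations/velero"}[48h])) by (le, schedule)
    )[5m:]
  )
{{< /code >}}

Read this way, the alert fires for a schedule once its recent median backup duration stays more than 44% above its 48-hour baseline for the full `for: 5m` period.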

VeleroHighRestoreFailureRate

{{< code lang="yaml" >}}
alert: VeleroHighRestoreFailureRate
annotations:
  description: |
    Restore failures detected on {{ $labels.instance }}. This could prevent timely data recovery and business continuity.
  summary: Velero restore failures detected.
expr: |
  increase(velero_restore_failed_total{job=~"integrations/velero"}[5m]) > 0
for: 5m
labels:
  severity: critical
{{< /code >}}

VeleroUpStatus

{{< code lang="yaml" >}}
alert: VeleroUpStatus
annotations:
  description: "Cannot find any metrics related to Velero on {{ $labels.instance }}. This may indicate further issues with Velero or the scraping agent. "
  summary: Velero is down.
expr: |
  up{job=~"integrations/velero"} == 0
for: 5m
labels:
  severity: critical
{{< /code >}}
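Each block above is a single rule entry. To load one into Prometheus it has to sit inside a rule group in a rules file. A minimal sketch, using `VeleroBackupFailure` as the example: the file name `velero_alerts.yaml` is hypothetical, and the group name `velero` is an assumption taken from the heading above.

{{< code lang="yaml" >}}
# velero_alerts.yaml (hypothetical file name)
groups:
  - name: velero  # assumed group name, taken from the section heading
    rules:
      - alert: VeleroBackupFailure
        annotations:
          description: |
            Backup failures detected on {{ $labels.instance }}. This could lead to data loss or inability to recover in case of a disaster.
          summary: Velero backup failures detected.
        expr: |
          increase(velero_backup_failure_total{job=~"integrations/velero"}[5m]) > 0
        for: 5m
        labels:
          severity: critical
{{< /code >}}

Such a file can be validated with `promtool check rules velero_alerts.yaml` and then listed under `rule_files` in the Prometheus configuration.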

Dashboards

The following dashboards are generated from mixins and hosted on GitHub: