Mirror of https://github.com/prometheus-operator/prometheus-operator.git, synced 2025-04-21 03:38:43 +00:00
Documentation: add more content to online docs (#5060)
* Documentation: add more content to online docs

This change adds the following content to the prometheus-operator.dev website:

* New "User Guides" section with the "Getting Started" and "Alerting" guides.
  I've updated/cleaned up the existing content to match with the current
  release of the operator.
* "Storage" and "Strategic Merge Patch" pages to the Operator section. The
  "Storage" page also documents how to manually expand statefulset volumes
  (related to #4079).

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Address Philip's comments

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
This commit is contained in:
parent
fb52a0f075
commit
9a7a6efb89
13 changed files with 455 additions and 258 deletions
CONTRIBUTING.md
Documentation
example/user-guides/alerting
scripts/docs/templates
@@ -1,5 +1,5 @@
 ---
-weight: 200
+weight: 120
 toc: true
 title: Contributing
 menu:
@@ -3,10 +3,8 @@ title: "API reference"
 description: "Prometheus operator generated API reference docs"
 draft: false
 images: []
-menu:
-  docs:
-    parent: "operator"
-weight: 208
+menu: "operator"
+weight: 210
 toc: true
 ---
 > This page is automatically generated with `gen-crd-api-reference-docs`.
@@ -139,22 +139,9 @@ spec:
     groupWait: 30s
     groupInterval: 5m
     repeatInterval: 12h
-    receiver: 'wechat-example'
+    receiver: 'webhook'
   receivers:
-  - name: 'wechat-example'
-    wechatConfigs:
-    - apiURL: 'http://wechatserver:8080/'
-      corpID: 'wechat-corpid'
-      apiSecret:
-        name: 'wechat-config'
-        key: 'apiSecret'
-
----
-apiVersion: v1
-kind: Secret
-type: Opaque
-metadata:
-  name: wechat-config
-data:
-  apiSecret: d2VjaGF0LXNlY3JldAo=
+  - name: 'webhook'
+    webhookConfigs:
+    - url: 'http://example.com/'
 ```
(binary image changed: 148 KiB before, 148 KiB after)
@@ -1,5 +1,5 @@
 ---
-weight: 206
+weight: 208
 toc: true
 title: Thanos
 menu:
@@ -1,5 +1,5 @@
 ---
-weight: 207
+weight: 209
 toc: true
 title: Troubleshooting
 menu:
@@ -1,25 +1,52 @@
-<br>
-<div class="alert alert-info" role="alert">
-<i class="fa fa-exclamation-triangle"></i><b> Note:</b> Starting with v0.39.0, Prometheus Operator requires use of Kubernetes v1.16.x and up.<br><br>
-</div>
+---
+weight: 152
+toc: true
+title: Alerting
+menu:
+  docs:
+    parent: user-guides
+lead: ""
+images: []
+draft: false
+description: Alerting guide
+---
 
-# Alerting
-
-This guide assumes you have a basic understanding of the `Prometheus` resource and have read the [getting started guide](getting-started.md).
+This guide assumes that you have a basic understanding of the Prometheus
+operator, and that you have already followed the [Getting Started]({{< ref
+"getting-started" >}}) guide.
+
+{{< alert icon="👉" text="Prometheus Operator requires use of Kubernetes v1.16.x and up."/>}}
 
-The Prometheus Operator introduces an Alertmanager resource, which allows users to declaratively describe an Alertmanager cluster. To successfully deploy an Alertmanager cluster, it is important to understand the contract between Prometheus and Alertmanager.
-
-The Alertmanager may be used to:
-
-* Deduplicate alerts fired by Prometheus
-* Silence alerts
-* Route and send grouped notifications via providers (PagerDuty, OpsGenie, ...)
+The Prometheus Operator introduces an `Alertmanager` resource, which allows
+users to declaratively describe an Alertmanager cluster. To successfully deploy
+an Alertmanager cluster, it is important to understand the contract between
+Prometheus and Alertmanager. Alertmanager is used to:
+
+* Deduplicate alerts received from Prometheus.
+* Silence alerts.
+* Route and send grouped notifications to various integrations (PagerDuty, OpsGenie, mail, chat, ...).
 
-Prometheus' configuration also includes "rule files", which contain the [alerting rules](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/). When an alerting rule triggers it fires that alert against *all* Alertmanager instances, on *every* rule evaluation interval. The Alertmanager instances communicate to each other which notifications have already been sent out. For more information on this system design, see the [High Availability scheme description](../high-availability.md).
-
-The Prometheus Operator also introduces an AlertmanagerConfig resource, which allows users to declaratively describe Alertmanager configurations. The AlertmanagerConfig resource is currently v1alpha1, testing and feedback are welcome.
+The Prometheus Operator also introduces an `AlertmanagerConfig` resource, which
+allows users to declaratively describe Alertmanager configurations.
+
+> Note: The AlertmanagerConfig resource is currently v1alpha1, testing and feedback are welcome.
+
+Prometheus' configuration also includes "rule files", which contain the
+[alerting
+rules](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/).
+When an alerting rule triggers, it fires that alert against *all* Alertmanager
+instances, on *every* rule evaluation interval. The Alertmanager instances
+communicate to each other which notifications have already been sent out. For
+more information on this system design, see the [High Availability]({{< ref "high-availability" >}})
+page.
 
-First, create an example Alertmanager cluster, with three instances.
+## Pre-requisites
+
+You have a running Prometheus operator.
+
+## Deploying Alertmanager
+
+First, let's create an Alertmanager cluster with three replicas:
 
 ```yaml mdox-exec="cat example/user-guides/alerting/alertmanager-example.yaml"
 apiVersion: monitoring.coreos.com/v1
@@ -30,13 +57,87 @@ spec:
   replicas: 3
 ```
 
-The Alertmanager instances will not be able to start up, unless a valid configuration is given. A config file Secret will be composed by taking an optional base config file Secret specified through the `configSecret` field in the Alertmanager resource Spec, and merging that with any AlertmanagerConfig resources that get matched by using the `alertmanagerConfigSelector` and `alertmanagerConfigNamespaceSelector` selectors from the `Alertmanager` resource.
+Wait for all Alertmanager pods to be ready:
 
-For more information on configuring Alertmanager, see the Prometheus [Alerting Configuration document](https://prometheus.io/docs/alerting/configuration/).
+```bash
+kubectl get pods -l alertmanager=main -w
+```
 
-## AlertmanagerConfig Resource
+## Managing Alertmanager configuration
 
-The following example configuration creates an AlertmanagerConfig resource that sends notifications to a non-existent `wechat` receiver:
+By default, the Alertmanager instances will start with a minimal configuration
+which isn't really useful since it doesn't send any notification when receiving
+alerts.
+
+You have several options to provide the [Alertmanager configuration](https://prometheus.io/docs/alerting/configuration/):
+1. You can use a native Alertmanager configuration file stored in a Kubernetes secret.
+2. You can use `spec.alertmanagerConfiguration` to reference an
+   AlertmanagerConfig object in the same namespace which defines the main
+   Alertmanager configuration.
+3. You can define `spec.alertmanagerConfigSelector` and
+   `spec.alertmanagerConfigNamespaceSelector` to tell the operator which
+   AlertmanagerConfig objects should be selected and merged with the main
+   Alertmanager configuration.
+
+### Using a Kubernetes Secret
+
+The following native Alertmanager configuration sends notifications to a fictitious webhook service:
+
+```yaml mdox-exec="cat example/user-guides/alerting/alertmanager.yaml"
+route:
+  group_by: ['job']
+  group_wait: 30s
+  group_interval: 5m
+  repeat_interval: 12h
+  receiver: 'webhook'
+receivers:
+- name: 'webhook'
+  webhook_configs:
+  - url: 'http://example.com/'
+```
+
+Save the above configuration in a file called `alertmanager.yaml` in the local directory and create a Secret from it:
+
+```bash
+kubectl create secret generic alertmanager-example --from-file=alertmanager.yaml
+```
+
+The Prometheus operator requires the Secret to be named like
+`alertmanager-{ALERTMANAGER_NAME}`. In the previous example, the name of the
+Alertmanager is `example`, so the secret name must be `alertmanager-example`.
+The name of the key holding the configuration data in the Secret has to be
+`alertmanager.yaml`.
+
+> Note: if you want to use a different secret name, you can specify it with the `spec.configSecret` field in the Alertmanager resource.
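As an illustration, a minimal sketch of an Alertmanager resource that reads its configuration from a custom-named Secret via `spec.configSecret` (the Secret name here is hypothetical, not taken from the diff):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: example
spec:
  replicas: 3
  # Hypothetical Secret name; when configSecret is omitted, the operator
  # falls back to the "alertmanager-example" Secret described above.
  configSecret: my-alertmanager-config
```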
+
+The Alertmanager configuration may reference custom templates or password files
+on disk. These can be added to the Secret along with the `alertmanager.yaml`
+configuration file. For example, provided that we have the following Secret:
+
+```yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: alertmanager-example
+data:
+  alertmanager.yaml: {BASE64_CONFIG}
+  template_1.tmpl: {BASE64_TEMPLATE_1}
+  template_2.tmpl: {BASE64_TEMPLATE_2}
+```
+
+Templates will be accessible to the Alertmanager container under the
+`/etc/alertmanager/config` directory. The Alertmanager
+configuration can reference them like this:
+
+```yaml
+templates:
+- '/etc/alertmanager/config/*.tmpl'
+```
+
+### Using AlertmanagerConfig Resources
+
+The following example configuration creates an AlertmanagerConfig resource that
+sends notifications to a fictitious webhook service.
+
 ```yaml mdox-exec="cat example/user-guides/alerting/alertmanager-config-example.yaml"
 apiVersion: monitoring.coreos.com/v1alpha1
@@ -51,33 +152,23 @@ spec:
     groupWait: 30s
     groupInterval: 5m
     repeatInterval: 12h
-    receiver: 'wechat-example'
+    receiver: 'webhook'
   receivers:
-  - name: 'wechat-example'
-    wechatConfigs:
-    - apiURL: 'http://wechatserver:8080/'
-      corpID: 'wechat-corpid'
-      apiSecret:
-        name: 'wechat-config'
-        key: 'apiSecret'
-
----
-apiVersion: v1
-kind: Secret
-type: Opaque
-metadata:
-  name: wechat-config
-data:
-  apiSecret: d2VjaGF0LXNlY3JldAo=
+  - name: 'webhook'
+    webhookConfigs:
+    - url: 'http://example.com/'
 ```
 
-Save the above AlertmanagerConfig in a file called `alertmanager-config.yaml` and create a resource from it using `kubectl`.
+Create the AlertmanagerConfig resource in your cluster:
 
 ```bash
-$ kubectl create -f alertmanager-config.yaml
+curl -sL https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/example/user-guides/alerting/alertmanager-config-example.yaml | kubectl create -f -
 ```
 
-The `alertmanagerConfigSelector` field in the Alertmanager resource Spec needs to be specified so that the operator can select such AlertmanagerConfig resources. In the previous example, the label `alertmanagerConfig: example` is added, so the Alertmanager instance should be updated, adding the `alertmanagerConfigSelector`:
+The `spec.alertmanagerConfigSelector` field in the Alertmanager resource
+needs to be updated so the operator selects AlertmanagerConfig resources. In
+the previous example, the label `alertmanagerConfig: example` is added, so the
+Alertmanager object should be updated like this:
 
 ```yaml mdox-exec="cat example/user-guides/alerting/alertmanager-selector-example.yaml"
 apiVersion: monitoring.coreos.com/v1
@@ -91,9 +182,11 @@ spec:
       alertmanagerConfig: example
 ```
 
-## Specify Global Alertmanager Config
+### Using AlertmanagerConfig for global configuration
 
-The following example configuration creates an Alertmanager resource that specifies an AlertmanagerConfig resource to be global (it won't force-add a `namespace` label in routes and inhibitRules):
+The following example configuration creates an Alertmanager resource that uses
+an AlertmanagerConfig resource for the Alertmanager configuration
+instead of the `alertmanager-main` secret.
 
 ```yaml mdox-exec="cat example/user-guides/alerting/alertmanager-example-alertmanager-configuration.yaml"
 apiVersion: monitoring.coreos.com/v1
@@ -107,62 +200,15 @@ spec:
     name: example-config
 ```
 
-The AlertmanagerConfig resource named `example-config` in namespace `default` will be a global AlertmanagerConfig. When generating the Alertmanager configuration, routes and inhibitRules in the AlertmanagerConfig will not have a namespace label force-added.
+The AlertmanagerConfig resource named `example-config` in namespace `default`
+will be a global AlertmanagerConfig. When the operator generates the
+Alertmanager configuration from it, the namespace label will not be enforced
+for routes and inhibition rules.
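A sketch of what such a manifest could look like (the names follow the examples above; this is an assumption, not the verbatim content of `alertmanager-example-alertmanager-configuration.yaml`):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: example
  namespace: default
spec:
  replicas: 3
  # References an AlertmanagerConfig object in the same namespace which
  # becomes the global Alertmanager configuration.
  alertmanagerConfiguration:
    name: example-config
```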
 
-## Manually Managed Secret
+## Exposing the Alertmanager service
 
-The following example configuration sends notifications to a `webhook`:
-
-```yaml mdox-exec="cat example/user-guides/alerting/alertmanager.yaml"
-global:
-  resolve_timeout: 5m
-route:
-  group_by: ['job']
-  group_wait: 30s
-  group_interval: 5m
-  repeat_interval: 12h
-  receiver: 'webhook'
-receivers:
-- name: 'webhook'
-  webhook_configs:
-  - url: 'http://alertmanagerwh:30500/'
-```
-
-Save the above Alertmanager config in a file called `alertmanager.yaml` and create a secret from it using `kubectl`.
-
-Alertmanager instances require the secret resource naming to follow the format `alertmanager-{ALERTMANAGER_NAME}`. In the previous example, the name of the Alertmanager is `example`, so the secret name must be `alertmanager-example`, and the name of the config file `alertmanager.yaml`. Also, the name of the secret can be set through the field `configSecret` in the Alertmanager configuration, if you desire to use a different one.
-
-```bash
-$ kubectl create secret generic alertmanager-example --from-file=alertmanager.yaml
-```
-
-Note that Alertmanager configurations can use templates (`.tmpl` files), which can be added on the secret along with the `alertmanager.yaml` config file. For example:
-
-```yaml
-apiVersion: v1
-kind: Secret
-metadata:
-  name: alertmanager-example
-data:
-  alertmanager.yaml: {BASE64_CONFIG}
-  template_1.tmpl: {BASE64_TEMPLATE_1}
-  template_2.tmpl: {BASE64_TEMPLATE_2}
-  ...
-```
-
-Templates will be placed on the same path as the configuration. To load the templates, the configuration (`alertmanager.yaml`) should point to them:
-
-```yaml
-templates:
-- '*.tmpl'
-```
-
-## Expose Alertmanager
-
-Once the operator merges the optional manually specified Secret with any selected `AlertmanagerConfig` resources, a new configuration Secret is created with the name `alertmanager-<Alertmanager name>-generated`, and is mounted into Alertmanager Pods created through the Alertmanager object.
-
-To be able to view the web UI, expose it through a Service. A simple way to do this is to use a Service of type `NodePort`.
+To access the Alertmanager interface, you have to expose the service to the outside. For
+simplicity, we use a `NodePort` Service.
 
 ```yaml mdox-exec="cat example/user-guides/alerting/alertmanager-example-service.yaml"
 apiVersion: v1
@@ -181,11 +227,19 @@ spec:
     alertmanager: example
 ```
 
-Once created it allows the web UI to be accessible via a Node's IP and the port `30903`.
+Once the Service is created, the Alertmanager web server is available under the
+node's IP address on port `30903`.
 
-## Fire Alerts
+> Note: Exposing the Alertmanager web server this way may not be an applicable solution. Read more about the possible options in the [Ingress guide](exposing-prometheus-and-alertmanager.md).
 
-This Alertmanager cluster is now fully functional and highly available, but no alerts are fired against it. Create Prometheus instances to fire alerts to the Alertmanagers.
+## Integrating with Prometheus
+
+### Configuring Alertmanager in Prometheus
+
+This Alertmanager cluster is now fully functional and highly available, but no
+alerts are fired against it.
+
+First, create a Prometheus instance that will send alerts to the Alertmanager cluster:
 
 ```yaml mdox-exec="cat example/user-guides/alerting/prometheus-example.yaml"
 apiVersion: monitoring.coreos.com/v1
@@ -208,19 +262,29 @@ spec:
     prometheus: example
 ```
 
-The above configuration specifies a `Prometheus` that finds all of the Alertmanagers behind the `Service` created with `alertmanager-example-service.yaml`. The `alertmanagers` `name` and `port` fields should match those of the `Service` to allow this to occur.
+The `Prometheus` resource discovers all of the Alertmanager instances behind
+the `Service` created before (pay attention to the `name`, `namespace` and `port`
+fields, which should match the definition of the Alertmanager Service).
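For reference, a minimal sketch of the `alerting` section such a Prometheus resource would carry (the Service name and port are assumptions based on the Service created earlier, not the verbatim content of `prometheus-example.yaml`):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
spec:
  replicas: 2
  alerting:
    alertmanagers:
    # Assumed to point at the NodePort Service created above.
    - namespace: default
      name: alertmanager-example
      port: web
```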
 
-### Rule Selection
+Open the Prometheus web interface, go to the "Status > Runtime & Build
+Information" page and check that Prometheus has discovered 3 Alertmanager
+instances.
 
-Prometheus rule files are held in `PrometheusRule` custom resources. Use the label selector field `ruleSelector` in the Prometheus object to define the rule files that you want to be mounted into Prometheus.
+### Deploying Prometheus Rules
 
-By default, only `PrometheusRule` custom resources in the same namespace as the `Prometheus` custom resource are discovered.
+The `PrometheusRule` CRD allows defining alerting and recording rules. The
+operator knows which PrometheusRule objects to select for a given Prometheus
+based on the `spec.ruleSelector` field.
 
-This can be further controlled with the `ruleNamespaceSelector` field, which is a [`metav1.LabelSelector`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.23/#labelselector-v1-meta).
+> Note: by default, `spec.ruleSelector` is nil meaning that the operator picks up no rule.
 
-To discover from all namespaces, pass an empty dict (`ruleNamespaceSelector: {}`).
+By default, the Prometheus resource discovers only `PrometheusRule` resources
+in the same namespace. This can be refined with the `ruleNamespaceSelector` field:
+* To discover rules from all namespaces, pass an empty dict (`ruleNamespaceSelector: {}`).
+* To discover rules from all namespaces matching a certain label, use the `matchLabels` field.
 
-To discover from all namespaces with a certain label, use the `matchLabels` field:
+Discover `PrometheusRule` resources with `role=alert-rules` and
+`prometheus=example` labels from all namespaces with the `team=frontend` label:
 
 ```yaml mdox-exec="cat example/user-guides/alerting/prometheus-example-rule-namespace-selector.yaml"
 apiVersion: monitoring.coreos.com/v1
@@ -246,13 +310,14 @@ spec:
       team: frontend
 ```
 
-This will discover `PrometheusRule` custom resources from all namespaces with a `team=frontend` label.
+In case you want to select individual namespaces by their name, you can use the
+`kubernetes.io/metadata.name` label, which gets populated automatically with
+the
+[`NamespaceDefaultLabelName`](https://kubernetes.io/docs/reference/labels-annotations-taints/#kubernetes-io-metadata-name)
+feature gate.
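For example, a hypothetical selector picking only the namespace named `my-namespace` by its name label:

```yaml
ruleNamespaceSelector:
  matchLabels:
    kubernetes.io/metadata.name: my-namespace
```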
 
-In case you want to select individual namespaces by their name, you can use the `kubernetes.io/metadata.name` label, which gets populated automatically with the [`NamespaceDefaultLabelName`](https://kubernetes.io/docs/reference/labels-annotations-taints/#kubernetes-io-metadata-name) feature gate.
-
-### `PrometheusRule` labelling
-
-The best practice is to label the `PrometheusRule`s containing rule files with `role: alert-rules` as well as the name of the Prometheus object, `prometheus: example` in this case.
+Create a PrometheusRule object from the following manifest. Note that the
+object's labels match the `spec.ruleSelector` of the Prometheus object.
 
 ```yaml mdox-exec="cat example/user-guides/alerting/prometheus-example-rules.yaml"
 apiVersion: monitoring.coreos.com/v1
@@ -271,26 +336,8 @@ spec:
       expr: vector(1)
 ```
 
-The example `PrometheusRule` always immediately triggers an alert, which is only for demonstration purposes. To validate that everything is working properly, have a look at each of the Prometheus web UIs.
-
-Use kubectl's proxy functionality to view the web UI without a Service.
-
-Run:
-
-```bash
-kubectl proxy --port=8001
-```
-
-Then the web UI of each Prometheus instance can be viewed. They both have a firing alert called `ExampleAlert`, as defined in the loaded alerting rules.
-
-* http://localhost:8001/api/v1/proxy/namespaces/default/pods/prometheus-example-0:9090/alerts
-* http://localhost:8001/api/v1/proxy/namespaces/default/pods/prometheus-example-1:9090/alerts
-
-Looking at the status page for "Runtime & Build Information" on the Prometheus web UI shows the discovered and active Alertmanagers that the Prometheus instance will fire alerts against.
-
-* http://localhost:8001/api/v1/proxy/namespaces/default/pods/prometheus-example-0:9090/status
-* http://localhost:8001/api/v1/proxy/namespaces/default/pods/prometheus-example-1:9090/status
-
-These show three discovered Alertmanagers.
-
-Heading to the Alertmanager web UI now shows one active alert, although all Prometheus instances are firing it. [Configuring the Alertmanager](https://prometheus.io/docs/alerting/configuration/) further allows custom alert routing, grouping and notification mechanisms.
+For demonstration purposes, the PrometheusRule object always fires the
+`ExampleAlert` alert. To validate that everything is working properly, you can
+open the Prometheus web interface again and go to the Alerts page.
+
+Next open the Alertmanager web interface and check that it shows one active alert.
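If the NodePort Service isn't reachable from your machine, a `kubectl port-forward` is a quick alternative for checking the Alertmanager web interface. The pod name below is an assumption derived from the `example` Alertmanager created in this guide (the operator names pods `alertmanager-<name>-<ordinal>`):

```bash
# Forward local port 9093 to the first Alertmanager pod, then
# open http://localhost:9093 in a browser.
kubectl port-forward alertmanager-example-0 9093:9093
```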
@@ -1,36 +1,72 @@
-<br>
-<div class="alert alert-info" role="alert">
-<i class="fa fa-exclamation-triangle"></i><b> Note:</b> Starting with v0.39.0, Prometheus Operator requires use of Kubernetes v1.16.x and up.
-This documentation is for an alpha feature. For questions and feedback on the Prometheus OCS Alpha program, email <a href="mailto:tectonic-alpha-feedback@coreos.com">tectonic-alpha-feedback@coreos.com</a>.
-</div>
+---
+weight: 151
+toc: true
+title: Getting Started
+menu:
+  docs:
+    parent: user-guides
+lead: ""
+images: []
+draft: false
+description: Getting started guide
+---
 
-# Prometheus Operator
-
-[Operators](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) were introduced by CoreOS as a class of software that operates other software, putting operational knowledge collected by humans into software.
-
-The Prometheus Operator serves to make running Prometheus on top of Kubernetes as easy as possible, while preserving Kubernetes-native configuration options.
-
-## Example Prometheus Operator manifest
-
-To follow this getting started guide you will need a Kubernetes cluster you have access to. This [example](../../bundle.yaml) describes a Prometheus Operator Deployment, and its required ClusterRole, ClusterRoleBinding, Service Account and Custom Resource Definitions.
-
-## Related resources
-
-The Prometheus Operator introduces additional resources in Kubernetes to declare the desired state of a Prometheus and Alertmanager cluster as well as the Prometheus configuration. The resources it introduces are:
+The Prometheus Operator's goal is to make running Prometheus on top of Kubernetes
+as easy as possible, while preserving Kubernetes-native configuration options.
+
+This guide will show you how to deploy the Prometheus operator, set up a
+Prometheus instance, and configure metrics collection for a sample application.
+
+{{< alert icon="👉" text="Prometheus Operator requires use of Kubernetes v1.16.x and up."/>}}
+
+> Note: [Operators](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/)
+> were introduced by CoreOS as a class of software that operates other software,
+> putting operational knowledge collected by humans into software.
+
+## Pre-requisites
+
+To follow this guide, you will need a Kubernetes cluster with admin permissions.
+
+## Installing the operator
+
+The first step is to install the operator's Custom Resource Definitions (CRDs) as well
+as the operator itself with the required RBAC resources.
+
+Run the following commands to install the CRDs and deploy the operator in the `default` namespace:
+
+```bash
+LATEST=$(curl -s https://api.github.com/repos/prometheus-operator/prometheus-operator/releases/latest | jq -cr .tag_name)
+curl -sL https://github.com/prometheus-operator/prometheus-operator/releases/download/${LATEST}/bundle.yaml | kubectl create -f -
+```
+
+It can take a few minutes for the operator to be up and running. You can check for completion with the following command:
+
+```bash
+kubectl wait --for=condition=Ready pods -l app.kubernetes.io/name=prometheus-operator -n default
+```
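You can also double-check that the CRDs have been registered; they all live in the `monitoring.coreos.com` API group:

```bash
# Lists the operator's CRDs (Prometheus, Alertmanager, ServiceMonitor, ...).
kubectl get crd | grep monitoring.coreos.com
```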
 
+The Prometheus Operator introduces custom resources in Kubernetes to declare
+the desired state of a Prometheus and Alertmanager cluster as well as the
+Prometheus configuration. For this guide, the resources of interest are:
 
 * `Prometheus`
 * `Alertmanager`
 * `ServiceMonitor`
 * `PodMonitor`
 
-> See the [Alerting guide](alerting.md) for more information about the `Alertmanager` resource, or the [Design document](../design.md) for an overview of all resources introduced by the Prometheus Operator.
+The `Prometheus` resource declaratively describes the desired state of a
+Prometheus deployment, while `ServiceMonitor` and `PodMonitor` resources
+describe the targets to be monitored by Prometheus.
 
-The Prometheus resource declaratively describes the desired state of a Prometheus deployment, while a ServiceMonitor describes the set of targets to be monitored by Prometheus.
-
-![Prometheus Operator Architecture](architecture.png)
+![Prometheus Operator Architecture](architecture.png)
+
+> Note: Check the [Alerting guide]({{< ref "alerting" >}}) for more information about the `Alertmanager` resource.
 
-The Prometheus resource includes a field called `serviceMonitorSelector`, which defines a selection of ServiceMonitors to be used. By default and before the version `v0.19.0`, ServiceMonitors must be installed in the same namespace as the Prometheus instance. With the Prometheus Operator `v0.19.0` and above, ServiceMonitors can be selected outside the Prometheus namespace via the `serviceMonitorNamespaceSelector` field of the Prometheus resource.
+> Note: Check the [Design page]({{< ref "design" >}}) for an overview of all resources introduced by the Prometheus Operator.
 
-First, deploy three instances of a simple example application, which listens and exposes metrics on port `8080`.
+## Deploying a sample application
+
+First, let's deploy a simple example application with 3 replicas which listens
+and exposes metrics on port `8080`.
 
 ```yaml mdox-exec="cat example/user-guides/getting-started/example-app-deployment.yaml"
 apiVersion: apps/v1
@@ -55,7 +91,9 @@ spec:
         containerPort: 8080
 ```
 
-The ServiceMonitor has a label selector to select Services and their underlying Endpoint objects. The Service object for the example application selects the Pods by the `app` label having the `example-app` value. The Service object also specifies the port on which the metrics are exposed.
+Let's expose the application with a Service object which selects all the Pods
+with the `app` label having the `example-app` value. The Service object also
+specifies the port on which the metrics are exposed.
 
 ```yaml mdox-exec="cat example/user-guides/getting-started/example-app-service.yaml"
 kind: Service
@@ -72,7 +110,10 @@ spec:
     port: 8080
 ```
 
-This Service object is discovered by a ServiceMonitor, which selects in the same way. The `app` label must have the value `example-app`.
+Finally, we create a ServiceMonitor object which selects all Service objects
+with the `app: example-app` label. The ServiceMonitor object also has a `team`
+label (in this case `team: frontend`) to identify which team is responsible for
+monitoring the application/service.
 
 ```yaml mdox-exec="cat example/user-guides/getting-started/example-app-service-monitor.yaml"
 apiVersion: monitoring.coreos.com/v1
@@ -89,11 +130,14 @@ spec:
   - port: web
 ```
 
-## Enable RBAC rules for Prometheus pods
+## Deploying Prometheus
 
-If [RBAC](https://kubernetes.io/docs/reference/access-authn-authz/authorization/) authorization is activated, you must create RBAC rules for both Prometheus *and* Prometheus Operator. A ClusterRole and a ClusterRoleBinding for the Prometheus Operator were created in the example Prometheus Operator manifest above. The same must be done for the Prometheus Pods.
+If
+[RBAC](https://kubernetes.io/docs/reference/access-authn-authz/authorization/)
+authorization is activated on your cluster, you must create the RBAC rules
+for the Prometheus service account beforehand.
 
-Create a ClusterRole and ClusterRoleBinding for the Prometheus Pods:
+Apply the following manifests to create the service account and required ClusterRole/ClusterRoleBinding:
 
 ```yaml mdox-exec="cat example/rbac/prometheus/prometheus-service-account.yaml"
 apiVersion: v1
@@ -144,11 +188,18 @@ subjects:
   namespace: default
 ```
 
-For more information, see the [Prometheus Operator RBAC guide](../rbac.md).
+For more information, see the [Prometheus Operator RBAC guide]({{< ref "rbac" >}}).
 
-## Include ServiceMonitors
+The Prometheus custom resource defines the characteristics of the underlying
+concrete StatefulSet (number of replicas, resource requests/limits, ...) as
+well as which ServiceMonitors should be included with the
+`spec.serviceMonitorSelector` field.
 
-A Prometheus object defines the `serviceMonitorSelector` to specify which ServiceMonitors should be included. Above the label `team: frontend` was specified, so that's what the Prometheus object selects by.
+Previously, we have created the ServiceMonitor object with the `team: frontend`
+label and here we define that the Prometheus object should select all
+ServiceMonitors with the `team: frontend` label. This enables the frontend team
+to create new ServiceMonitors and Services without having to reconfigure the
+Prometheus object.
 
 ```yaml mdox-exec="cat example/user-guides/getting-started/prometheus-service-monitor.yaml"
 apiVersion: monitoring.coreos.com/v1
@@ -166,13 +217,39 @@ spec:
   enableAdminAPI: false
 ```
 
-> If you have RBAC authorization activated, use the RBAC aware [Prometheus manifest](../../example/rbac/prometheus/prometheus.yaml) instead.
+To verify that the instance is up and running, run:
 
-This enables the frontend team to create new ServiceMonitors and Services which allow Prometheus to be dynamically reconfigured.
+```bash
+kubectl get -n default prometheus prometheus -w
+```
 
-## Include PodMonitors
+By default, Prometheus will only pick up ServiceMonitors from the current
+namespace. To select ServiceMonitors from other namespaces, you can update the
+`spec.serviceMonitorNamespaceSelector` field of the Prometheus resource.
 
-Finally, a Prometheus object defines the `podMonitorSelector` to specify which PodMonitors should be included. Above the label `team: frontend` was specified, so that's what the Prometheus object selects by.
+## Using PodMonitors
+
+Instead of a ServiceMonitor, we can use a PodMonitor which doesn't require the
+creation of a Kubernetes Service. In practice, the `spec.selector` label tells
+Prometheus which Pods should be scraped.
+
+```yaml mdox-exec="cat example/user-guides/getting-started/example-app-pod-monitor.yaml"
+apiVersion: monitoring.coreos.com/v1
+kind: PodMonitor
+metadata:
+  name: example-app
+  labels:
+    team: frontend
+spec:
+  selector:
+    matchLabels:
+      app: example-app
+  podMetricsEndpoints:
+  - port: web
+```
+
+Similarly, the Prometheus object defines which PodMonitors get selected with the
+`spec.podMonitorSelector` field.
 
 ```yaml mdox-exec="cat example/user-guides/getting-started/prometheus-pod-monitor.yaml"
 apiVersion: monitoring.coreos.com/v1
@@ -190,13 +267,10 @@ spec:
   enableAdminAPI: false
 ```
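The selector part of that manifest presumably mirrors the ServiceMonitor case; a minimal sketch, assuming the `team: frontend` label used throughout this guide (not the verbatim content of `prometheus-pod-monitor.yaml`):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  # Selects the PodMonitor created above by its team label.
  podMonitorSelector:
    matchLabels:
      team: frontend
```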
 
-> If you have RBAC authorization activated, use the RBAC aware [Prometheus manifest](../../example/rbac/prometheus/prometheus.yaml) instead.
+## Exposing the Prometheus service
 
-This enables the frontend team to create new PodMonitors which allow Prometheus to be dynamically reconfigured.
-
-## Expose the Prometheus instance
-
-To access the Prometheus instance it must be exposed to the outside. This example exposes the instance using a Service of type `NodePort`.
+To access the Prometheus interface, you have to expose the service to the outside. For
+simplicity, we use a `NodePort` Service.
 
 ```yaml mdox-exec="cat example/user-guides/getting-started/prometheus-service.yaml"
 apiVersion: v1
@@ -215,18 +289,25 @@ spec:
     prometheus: prometheus
 ```
 
-Once this Service is created the Prometheus web UI is available under the node's IP address on port `30900`. The targets page in the web UI now shows that the instances of the example application have successfully been discovered.
+Once the Service is created, the Prometheus web server is available under the
+node's IP address on port `30900`. The Targets page in the web interface should
+show that the instances of the example application have successfully been
+discovered.
 
-> Exposing the Prometheus web UI may not be an applicable solution. Read more about the possibilities of exposing it in the [exposing Prometheus and Alertmanager guide](exposing-prometheus-and-alertmanager.md).
+> Note: Exposing the Prometheus web server this way may not be an applicable solution. Read more about the possible options in the [Ingress guide](exposing-prometheus-and-alertmanager.md).
 
-## Expose the Prometheus Admin API
+## Exposing the Prometheus Admin API
 
-Prometheus Admin API allows access to delete series for a certain time range, cleanup tombstones, capture snapshots, etc. More information about the admin API can be found in [Prometheus official documentation](https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-admin-apis)
-This API access is disabled by default and can be toggled using this boolean flag. The following example exposes the admin API:
+The Prometheus Admin API allows access to delete series for a certain time range,
+clean up tombstones, capture snapshots, etc. More information about the admin
+API can be found in the [Prometheus official
+documentation](https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-admin-apis).
+This API access is disabled by default and can be toggled using this boolean
+flag. The following example exposes the admin API:
 
 > WARNING: Enabling the admin APIs enables mutating endpoints, to delete data,
 > shutdown Prometheus, and more. Enabling this should be done with care and the
-> user is advised to add additional authentication authorization via a proxy to
+> user is advised to add additional authentication/authorization via a proxy to
 > ensure only clients authorized to perform these actions can do so.
 
 ```yaml mdox-exec="cat example/user-guides/getting-started/prometheus-admin-api.yaml"
@@ -245,6 +326,6 @@ spec:
   enableAdminAPI: true
 ```
 
-Further reading:
+Next:
 
-* [Alerting](alerting.md) describes using the Prometheus Operator to manage Alertmanager clusters.
+* [Alerting]({{< ref "alerting" >}}) describes using the Prometheus Operator to manage Alertmanager clusters.
@@ -1,15 +1,28 @@
-<br>
-<div class="alert alert-info" role="alert">
-<i class="fa fa-exclamation-triangle"></i><b> Note:</b> Starting with v0.39.0, Prometheus Operator requires use of Kubernetes v1.16.x and up.
-</div>
+---
+weight: 206
+toc: true
+title: Storage
+menu:
+  docs:
+    parent: operator
+lead: ""
+images: []
+draft: false
+description: Storage considerations
+---
 
-# Storage
+By default, the operator configures Pods to store data on `emptyDir` volumes
+which aren't persisted when the Pods are redeployed. To maintain data across
+deployments and version upgrades, you can configure persistent storage for
+Prometheus, Alertmanager and ThanosRuler resources.
 
-To maintain data across deployments and version upgrades, the data must be persisted to some volume other than `emptyDir`, allowing it to be reused by Pods after an upgrade.
+Kubernetes supports several kinds of storage volumes. The Prometheus Operator
+works with PersistentVolumeClaims, which allow the underlying
+PersistentVolume to be provisioned when requested.
 
-Kubernetes supports several kinds of storage volumes. The Prometheus Operator works with PersistentVolumeClaims, which allow the underlying PersistentVolume to be provisioned when requested.
-
-This document assumes a basic understanding of PersistentVolumes, PersistentVolumeClaims, and their [provisioning](https://kubernetes.io/docs/user-guide/persistent-volumes/#provisioning).
+This document assumes a basic understanding of PersistentVolumes,
+PersistentVolumeClaims, and their
+[provisioning](https://kubernetes.io/docs/user-guide/persistent-volumes/#provisioning).
 
 ## Storage Provisioning on AWS
@@ -25,11 +38,19 @@ parameters:
   type: gp2
 ```
 
-> Make sure that AWS as a cloud provider is properly configured with your cluster, or storage provisioning will not work.
+> Note: Make sure that AWS as a cloud provider is properly configured with your cluster, or storage provisioning will not work.
 
-For best results, use volumes that have high I/O throughput. These examples use SSD EBS volumes. Read the Kubernetes [Persistent Volumes](https://kubernetes.io/docs/user-guide/persistent-volumes/#aws) documentation to adapt this `StorageClass` to your needs.
+For best results, use volumes that have high I/O throughput. These examples use
+SSD EBS volumes. Read the Kubernetes [Persistent
+Volumes](https://kubernetes.io/docs/user-guide/persistent-volumes/#aws)
+documentation to adapt this `StorageClass` to your needs.
 
-The `StorageClass` that was created can be specified in the `storage` section in the `Prometheus` resource (note that if you're using [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus), then instead of making the following change to your `Prometheus` resource, see the [prometheus-pvc.jsonnet](https://github.com/prometheus-operator/kube-prometheus/blob/main/examples/prometheus-pvc.jsonnet) example).
+The `StorageClass` that was created can be specified in the `storage` section
+in the `Prometheus` resource (note that if you're using
+[kube-prometheus](https://github.com/prometheus-operator/kube-prometheus), then
+instead of making the following change to your `Prometheus` resource, see the
+[prometheus-pvc.jsonnet](https://github.com/prometheus-operator/kube-prometheus/blob/main/examples/prometheus-pvc.jsonnet)
+example).
 
 ```yaml mdox-exec="cat example/storage/persisted-prometheus.yaml"
 apiVersion: monitoring.coreos.com/v1
@@ -46,17 +67,26 @@ spec:
         storage: 40Gi
 ```
 
-> The full documentation of the `storage` field can be found in the [API documentation](../api.md#monitoring.coreos.com/v1.StorageSpec).
+> The full documentation of the `storage` field can be found in the [API reference]({{< ref "api" >}}).
 
-When creating the Prometheus object, a PersistentVolumeClaim is used for each Pod in the StatefulSet, and the storage should automatically be provisioned, mounted and used.
+When creating the Prometheus object, a PersistentVolumeClaim is used for each
+Pod in the StatefulSet, and the storage should automatically be provisioned,
+mounted and used.
+
+The same approach should work with other cloud providers (GCP, Azure, ...) and
+any Kubernetes storage provider supporting dynamic provisioning.
 
 ## Manual storage provisioning
 
-The Prometheus CRD specification allows you to support arbitrary storage through a PersistentVolumeClaim.
+The Prometheus CRD specification allows you to support arbitrary storage
+through a PersistentVolumeClaim.
 
-The easiest way to use a volume that cannot be automatically provisioned (for whatever reason) is to use a label selector alongside a manually created PersistentVolume.
+The easiest way to use a volume that cannot be automatically provisioned (for
+whatever reason) is to use a label selector alongside a manually created
+PersistentVolume.
 
-For example, using an NFS volume might be accomplished with the following specifications:
+For example, using an NFS volume might be accomplished with the following
+manifests:
 
 ```yaml
 apiVersion: monitoring.coreos.com/v1
@@ -66,7 +96,7 @@ metadata:
   labels:
     prometheus: example
 spec:
-...
+  replicas: 1
   storage:
     volumeClaimTemplate:
       spec:
@@ -76,9 +106,7 @@ spec:
       resources:
        requests:
          storage: 50Gi
-
 ---
-
 apiVersion: v1
 kind: PersistentVolume
 metadata:
@@ -97,11 +125,19 @@ spec:
 
 ### Disabling Default StorageClasses
 
-To manually provision volumes (as of Kubernetes 1.6.0), you may need to disable the default StorageClass that is automatically created for certain Cloud Providers. Default StorageClasses are pre-installed on Azure, AWS, GCE, OpenStack, and vSphere.
+To manually provision volumes (as of Kubernetes 1.6.0), you may need to disable
+the default StorageClass that is automatically created for certain Cloud
+Providers. Default StorageClasses are pre-installed on Azure, AWS, GCE,
+OpenStack, and vSphere.
 
-The default StorageClass behavior will override manual storage provisioning, preventing PersistentVolumeClaims from automatically binding to manually created PersistentVolumes.
+The default StorageClass behavior will override manual storage provisioning,
+preventing PersistentVolumeClaims from automatically binding to manually
+created PersistentVolumes.
 
-To override this behavior, you must explicitly create the same resource, but set it to *not* be default. (See the [changelog](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.6.md#volumes) for more information.)
+To override this behavior, you must explicitly create the same resource, but
+set it to *not* be default (see the
+[changelog](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.6.md#volumes)
+for more information).
 
 For example, to disable default StorageClasses on a Google Container Engine cluster, create the following StorageClass:
@@ -118,3 +154,48 @@ parameters:
   type: pd-ssd
   zone: us-east1-d
 ```
+
+## Resizing volumes
+
+Even if the StorageClass supports resizing, Kubernetes doesn't support (yet)
+volume expansion through StatefulSets. This means that when you update the
+storage requests in the `spec.storage` field of a custom resource, it doesn't
+get propagated to the associated PVCs (more details in the [KEP
+issue](https://github.com/kubernetes/enhancements/issues/661)).
+
+It is still possible to fix the situation manually.
+
+First, update the storage request in the `spec.storage` field of the custom resource (assuming a Prometheus resource named `example`):
+
+```yaml
+apiVersion: monitoring.coreos.com/v1
+kind: Prometheus
+metadata:
+  name: example
+spec:
+  replicas: 1
+  storage:
+    volumeClaimTemplate:
+      spec:
+        resources:
+          requests:
+            storage: 10Gi
+```
+
+Next, patch every PVC with the updated storage request:
+
+```bash
+for p in $(kubectl get pvc -l operator.prometheus.io/name=example -o jsonpath='{range .items[*]}{.metadata.name} {end}'); do
+  kubectl patch pvc/${p} --patch '{"spec": {"resources": {"requests": {"storage":"10Gi"}}}}'
+done
+```
+
+Last, delete the underlying StatefulSet using the `orphan` deletion strategy:
+
+```bash
+kubectl delete statefulset -l operator.prometheus.io/name=example --cascade=orphan
+```
+
+The operator should recreate the StatefulSet immediately. There will be no
+service disruption thanks to the `orphan` strategy, and the volumes mounted in
+the Pods should have the updated size.
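To confirm that the resize took effect, you can list the PVCs and their requested sizes; the label below follows the convention used in the commands above:

```bash
# Prints each PVC of the "example" Prometheus with its requested storage size.
kubectl get pvc -l operator.prometheus.io/name=example \
  -o custom-columns='NAME:.metadata.name,SIZE:.spec.resources.requests.storage'
```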
@@ -1,16 +1,35 @@
-# Strategic merge patch
+---
+weight: 207
+toc: true
+title: Strategic Merge Patch
+menu:
+  docs:
+    parent: operator
+lead: ""
+images: []
+draft: false
+description: Using strategic merge patch to overwrite container definition.
+---
 
-When users need to apply a specific configuration to containers that we do not support or do not currently exist, merge patch can be used.
-This document describes how to overwrite the container configuration generated by the operator by merging patches.
+This document describes how to overwrite the configuration generated by the
+operator using [strategic merge
+patches](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/update-api-object-kubectl-patch/#use-a-strategic-merge-patch-to-update-a-deployment).
 
-## How the "strategic merge patch" works
+When users need to apply a specific configuration to the containers that is
+either not exposed in the custom resource definitions or already defined by
+the operator, strategic merge patch can be used.
 
-The operator supports the `containers` field in the `PrometheusSpec`, `AlertmanagerSpec` and `ThanosRulerSpec` configuration.
-This field allows injecting additional containers, and the existing configuration can be overwritten by sharing the same container name.
+## How does it work?
 
-### Merge patch example of Prometheus
+The `Prometheus`, `Alertmanager`, and `ThanosRuler` CRDs expose a
+`spec.containers` field which allows you to:
+* Override fields for the containers generated by the operator.
+* Inject additional containers.
 
-The following manifest overwrites the `failureThreshold` value of the readiness probe for the Prometheus container.
+## Merging patch for Prometheus
+
+The following manifest overwrites the `failureThreshold` value of the startup
+probe of the Prometheus container:
 
 ```yaml
 apiVersion: monitoring.coreos.com/v1
@@ -23,13 +42,14 @@ metadata:
 spec:
   containers:
   - name: prometheus
-    readinessProbe:
+    startupProbe:
       failureThreshold: 500
 ```
 
-### Merge patch example for Alertmanager
+## Merging patch for Alertmanager
 
-The following manifest overwrites the `failureThreshold` values of the readiness and liveness probes for the Alertmanager container.
+The following manifest overwrites the `failureThreshold` values of the
+readiness and liveness probes for the Alertmanager container.
 
 ```yaml
 apiVersion: monitoring.coreos.com/v1
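A sketch of what such an Alertmanager manifest could look like; the threshold values here are illustrative assumptions, not taken from the diff:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: example
spec:
  containers:
  # Same container name as the one generated by the operator, so the
  # fields below are merged into (not appended to) its definition.
  - name: alertmanager
    livenessProbe:
      failureThreshold: 20   # illustrative value
    readinessProbe:
      failureThreshold: 20   # illustrative value
```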
@@ -10,21 +10,8 @@ spec:
     groupWait: 30s
     groupInterval: 5m
     repeatInterval: 12h
-    receiver: 'wechat-example'
+    receiver: 'webhook'
   receivers:
-  - name: 'wechat-example'
-    wechatConfigs:
-    - apiURL: 'http://wechatserver:8080/'
-      corpID: 'wechat-corpid'
-      apiSecret:
-        name: 'wechat-config'
-        key: 'apiSecret'
-
----
-apiVersion: v1
-kind: Secret
-type: Opaque
-metadata:
-  name: wechat-config
-data:
-  apiSecret: d2VjaGF0LXNlY3JldAo=
+  - name: 'webhook'
+    webhookConfigs:
+    - url: 'http://example.com/'
@@ -1,5 +1,3 @@
-global:
-  resolve_timeout: 5m
 route:
   group_by: ['job']
   group_wait: 30s
@@ -9,4 +7,4 @@ route:
 receivers:
 - name: 'webhook'
   webhook_configs:
-  - url: 'http://alertmanagerwh:30500/'
+  - url: 'http://example.com/'
scripts/docs/templates/pkg.tpl (vendored, 6 lines changed)
@@ -4,10 +4,8 @@ title: "API reference"
 description: "Prometheus operator generated API reference docs"
 draft: false
 images: []
-menu:
-  docs:
-    parent: "operator"
-weight: 208
+menu: "operator"
+weight: 210
 toc: true
 ---