Mirror of https://github.com/prometheus-operator/prometheus-operator.git
Merge pull request #114 from brancz/exposing-metrics
Documentation: exposing metrics

Commit: baf9688008
4 changed files with 83 additions and 24 deletions

@@ -51,12 +51,12 @@ A healthy node would be one that has joined the existing mesh network and has
 been communicated the state that it missed while that particular instance was
 down for the upgrade.
 
-Currently there is no way to tell whether an Alertmanger instance is healthy
+Currently there is no way to tell whether an Alertmanager instance is healthy
 under the above conditions. There are discussions of using vector clocks to
 resolve merges in the above mentioned situation, and ensure on a best effort
 basis that joining the network was successful.
 
-> Note that single instance Alertmanger setups will therefore not have zero
+> Note that single instance Alertmanager setups will therefore not have zero
 > downtime on deployments.
 
 The current implementation of rolling deployments simply decides based on the

Documentation/exposing-metrics.md (new file, 48 lines)

@@ -0,0 +1,48 @@

# Exposing Metrics

There are a number of
[applications](https://prometheus.io/docs/instrumenting/exporters/#directly-instrumented-software)
that are natively instrumented with Prometheus metrics; these applications
simply expose their metrics through an HTTP server.

The Prometheus developers and the community maintain [client
libraries](https://prometheus.io/docs/instrumenting/clientlibs/#client-libraries)
for various languages. If you want to monitor your own applications and
instrument them natively, chances are there is already a client library for
your language.

Not all software is natively instrumented with Prometheus metrics; however,
much of it does record metrics in some other form. For these kinds of
applications there are so-called
[exporters](https://prometheus.io/docs/instrumenting/exporters/#third-party-exporters).

Exporters can generally be divided into two categories:

* Instance exporters: These expose metrics about a single instance of an
  application, for example the HTTP requests that a single HTTP server has
  served. These exporters are deployed as a
  [side-car](http://blog.kubernetes.io/2015/06/the-distributed-system-toolkit-patterns.html)
  container in the same pod as the actual instance of the respective
  application (see the sketch after this list). A real-life example is the
  [`dnsmasq` metrics
  sidecar](https://github.com/kubernetes/dns/blob/master/docs/sidecar/README.md),
  which converts the proprietary metrics format communicated over the DNS
  protocol by dnsmasq to the Prometheus exposition format and exposes it on an
  HTTP server.

* Cluster-state exporters: These expose metrics about an entire system, which
  may be native to the environment the application constructs. For example,
  this could be the number of 3D objects in a game, or metrics about a
  Kubernetes deployment. These exporters are typically deployed as a normal
  Kubernetes deployment, but this can vary depending on the nature of the
  particular exporter. A real-life example is the
  [`kube-state-metrics`](https://github.com/kubernetes/kube-state-metrics)
  exporter, which exposes metrics about the cluster state of a Kubernetes
  cluster.
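
To make the side-car pattern concrete, below is a minimal sketch of a pod that
runs an application next to an exporter container. The image names, port
numbers, and label are hypothetical and only illustrate the shape of the
pattern:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  labels:
    app: example-app                  # hypothetical label, used later by a Service selector
spec:
  containers:
  - name: app
    image: example/app:1.0            # hypothetical application image
    ports:
    - name: http
      containerPort: 8080
  - name: metrics-exporter            # side-car translating the app's metrics
    image: example/app-exporter:1.0   # hypothetical exporter image
    ports:
    - name: metrics                   # Prometheus scrapes this port
      containerPort: 9102
```

Because both containers share the pod's network namespace, the exporter can
reach the application on `localhost` and re-expose its metrics in the
Prometheus exposition format on its own port.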

Lastly, in some cases it is not a viable option to expose metrics via an HTTP
server. For example, a `CronJob` may only run for a few seconds - not long
enough for Prometheus to be able to scrape the HTTP endpoint. The Pushgateway
was developed to be able to collect metrics in a scenario like that; however,
it is highly recommended not to use the Pushgateway if possible. Read more
about when to use the Pushgateway and alternative strategies here:
https://prometheus.io/docs/practices/pushing/#should-i-be-using-the-pushgateway .

@@ -111,6 +111,7 @@ it brought up as data sources in potential Grafana deployments.
 Prometheus instances are deployed with default values for requested and maximum
 resource usage of CPU and memory. This will be made configurable in the `Prometheus`
 TPR eventually.
 
 Prometheus comes with a variety of configuration flags for its storage engine that
 have to be tuned for better performance in large Prometheus servers. It will be the
 operator's job to tune those correctly to be aligned with the experienced load
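
For illustration of the kind of flags meant here: the Prometheus 1.x
local-storage engine is tuned through command-line flags on the Prometheus
container. A purely illustrative sketch, with assumed values that are not
recommendations:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: prometheus-storage-tuning-example   # hypothetical standalone example
spec:
  containers:
  - name: prometheus
    image: quay.io/prometheus/prometheus:v1.5.2   # version assumed for illustration
    args:
    - -storage.local.retention=168h          # keep one week of data
    - -storage.local.memory-chunks=1048576   # chunks held in memory; sized to the expected load
    ports:
    - name: web
      containerPort: 9090
```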

@@ -2,18 +2,41 @@
 
 The `ServiceMonitor` third party resource (TPR) allows to declaratively define
 how a dynamic set of services should be monitored. Which services are selected
-to be monitored with the desired configuration is defined using label selections.
-This allows to dynamically express monitoring without having to update additional
-configuration for services that follow known monitoring patterns.
-
-A service may expose one or more service ports, which are backed by a list
-of multiple endpoints that point to a pod in the common case.
-
-In the `endpoints` section of the TPR, we can configure which ports of these
-endpoints we want to scrape for metrics and with which paramters. For advanced use
-cases one may want to monitor ports of backing pods, which are not directly part
-of the service endpoints. This is also made possible by the Prometheus Operator.
+to be monitored with the desired configuration is defined using label
+selections. This allows an organization to introduce conventions around how
+metrics are exposed, and then, following these conventions, new services are
+automatically discovered without the need to reconfigure the system.
+
+## Design
+
+For Prometheus to monitor any application within Kubernetes, an `Endpoints`
+object needs to exist. `Endpoints` objects are essentially lists of IP
+addresses. Typically, an `Endpoints` object is populated by a `Service` object.
+A `Service` object discovers `Pod`s by a label selector and adds those to the
+`Endpoints` object.
+
+A `Service` may expose one or more service ports, which are backed by a list of
+multiple endpoints that point to a `Pod` in the common case. This is reflected
+in the respective `Endpoints` object as well.
+
+The `ServiceMonitor` object introduced by the Prometheus Operator in turn
+discovers those `Endpoints` objects and configures Prometheus to monitor those
+`Pod`s.
+
+The `endpoints` section of the `ServiceMonitorSpec` is used to configure which
+ports of these `Endpoints` are going to be scraped for metrics, and with which
+parameters. For advanced use cases one may want to monitor ports of backing
+`Pod`s which are not directly part of the service endpoints. Therefore, the
+endpoints specified in the `endpoints` section are used strictly as configured.
+
+> Note: `endpoints` (lowercase) is the TPR field, while `Endpoints`
+> (capitalized) is the Kubernetes object kind.
+
+While `ServiceMonitor`s must live in the same namespace as the `Prometheus`
+TPR, discovered targets may come from any namespace. This is important to allow
+cross-namespace monitoring use cases, e.g. for meta-monitoring. Using the
+`namespaceSelector` of the `ServiceMonitorSpec`, one can restrict the
+namespaces the `Endpoints` objects are allowed to be discovered from.
 
 ## Specification
 

@@ -61,16 +84,3 @@ of the service endpoints. This is also made possible by the Prometheus Operator.
 | any | Match any namespace | false | bool | false |
 | matchNames | Explicit list of namespace names to select | false | string array | |
 
-
-## Current state and roadmap
-
-### Namespaces
-
-While `ServiceMonitor`s must live in the same namespace as the `Prometheus` TPR,
-discovered targets may come from any namespace. This is important to allow cross-namespace
-monitoring use cases, e.g. for meta-monitoring.
-
-Currently, targets are always discovered from all namespaces. In the future, the
-`ServiceMonitor` should allow to restrict this to one or more namespaces.
-How such a configuration would look like, i.e. explicit namespaces, selection by labels,
-or both, and what the default behavior should be is still up for discussion.
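
To tie the `Service`, `Endpoints`, and `ServiceMonitor` objects from the diff
above together, here is a minimal sketch of how such a pairing might look. The
apiVersion, names, namespaces, and label values are assumptions chosen for
illustration and are not taken from the repository:

```yaml
# A Service whose label selector populates an Endpoints object with the
# application Pods; the port is named so it can be referenced by name.
apiVersion: v1
kind: Service
metadata:
  name: example-app            # hypothetical name
  namespace: default
  labels:
    app: example-app
spec:
  selector:
    app: example-app           # matches the Pods to be scraped
  ports:
  - name: metrics              # named port referenced by the ServiceMonitor
    port: 9102
---
# A ServiceMonitor selecting that Service by label and describing how the
# discovered Endpoints should be scraped.
apiVersion: monitoring.coreos.com/v1alpha1   # TPR group/version assumed for illustration
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: monitoring        # assumed to be the namespace of the Prometheus TPR
spec:
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    matchNames:
    - default                  # restrict discovery to the application namespace
  endpoints:
  - port: metrics              # scrape the named service port
    interval: 30s
```

With this in place, the `Service` selector populates an `Endpoints` object with
the matching `Pod` IPs, and Prometheus, configured from the `ServiceMonitor`,
scrapes each of those `Pod`s on the port named `metrics`.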