mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2025-03-16 13:28:18 +00:00
node-feature-discovery/docs/deployment/metrics.md
Markus Lehtonen 5ad2294c14 metrics: add nfd_node_update_requests_total counter
Add a counter for total number of node update/sync requests. In
practice, this counts the number of gRPC requests received if the gRPC
API is in use. If the NodeFeature API is enabled, this counts the
requests initiated by the NFD API controller, i.e. updates triggered by
changes in NodeFeature or NodeFeatureRule objects plus updates initiated
by the controller resync period.
2023-08-07 09:37:29 +03:00


---
title: Metrics
layout: default
sort: 7
---

Metrics

Metrics are configured to be exposed using the Prometheus Operator APIs by default. To expose metrics this way, you need to install the Prometheus Operator in your cluster. By default, NFD Master and Worker expose metrics on port 8081.

The exposed metrics are:

| Metric | Type | Description |
| ------ | ---- | ----------- |
| `nfd_master_build_info` | Gauge | Version from which nfd-master was built |
| `nfd_worker_build_info` | Gauge | Version from which nfd-worker was built |
| `nfd_node_update_requests_total` | Counter | Number of node update requests processed by the master |
| `nfd_node_updates_total` | Counter | Number of nodes updated |
| `nfd_node_update_failures_total` | Counter | Number of node update failures |
| `nfd_node_labels_rejected_total` | Counter | Number of node labels rejected by nfd-master |
| `nfd_node_extendedresources_rejected_total` | Counter | Number of node extended resources rejected by nfd-master |
| `nfd_node_taints_rejected_total` | Counter | Number of node taints rejected by nfd-master |
| `nfd_nodefeaturerule_processing_duration_seconds` | Histogram | Time taken to process NodeFeatureRule objects |
| `nfd_nodefeaturerule_processing_errors_total` | Counter | Number of errors encountered while processing NodeFeatureRule objects |
| `nfd_feature_discovery_duration_seconds` | Histogram | Time taken to discover features on a node |
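
As a quick sanity check, the metrics listed above can be inspected directly on a running pod. The sketch below assumes the metrics are served in the Prometheus text format under `/metrics` on port 8081 and that a port-forward to an NFD pod is already running. `nfd_metrics` is a hypothetical helper name used for illustration, not part of NFD.

```shell
# Hypothetical helper: keep only NFD metric samples (lines starting with "nfd_")
# from Prometheus text-format output read on stdin. HELP/TYPE comment lines and
# metrics from other sources (Go runtime, process) are dropped.
nfd_metrics() {
  grep -E '^nfd_'
}

# Example usage, assuming a port-forward such as
#   kubectl -n node-feature-discovery port-forward <nfd-master-pod> 8081:8081
# is running in another terminal (the namespace and the /metrics path are
# assumptions):
#   curl -s http://localhost:8081/metrics | nfd_metrics
```

Filtering on the `nfd_` prefix keeps the output limited to the metrics in the table.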

Via Kustomize

To deploy NFD with metrics enabled using kustomize, you can use the Metrics Overlay.
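
A minimal sketch of applying the overlay remotely follows. It assumes the overlay lives under `deployment/overlays/prometheus` in the upstream repository. The Git ref is left for the caller to supply, and `apply_nfd_metrics_overlay` is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: apply the metrics overlay for a given Git ref
# (tag or branch). The overlay path below is an assumption; verify it
# against the repository layout for the release you deploy.
apply_nfd_metrics_overlay() {
  ref="${1:?usage: apply_nfd_metrics_overlay <git-ref>}"
  # KUBECTL can be overridden (for example KUBECTL=echo) to preview the command.
  "${KUBECTL:-kubectl}" apply -k \
    "https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/prometheus?ref=${ref}"
}

# Example usage:
#   apply_nfd_metrics_overlay <release-tag>
```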

Via Helm

By default, metrics are enabled when deploying NFD via Helm. To enable Prometheus to scrape metrics from NFD, you need to pass the following value to Helm:

--set prometheus.enable=true

For more info on Helm deployment, see Helm.

When deploying prometheus-operator via Helm, we recommend one of the following:

- Set `--set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false` so that prometheus-operator can scrape metrics from any PodMonitor.
- Set labels on the NFD PodMonitor via the Helm parameter `prometheus.labels` to control which Prometheus instances will scrape it.
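
The two alternatives can be sketched as shell functions. The release names, the NFD namespace, the chart repository URLs, and the use of the `kube-prometheus-stack` chart are assumptions for illustration. The function names are hypothetical, and the label passed to the second function depends on how your Prometheus instance selects PodMonitors.

```shell
# Option 1 (assumed setup: Prometheus installed from the kube-prometheus-stack
# chart under the release name "prometheus"): let Prometheus select PodMonitors
# regardless of Helm release labels.
relax_podmonitor_selector() {
  "${HELM:-helm}" upgrade --install prometheus kube-prometheus-stack \
    --repo https://prometheus-community.github.io/helm-charts \
    --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false
}

# Option 2: enable the NFD PodMonitor and label it so that a specific
# Prometheus instance selects it. The caller supplies the label as <key>=<value>.
label_nfd_podmonitor() {
  label="${1:?usage: label_nfd_podmonitor <key>=<value>}"
  "${HELM:-helm}" upgrade --install nfd node-feature-discovery \
    --repo https://kubernetes-sigs.github.io/node-feature-discovery/charts \
    --namespace node-feature-discovery --create-namespace \
    --set prometheus.enable=true \
    --set "prometheus.labels.${label}"
}

# Example usage (the label is an assumption about your Prometheus selector):
#   label_nfd_podmonitor release=prometheus
```

Setting `HELM=echo` before calling either function prints the command instead of running it, which is a cheap way to review the flags first. Note that `helm upgrade` without `--reuse-values` replaces previously set values, so add any other values your release depends on.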