node-feature-discovery

mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00

Author	SHA1	Message	Date
Markus Lehtonen	fb6484fb8d	deployment: add startupProbe for nfd-master This patch mitigates inadvertent termination of nfd-master pods by the liveness probe on big clusters. With a recent change nfd-master started to wait (block) for informer caches to sync before starting the main loop. Consequently, this change also made the gRPC health endpoint unresponsive until the caches have been synced. In big clusters, syncing the NodeFeature object cache takes a long time as the objects are big and there's (at least) one per node in the cluster. Thus, in big clusters, the liveness probe kicks in and kills the nfd-master pod before it's ready.	2024-12-12 20:00:49 +02:00
TessaIO	d02414cf61	chore/deployment: add resources requests and limits for helm and Kustomize Signed-off-by: TessaIO <ahmedgrati1999@gmail.com>	2024-03-22 14:27:44 +01:00
Markus Lehtonen	a053efda64	nfd-master: run a separate gRPC health server This patch separates the gRPC health server from the deprecated gRPC server (disabled by default, replaced by the NodeFeature CRD API) used for node labeling requests. The new health server runs on hardcoded TCP port number 8082. The main motivation for this change is to make Kubernetes' built-in gRPC liveness probes work when TLS is enabled (as they don't support TLS). The health server itself is a naive implementation (as it was before), basically only checking that nfd-master has started and hasn't crashed. The patch adds a TODO note to improve the functionality.	2024-01-04 13:58:26 +02:00
Markus Lehtonen	9624d182ab	deployment/kustomize: drop nfd-master service Not needed anymore as we're not relying on gRPC anymore.	2023-12-08 14:53:23 +02:00
Muyassarov, Feruzjon	06036a62ce	Replace gRPC health probe utility with k8s built-in health probe Kubernetes 1.23 has introduced native health probes for gRPC which can replace grpc_health_probe utility. This commit removes baking in grpc_health_probe binary into the image and updates related health checks to use k8s native gRPC. Signed-off-by: Muyassarov, Feruzjon <feruzjon.muyassarov@intel.com>	2023-09-20 12:25:36 +03:00
Carlos Eduardo Arango Gutierrez	e3aedd33e2	Enable metrics via prometheus operator Expose metrics via prometheus.monitoring.coreos.com/v1. The exposed metrics are (metric, type, meaning): `nfd_master_build_info` (Gauge): Version from which nfd-master was built; `nfd_worker_build_info` (Gauge): Version from which nfd-worker was built; `nfd_updated_nodes` (Counter): Time taken to label a node; `nfd_crd_processing_time` (Gauge): Time taken to process a NodeFeatureRule CRD; `nfd_feature_discovery_duration_seconds` (HistogramVec): Time taken to discover features on a node. Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>	2023-07-21 10:59:52 +02:00
Markus Lehtonen	457fc8483b	deployment/kustomize: use a named port for nfd gRPC service	2023-06-06 21:00:42 +03:00
AhmedGrati	3fff409f6d	Add master config file Similar to nfd-worker, this PR adds support for dynamic run-time configuration of nfd-master through a config file. We'll use a JSON or YAML configuration file along with fsnotify to watch for changes in the config file. As a result, we're allowing dynamic control of logging params, allowed namespaces, extended resources, label whitelisting, and denied namespaces. Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-04-03 09:52:09 +01:00
AhmedGrati	743c877ad8	deployment: disable service links in NFD master pod Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-01-27 16:55:18 +01:00
Carlos Eduardo Arango Gutierrez	dece85b394	Add livenessProbe via grpc to nfd-master Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>	2021-08-18 10:23:10 -05:00
Markus Lehtonen	8117c099a3	deployment: add kustomize base Implement functionality virtually replicating deployment templates for nfd-master and nfd-worker daemonset (nfd-master.yaml.template and nfd-worker-daemonset.yaml.template) by adding a kustomize overlay named "default". We split the resources into multiple bases (rbac, master and worker-daemonset) so that relevant parts are re-usable in other deployment scenarios added later (e.g. "one-shot job", and "combined daemonset"). This patch adds one component (components/common) doing the required kustomization for the example deployment.	2021-08-18 14:05:57 +03:00

11 commits