node-feature-discovery

mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00

Author	SHA1	Message	Date
Markus Lehtonen	fc103a6028	Cleanup for NodeFeature API being GA Drop references to the gRPC API and don't suggest that NodeFeatureAPI could be disabled. Also update the developer guide for instructions running nfd components outside the cluster.	2024-12-13 15:40:46 +02:00
Markus Lehtonen	fb6484fb8d	deployment: add startupProbe for nfd-master This patch mitigates inadvertent termination of nfd-master pods by the liveness probe on big clusters. With a recent change nfd-master started to wait (block) for informer caches to sync before starting the main loop. Consequently, this change also made the gRPC health endpoint not respond until the caches have been synced. In big clusters, syncing the NodeFeature object cache takes a long time as the objects are big and there's (at least) one per each node in the cluster. Thus, in big clusters, the liveness probe kicks in and kills the nfd-master pod before it's ready.	2024-12-12 20:00:49 +02:00
googs1025	e631a52374	chore: add metrics system prefix	2024-11-28 09:57:40 +08:00
Markus Lehtonen	45f49d574a	nfd-master: drop resourceLabels Drop the resourceLabels config file option and the corresponding -resource-labels command line flag. They were deprecated in NFD v0.13 so it's time to let them go. NodeFeatureRule(s) should be used to manage ERs, instead.	2024-11-07 15:16:52 +02:00
Markus Lehtonen	4bb91e2096	Deprecate separate metrics and health port args	2024-11-06 12:14:41 +02:00
Carlos Eduardo Arango Gutierrez	62f4eddce6	Drop support for hooks Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>	2024-11-04 14:50:07 +01:00
Kubernetes Prow Robot	b997ade5b3	Merge pull request #1942 from marquiz/devel/drop-grpc nfd-master: drop stale unreachable deprecation notices	2024-11-04 11:16:31 +01:00
Tobias Giese	52c2fc6498	Add separate helm values for the liveness and readiness probes Signed-off-by: Tobias Giese <tgiese@nvidia.com>	2024-10-18 12:54:42 +02:00
Tobias Giese	901fbe2866	Format helm.md Signed-off-by: Tobias Giese <tgiese@nvidia.com>	2024-10-18 12:54:42 +02:00
Markus Lehtonen	010393b302	docs: quote shell snippets containing urls with query parameters Makes them work with zsh which tries to glob URLs containing query parameters (question marks).	2024-10-02 17:07:32 +03:00
Tobias Giese	53ddf081da	Add parameter to configure health endpoint port Signed-off-by: Tobias Giese <tgiese@nvidia.com>	2024-09-24 15:15:50 +02:00
Tobias Giese	af0592b87c	Add helm values to configure hostNetwork and additional env vars We have to run our NFD workers in the host network. Also we need additional env variables such as KUBERNETES_SERVICE_HOST and _PORT. To achieve this we can simply add generic helm values. The default behavior is not changed. Signed-off-by: Tobias Giese <tgiese@nvidia.com>	2024-09-18 17:58:59 +02:00
Markus Lehtonen	843fc9307d	helm: rename args chart value to extraArgs The "args" value is not yet part of any release so this is not a breaking change.	2024-09-18 17:47:36 +03:00
Kubernetes Prow Robot	8ffe9f9997	Merge pull request #1807 from ArangoGutierrez/upgrade Add helm migration guide	2024-08-05 06:44:59 -07:00
Carlos Eduardo Arango Gutierrez	cb53f9f3c2	Add helm migration guide Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>	2024-07-23 16:20:45 +02:00
Omer Aplatony	b2222e2c8c	helm: add configurable liveness&readiness probes for master topology-updater and worker Signed-off-by: Omer Aplatony <omerap12@gmail.com>	2024-07-22 21:54:25 +03:00
Rouke Broersma	1230d607ac	Helm: Add revision history limit for worker daemonset (#1797) * Helm: Add revision history limit for worker daemonset Signed-off-by: Rouke Broersma <mobrockers@gmail.com> * Helm: Add revision history limit for topology updater daemonset Signed-off-by: Rouke Broersma <mobrockers@gmail.com> * chore: tidy table columns --------- Signed-off-by: Rouke Broersma <mobrockers@gmail.com>	2024-07-18 05:31:49 -07:00
Markus Lehtonen	efdf1b8bd9	docs: reformat tables of helm parameters Also correct the description of default value of master.tolerations.	2024-07-16 09:56:12 +03:00
Kubernetes Prow Robot	25ffe9c178	Merge pull request #1782 from omerap12/issue_1759 Helm: Add revision history limit for master replica	2024-07-15 01:09:09 -07:00
Omer Aplatony	920306cba8	Add revision history limit for master replica and for garbage collector Signed-off-by: Omer Aplatony <omerap12@gmail.com>	2024-07-12 18:20:38 +03:00
Markus Lehtonen	a269bf4d25	Drop the -enable-nodefeature-api flag Was marked to be removed in v0.17.	2024-07-10 15:20:07 +03:00
Kubernetes Prow Robot	d2456e181a	Merge pull request #1726 from marquiz/devel/helm-cmdline-args deployment/helm: enable specifying additional cmdline args	2024-07-09 02:09:52 -07:00
Markus Lehtonen	6515990cae	docs: describe Kubernetes version compatibility in versions page Bump the required Kubernetes version to v1.24. In practice this is the minimum Kubernetes version as our deployment (both kustomize and Helm) depends on the gRPC container probes feature of Kubernetes.	2024-07-08 15:28:25 +03:00
budimanjojo	3d62382cd1	helm: remove defaults CPU limits Signed-off-by: budimanjojo <budimanjojo@gmail.com>	2024-05-30 11:55:34 +07:00
Markus Lehtonen	a088de7333	deployment/helm: enable specifying additional cmdline args	2024-05-28 20:09:08 +03:00
Markus Lehtonen	28c852c9bd	docs/helm: document all feature gates Also, small correction to the description of the featureGates.NodeFeatureAPI parameter.	2024-05-24 16:02:31 +03:00
Markus Lehtonen	560bd11d85	Re-add -enable-nodefeature-api cmdline flag Bring back the -enable-nodefeature-api command line flag and the corresponding enableNodeFeatureApi helm config value that were removed without deprecation when the NodeFeatureAPI feature gate was introduced. The thinking behind this change is to not break existing users (without warning) unless totally unavoidable. Now the -enable-nodefeature-api flag is marked as deprecated and slated for removal in NFD v0.17. The NodeFeatureAPI feature gate and the -enable-nodefeature-api flag work together so that the NodeFeature API is disabled (gRPC is enabled, instead) if either of them is set to false. This patch selectively reverts parts of `06c4733bc5`.	2024-05-16 10:53:49 +03:00
Kubernetes Prow Robot	391865bbb2	Merge pull request #1651 from cmontemuino/doc-resource-limits docs: document trade-offs in memory configuration	2024-04-25 06:41:29 -07:00
Kubernetes Prow Robot	af8a41cc02	Merge pull request #1639 from TessaIO/chore-add-prometheus-pod-monitor-interval chore/deploy: make interval property in PodMonitor configurable	2024-04-05 03:03:26 -07:00
Carlos M	cc53b604c5	chore: include suggestions from code review Co-authored-by: Carlos Eduardo Arango Gutierrez <arangogutierrez@gmail.com>	2024-04-05 10:01:08 +02:00
cmontemuino	54b01a2576	docs: document trade-offs in memory configuration Problem: memory requests and limits have been set for the `master` process in PR #1631. They do not follow best practices for setting those values, but the intention was to provide default values for a wide variety of clusters, including small ones. Solution: provide solid documentation about the problems that might happen in production environments when `resource.memory.requests << resource.memory.limits`. Add a link to relevant external sources, which include the advice from Tim Hockin: > Always set memory limit == request Signed-off-by: cmontemuino <1761056+cmontemuino@users.noreply.github.com>	2024-04-02 19:01:50 +02:00
Kubernetes Prow Robot	7938e81c33	Merge pull request #1631 from TessaIO/chore-add-resources-limits-and-requests chore/deployment: add resources requests and limits for helm and Kustomize	2024-04-02 02:03:59 -07:00
TessaIO	74153e11b5	chore/deploy: make interval property in PodMonitor configurable Signed-off-by: TessaIO <ahmedgrati1999@gmail.com>	2024-03-26 08:36:52 +01:00
TessaIO	d02414cf61	chore/deployment: add resources requests and limits for helm and Kustomize Signed-off-by: TessaIO <ahmedgrati1999@gmail.com>	2024-03-22 14:27:44 +01:00
Markus Lehtonen	6f891ce1d2	Remove references to -enable-nodefeature-api flag Fix documentation, code and e2e-tests.	2024-03-18 16:06:25 +02:00
Carlos Eduardo Arango Gutierrez	06c4733bc5	Add FeatureGate framework to handle new features Code inspired on https://github.com/kubernetes/component-base/tree/master/featuregate Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>	2024-03-15 19:11:32 +01:00
Allen Mun	8bd52594ab	add ability to use a custom issuer	2024-02-27 12:14:43 -05:00
Carlos Eduardo Arango Gutierrez	75f0a14f2a	helm: add priorityClassName option Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>	2024-02-15 16:29:33 +01:00
leemingeer	b6d8ce7a5a	nfd-topology-updater add pods fingerprint by default	2024-01-26 17:55:34 +08:00
Markus Lehtonen	09b5af74de	deployment/kustomize: drop the sample cert-manager overlay Drop the deprecated and broken sample overlay. This was an example for enabling TLS with cert-manager. However, the overlay has been broken (and useless) since NodeFeature API was enabled by default - and gRPC disabled - in v0.14.	2024-01-03 21:13:15 +02:00
Markus Lehtonen	889fffd7d4	helm: add post-delete hook that cleans up the node This patch adds a post-delete hook to the Helm chart that runs "nfd-master --prune" in the cluster. This cleans up the node of labels, annotations, taints and extended resources that were created by NFD.	2023-12-29 15:36:41 +02:00
Markus Lehtonen	6471a1f185	docs: second fix to the prometheus kustomize overlay name	2023-12-21 18:40:14 +02:00
Markus Lehtonen	08a12eb213	docs: fix name of prometheus kustomize overlay	2023-12-21 17:58:01 +02:00
Markus Lehtonen	f49e0a43c0	docs: use default instead of minimal image variant	2023-12-20 23:48:34 +02:00
Markus Lehtonen	53f5967555	deployment/kustomize: drop default-combined overlay The "combined" overlay, deploying nfd-master and nfd-worker in the same pod (with a daemonset), doesn't make sense anymore as we have enabled NodeFeature API. There is no direct communication between nfd-master and nfd-worker anymore. Moreover, the combined deployment can be seen as broken as there is one NodeFeature controller (i.e. nfd-master) on each node, causing them to race against each other, all processing all NodeFeature objects.	2023-12-08 14:42:31 +02:00
Kubernetes Prow Robot	bdfef6df18	Merge pull request #1485 from marquiz/devel/docs-deployment docs: remove outdated instructions for minimal image	2023-12-01 17:10:24 +01:00
Markus Lehtonen	e608fdac19	Change the base image of full image variant to Debian Bookworm	2023-12-01 16:38:41 +02:00
Markus Lehtonen	7ebf5c02c7	docs: remove outdated instructions for minimal image The "minimal" image variant has been the default since v0.13.	2023-12-01 16:30:55 +02:00
Markus Lehtonen	15dc917ddb	docs: streamline language	2023-12-01 15:57:53 +02:00
Markus Lehtonen	da64884d02	docs: drop "currently" All the documentation describes the current version of NFD (if not stated otherwise).	2023-12-01 15:47:18 +02:00


111 commits