mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00
Commit graph

456 commits

Author SHA1 Message Date
Kubernetes Prow Robot
3e87c97ac2
Merge pull request #1976 from marquiz/devel/grpc-api-cleanup
Cleanup for NodeFeature API being GA
2024-12-13 15:14:26 +01:00
Markus Lehtonen
fc103a6028 Cleanup for NodeFeature API being GA
Drop references to the gRPC API and don't suggest that NodeFeatureAPI
could be disabled.

Also update the developer guide's instructions for running nfd
components outside the cluster.
2024-12-13 15:40:46 +02:00
Kubernetes Prow Robot
caaac59eba
Merge pull request #1860 from ozhuraki/no-owner-refs
nfd-worker: Add an option to disable setting the owner references
2024-12-13 13:12:26 +01:00
Markus Lehtonen
fb6484fb8d deployment: add startupProbe for nfd-master
This patch mitigates inadvertent termination of nfd-master pods by the
liveness probe on big clusters.

With a recent change nfd-master started to wait (block) for informer
caches to sync before starting the main loop. Consequently, this change
also made the gRPC health endpoint not respond until the caches have
been synced. In big clusters, syncing the NodeFeature object cache
takes a long time as the objects are big and there's (at least) one per
node in the cluster. Thus, in big clusters, the liveness probe
kicks in and kills the nfd-master pod before it's ready.
2024-12-12 20:00:49 +02:00
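The failure mode described in the commit message above can be illustrated with a toy timing model. This is a minimal sketch: the function name and the probe numbers (initial delay, period, failure threshold) are hypothetical and are not taken from the NFD deployment manifests.

```python
def killed_before_ready(startup_seconds, initial_delay, period, failure_threshold):
    """Return True if a probe gives up before the process starts responding.

    Simplified model (an assumption, not the kubelet's exact logic): the probe
    first fires at `initial_delay`, then every `period` seconds, and the
    container is restarted after `failure_threshold` consecutive failures.
    The process fails every probe until `startup_seconds` have elapsed.
    """
    # Time at which the last tolerated failure is recorded.
    deadline = initial_delay + (failure_threshold - 1) * period
    return startup_seconds > deadline


# Hypothetical liveness probe: probes at 10 s, 20 s and 30 s, then restart.
slow_cache_sync = killed_before_ready(120, initial_delay=10, period=10, failure_threshold=3)
fast_cache_sync = killed_before_ready(20, initial_delay=10, period=10, failure_threshold=3)

# A startup probe with a larger failure budget tolerates the slow start.
with_startup_probe = killed_before_ready(120, initial_delay=10, period=10, failure_threshold=30)
```

In Kubernetes, liveness checks are held back until the startup probe has succeeded, which is why giving the startup probe a generous failure budget protects a slow-starting pod.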
Oleg Zhurakivskyy
20ef877ab1 nfd-worker: Add an option to disable setting the owner references
In some cases it's desirable to control automatic garbage collection
of NodeFeature objects.

Add an option to disable setting the owner references to the Pod
on NodeFeature objects.

Closes: 1817

Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2024-11-28 16:50:10 +02:00
Kubernetes Prow Robot
443913e019
Merge pull request #1956 from googs1025/chore/add_metrics_prefix
chore: add metrics system prefix
2024-11-28 09:00:59 +00:00
googs1025
e631a52374 chore: add metrics system prefix 2024-11-28 09:57:40 +08:00
Oleg Zhurakivskyy
fb52206b96 Detect AMXFP8 cpuid feature
Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2024-11-26 14:39:18 +02:00
Kubernetes Prow Robot
835832729f
Merge pull request #1951 from marquiz/devel/nodefeature-ga
docs: minor update in the feature gates table
2024-11-08 08:50:43 +00:00
Markus Lehtonen
1244a42030 docs: minor update in the feature gates table 2024-11-07 15:27:51 +02:00
Markus Lehtonen
45f49d574a nfd-master: drop resourceLabels
Drop the resourceLabels config file option and the corresponding
-resource-labels command line flag. They were deprecated in NFD v0.13 so
it's time to let them go. NodeFeatureRule(s) should be used to manage
extended resources (ERs) instead.
2024-11-07 15:16:52 +02:00
Kubernetes Prow Robot
61ce3b3ce3
Merge pull request #1948 from marquiz/devel/deprecate-separate-ports
Deprecate separate metrics and health port args
2024-11-06 11:05:29 +00:00
Markus Lehtonen
4bb91e2096 Deprecate separate metrics and health port args 2024-11-06 12:14:41 +02:00
Kubernetes Prow Robot
955095c7eb
Merge pull request #1889 from ChaoyiHuang/fixtiltup
Doc: Fix tilt up issue in feature discovering in developer guide
2024-11-06 10:03:30 +00:00
Carlos Eduardo Arango Gutierrez
62f4eddce6
Drop support for hooks
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2024-11-04 14:50:07 +01:00
Kubernetes Prow Robot
65b5e0c255
Merge pull request #1944 from ArangoGutierrez/I/1733
Taints: mark stable
2024-11-04 12:49:29 +00:00
Carlos Eduardo Arango Gutierrez
dc7edd50ba
Taints: mark stable
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2024-11-04 11:30:09 +01:00
Kubernetes Prow Robot
b997ade5b3
Merge pull request #1942 from marquiz/devel/drop-grpc
nfd-master: drop stale unreachable deprecation notices
2024-11-04 11:16:31 +01:00
Chaoyi Huang
d08ea5ee11 Doc: Fix tilt up issue in feature discovering in developer guide
The issue is that the k3d/kind cluster created by ctlptl runs
inside containers (which serve as the virtual hosts).

Host folders that are scanned by the nfd feature discovery should
be mounted into the container (the virtual host). Otherwise the nfd-worker
container, which runs inside the virtual host, will only see the default base
image rootfs /boot and /lib folders, which are usually empty, leading to
discovery failure.

Signed-off-by: Chaoyi Huang <joehuang.sweden@gmail.com>
2024-11-01 02:31:23 +00:00
Kubernetes Prow Robot
fd2893e2a5
Merge pull request #1592 from AhmedThresh/feat-configure-cr-restrictions
feat/nfd-master: configure CR restrictions
2024-10-24 12:20:54 +01:00
Tobias Giese
52c2fc6498
Add separate helm values for the liveness and readiness probes
Signed-off-by: Tobias Giese <tgiese@nvidia.com>
2024-10-18 12:54:42 +02:00
Tobias Giese
901fbe2866
Format helm.md
Signed-off-by: Tobias Giese <tgiese@nvidia.com>
2024-10-18 12:54:42 +02:00
Markus Lehtonen
010393b302 docs: quote shell snippets containing urls with query parameters
Makes them work with zsh, which tries to glob URLs containing query
parameters (question marks).
2024-10-02 17:07:32 +03:00
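The quoting issue in the commit above can be sketched with the standard library's `shlex.quote`, which wraps a string in single quotes when it contains shell-special characters such as `?`. The URL below is hypothetical and is not one of the URLs from the NFD docs.

```python
import shlex

# Hypothetical URL with a query parameter (assumption, for illustration only).
url = "https://example.com/deploy?ref=v1.0"

# Unquoted, zsh would treat "?" as a glob character; quoting keeps it literal.
command = "kubectl apply -k " + shlex.quote(url)
print(command)
```

The command string is only built and printed here, not executed.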
Tobias Giese
53ddf081da
Add parameter to configure health endpoint port
Signed-off-by: Tobias Giese <tgiese@nvidia.com>
2024-09-24 15:15:50 +02:00
Tobias Giese
af0592b87c
Add helm values to configure hostNetwork and additional env vars
We have to run our NFD workers in the host network.
Also we need additional env variables such as KUBERNETES_SERVICE_HOST and _PORT.
To achieve this we can simply add generic helm values. The default behavior is not changed.

Signed-off-by: Tobias Giese <tgiese@nvidia.com>
2024-09-18 17:58:59 +02:00
Markus Lehtonen
843fc9307d helm: rename args chart value to extraArgs
The "args" value is not yet part of any release so this is not a
breaking change.
2024-09-18 17:47:36 +03:00
AhmedGrati
28b40c90b8 deploy: add CR restrictions to the helm config
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
Signed-off-by: AhmedThresh <ahmed.grati@insat.ucar.tn>
2024-09-16 16:02:42 +02:00
Kazuki Suda
7f6669eb92
source/system: Add reading product name information 2024-09-10 14:42:08 +09:00
Markus Lehtonen
02b6b7395c Drop dynamic run-time reconfiguration
Simplify the code and reduce possible error scenarios by dropping
fsnotify-based reconfiguration from nfd-master and nfd-worker. Also
eliminates repeated re-configuration in scenarios where kubelet
continuously (every minute) touches the mounted file (configmap) on the
filesystem.

Also modifies the Helm and kustomize deployments so that nfd-master,
nfd-worker and nfd-topology-updater pods are restarted on configmap
updates. In kustomize, the slight downside of this is that the name of the
config map(s) depends on the content, so every time a user customizes
the config data, the old unused configmap will be left behind and must be
garbage-collected manually.
2024-08-21 12:46:36 +03:00
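The content-dependent naming mentioned in the commit above can be sketched as follows. This is an illustration of the general idea only: the `hashed_name` helper, the digest choice and the suffix length are assumptions and do not reproduce kustomize's actual hashing scheme.

```python
import hashlib


def hashed_name(base, data):
    """Derive a name from `base` plus a short digest of the config contents."""
    digest = hashlib.sha256()
    # Sort keys so the result does not depend on dict insertion order.
    for key in sorted(data):
        digest.update(key.encode())
        digest.update(b"\0")
        digest.update(data[key].encode())
        digest.update(b"\0")
    return f"{base}-{digest.hexdigest()[:10]}"


# Hypothetical config contents before and after a user customization.
old_name = hashed_name("nfd-worker-conf", {"nfd-worker.conf": "core:\n  sleepInterval: 60s\n"})
new_name = hashed_name("nfd-worker-conf", {"nfd-worker.conf": "core:\n  sleepInterval: 30s\n"})
```

Because the two names differ, pods that reference the config map are restarted with the new name, while the object stored under the old name stays in the cluster until it is deleted.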
AhmedGrati
925a071595 docs: add CR restrictions to the master configuration reference
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2024-08-10 22:39:14 +02:00
Markus Lehtonen
b2bc18f5a5 docs: use jekyll-rtd-theme from a ruby gem
The upstream repo (and the release downloads)
github.com/rundocs/jekyll-rtd-theme has been deleted. This broke our
docs generation as the remote theme configuration depended on
downloading the release artefact.

This patch changes the docs building to use a Ruby gem instead of the
remote theme setting. To complicate matters, the gem has a seemingly
incorrect (too strict) version dependency. To mitigate this, we now
install the bundler-override plugin to ignore this particular dependency.

The netlify conf is a hack, but I wasn't able to figure out a way to
install the bundler-override plugin without doing all ruby
initialization in the build command.
2024-08-08 23:33:37 +03:00
Kubernetes Prow Robot
8ffe9f9997
Merge pull request #1807 from ArangoGutierrez/upgrade
Add helm migration guide
2024-08-05 06:44:59 -07:00
joehuang
a442749f89 Docs: Fix the link to feature gates documentation
The links to the feature gates documentation in
master-commandline-reference.html and
worker-commandline-reference.html point to feature-gates.md. They
should point to the HTML file instead.

Signed-off-by: joehuang <joehuang.sweden@gmail.com>
2024-08-01 09:37:10 +00:00
joehuang
efd2bac490 Fix the link to feature gates documentation
The link to the feature gates documentation in
master-commandline-reference.md points to the parent folder. It should
point to the file in the same folder.

Signed-off-by: joehuang <joehuang.sweden@gmail.com>
2024-08-01 01:15:03 +00:00
Omer Aplatony
b7c18b949d Docs: Fixed feature-gates reference
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2024-07-29 17:34:03 +03:00
Carlos Eduardo Arango Gutierrez
cb53f9f3c2
Add helm migration guide
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2024-07-23 16:20:45 +02:00
Omer Aplatony
b2222e2c8c helm: add configurable liveness and readiness probes for master, topology-updater and worker
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2024-07-22 21:54:25 +03:00
Rouke Broersma
1230d607ac
Helm: Add revision history limit for worker daemonset (#1797)
* Helm: Add revision history limit for worker daemonset

Signed-off-by: Rouke Broersma <mobrockers@gmail.com>

* Helm: Add revision history limit for topology updater daemonset

Signed-off-by: Rouke Broersma <mobrockers@gmail.com>

* chore: tidy table columns

---------

Signed-off-by: Rouke Broersma <mobrockers@gmail.com>
2024-07-18 05:31:49 -07:00
Markus Lehtonen
25e827a4c8 feature-gates: mark NodeFeatureAPI as GA
The feature gate is locked to true. That is, it is not possible to revert
back to the gRPC-based communication, which makes the gRPC API ready for
removal.
2024-07-16 13:53:31 +03:00
Markus Lehtonen
efdf1b8bd9 docs: reformat tables of helm parameters
Also correct the description of default value of master.tolerations.
2024-07-16 09:56:12 +03:00
Kubernetes Prow Robot
25ffe9c178
Merge pull request #1782 from omerap12/issue_1759
Helm: Add revision history limit for master replica
2024-07-15 01:09:09 -07:00
Omer Aplatony
920306cba8 Add revision history limit for master replica and for garbage collector
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2024-07-12 18:20:38 +03:00
Markus Lehtonen
a269bf4d25 Drop the -enable-nodefeature-api flag
Was marked to be removed in v0.17.
2024-07-10 15:20:07 +03:00
Kubernetes Prow Robot
d2456e181a
Merge pull request #1726 from marquiz/devel/helm-cmdline-args
deployment/helm: enable specifying additional cmdline args
2024-07-09 02:09:52 -07:00
Markus Lehtonen
6515990cae docs: describe Kubernetes version compatibility in versions page
Bump the required Kubernetes version to v1.24. In practice this is the
minimum Kubernetes version as our deployments (both kustomize and Helm)
depend on the gRPC container probes feature of Kubernetes.
2024-07-08 15:28:25 +03:00
Oleg Zhurakivskyy
cb80feca81 Document AVXVNNIINT16 cpuid feature
Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2024-07-04 15:18:04 +03:00
budimanjojo
3d62382cd1
helm: remove default CPU limits
Signed-off-by: budimanjojo <budimanjojo@gmail.com>
2024-05-30 11:55:34 +07:00
Markus Lehtonen
a088de7333 deployment/helm: enable specifying additional cmdline args 2024-05-28 20:09:08 +03:00
Markus Lehtonen
bec9297fe7 docs: add more cross-references to NodeFeatureGroup API 2024-05-27 13:41:15 +03:00
Markus Lehtonen
28c852c9bd docs/helm: document all feature gates
Also, small correction to the description of the
featureGates.NodeFeatureAPI parameter.
2024-05-24 16:02:31 +03:00