1
0
Fork 0
mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2025-04-15 08:46:26 +00:00
Commit graph

49 commits

Author SHA1 Message Date
AhmedGrati
02b3b7c7e0 feat: add enableTaints to helm chart
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-03-21 10:49:24 +01:00
Talor Itzhak
91daff3b59 deployment/helm: update helm charts
Adding kubelet state directory mount

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2023-03-16 11:51:45 +02:00
Kubernetes Prow Robot
37504109d6
Merge pull request #1080 from marquiz/devel/deploy-topology-updater
deployment: fixes for mounting kubelet config
2023-03-09 09:50:02 -08:00
Markus Lehtonen
ed8a87b131 helm: fix handling of topologyUpdater.kubeletConfigPath
By default we use the configz API endpoint so no mounts are needed.
2023-03-09 17:49:31 +02:00
Markus Lehtonen
40644aab60 helm: create topology-updater RBAC rules by default
Create RBAC rules if topology-updater is enabled. Previously installing
with topologyUpdater.enable=true (without
topologyUpdater.rbac.create=true) resulted in a crashloogbackoff as RBAC
was missing.
2023-03-09 16:16:09 +02:00
Markus Lehtonen
40d7139257 helm: fix topology-updater rbac clusterrole
Access to nodes/proxy resource was accidentally given to nfd-master
(which really doesn't need it), not topology-updater.
2023-03-09 16:15:03 +02:00
Jose Luis Ojosnegros Manchón
b340d112a8 topology-updater:compute pod set fingerprint
Add an option to compute the fingerprint of the current pod set on each
node.

Report this new fingerprint using an attribute in NRT object.
2023-02-22 10:22:50 +01:00
Kubernetes Prow Robot
a92614c292
Merge pull request #1051 from AhmedGrati/feat-add-deny-label-ns-with-wildcard
feat: add deny-label-ns flag which supports wildcard
2023-02-15 03:42:25 -08:00
AhmedGrati
b499799364 feat: add deny-label-ns flag which supports wildcard
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-02-15 09:47:00 +01:00
Jose Luis Ojosnegros Manchón
d1d1eda0d2 nrt-api: Update to v0.1.0 to use v1alpha2 2023-02-09 12:03:18 +01:00
Kubernetes Prow Robot
94ab0ddd3d
Merge pull request #1045 from AhmedGrati/feat-disable-service-links-nfd-master
deployment: disable service links in NFD master pod
2023-02-06 08:55:01 -08:00
AhmedGrati
07d5ffe4b8 helm: make master port configurable
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-02-01 10:03:06 +01:00
AhmedGrati
743c877ad8 deployment: disable service links in NFD master pod
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-01-27 16:55:18 +01:00
PiotrProkop
59afae50ba Add NodeResourceTopology garbage collector
NodeResourceTopology(aka NRT) custom resource is used to enable NUMA aware Scheduling in Kubernetes.
As of now node-feature-discovery daemons are used to advertise those
resources but there is no service responsible for removing obsolete
objects(without corresponding Kubernetes node).

This patch adds new daemon called nfd-topology-gc which removes old
NRTs.

Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2023-01-11 10:15:21 +01:00
Markus Lehtonen
59a2757115 Use single-dash format for nfd cmdline flags
Use the "single-dash" version of nfd command line flags in deployment
files and e2e-tests. No impact in functionality, just aligns with
documentation and other parts of the codebase.
2022-12-21 15:00:49 +02:00
Markus Lehtonen
9f0806593d nfd-master: rename -featurerules-controller flag to -crd-controller
Deprecate the '-featurerules-controller' command line flag as the name
does not describe the functionality anymore: in practice it controls the
CRD controller handling both NodeFeature and NodeFeatureRule objects.
The patch introduces a duplicate, more generally named, flag
'-crd-controller'. A warning is printed in the log if
'-featurerules-controller' flag is encountered.
2022-12-14 10:23:45 +02:00
Markus Lehtonen
6ddd87e465 nfd-master: support NodeFeature objects
Add initial support for handling NodeFeature objects. With this patch
nfd-master watches NodeFeature objects in all namespaces and reacts to
changes in any of these. The node which a certain NodeFeature object
affects is determined by the "nfd.node.kubernetes.io/node-name"
annotation of the object. When a NodeFeature object targeting certain
node is changed, nfd-master needs to process all other objects targeting
the same node, too, because there may be dependencies between them.

Add a new command line flag for selecting between gRPC and NodeFeature
CRD API as the source of feature requests. Enabling NodeFeature API
disables the gRPC interface.

 -enable-nodefeature-api   enable NodeFeature CRD API for incoming
                           feature requests, will disable the gRPC
                           interface (defaults to false)

It is not possible to serve gRPC and watch NodeFeature objects at the
same time. This is deliberate to avoid labeling races e.g. by nfd-worker
sending gRPC requests but NodeFeature objects in the cluster
"overriding" those changes (labels from the gRPC requests will get
overridden when NodeFeature objects are processed).
2022-12-14 07:31:28 +02:00
Markus Lehtonen
237494463b nfd-worker: support creating NodeFeatures object
Support the new NodeFeatures object of the NFD CRD api. Add two new
command line options to nfd-worker:

 -kubeconfig               specifies the kubeconfig to use for
                           connecting k8s api (defaults to empty which
                           implies in-cluster config)
 -enable-nodefeature-api   enable the NodeFeature CRD API for
                           communicating node features to nfd-master,
                           will also automatically disable gRPC
                           (defgault to false)

No config file option for selecting the API is available as there should
be no need for dynamically selecting between gRPC and CRD. The
nfd-master configuration must be changed in tandem and it is safer (and
avoid awkward configuration races) to configure the whole NFD deployment
at once.

Default behavior of nfd-worker is not changed i.e. NodeFeatures object
creation is not enabled by default (but must be enabled with the command
line flag).

The patch also updates the kustomize and Helm deployment, adding RBAC
rules for nfd-worker and updating the example worker configuration.
2022-12-14 07:31:28 +02:00
Kubernetes Prow Robot
776a8c335c
Merge pull request #980 from marquiz/devel/topology-updater
nfd-topology-updater: update NodeResourceTopology objects directly
2022-12-08 01:44:22 -08:00
Markus Lehtonen
f13ed2d91c nfd-topology-updater: update NodeResourceTopology objects directly
Drop the gRPC communication to nfd-master and connect to the Kubernetes
API server directly when updating NodeResourceTopology objects.
Topology-updater already has connection to the API server for listing
Pods so this is not that dramatic change. It also simplifies the code
a lot as there is no need for the NFD gRPC client and no need for
managing TLS certs/keys.

This change aligns nfd-topology-updater with the future direction of
nfd-worker where the gRPC API is being dropped and replaced by a
CRD-based API.

This patch also update deployment files and documentation to reflect
this change.
2022-12-08 11:03:22 +02:00
Kubernetes Prow Robot
f0ca0ffb5d
Merge pull request #979 from marquiz/fixes/helm-topology-updater
helm: fix mount name of topology-updater config
2022-12-07 05:28:40 -08:00
Kubernetes Prow Robot
66a4ce9488
Merge pull request #981 from tariq1890/svc-selector
nfd-master svc should select only nfd-master pods
2022-12-07 04:10:37 -08:00
Tariq Ibrahim
153815fa56 nfd-master svc should select only nfd-master pods 2022-12-05 17:45:26 -08:00
Markus Lehtonen
7840fe52e5 helm: fix mount name of topology-updater config 2022-12-02 17:18:57 +02:00
Markus Lehtonen
c1bdcd9511 helm: drop NodeFeatureRule CRD from templates
Helm 3 can manage CRDs in a more user friendly way. In fact, this now
causes deployment failure as Helm automatically tries to install the CRD
from the "crds/" subdir, too.
2022-12-02 14:56:59 +02:00
Talor Itzhak
f832a7e4a8 helm: topology-updater: enable the configuration via helm
- Add a helm template with a config example for the exclude-list.
- Add mount for the topology-updater.conf file
- Update the templates Makefile target

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2022-11-21 21:31:37 +02:00
Garrybest
3ec1b94020 get kubelet config from configz
Signed-off-by: Garrybest <garrybest@foxmail.com>
2022-11-08 23:52:35 +08:00
Kubernetes Prow Robot
a753d11e0b
Merge pull request #867 from stek29/worker-priority-class
helm: add priorityClassName to worker
2022-08-23 07:23:31 -07:00
Viktor Oreshkin
7498e49ba5 helm: add priorityClassName to worker
Signed-off-by: Viktor Oreshkin <imselfish@stek29.rocks>
2022-08-22 06:45:52 +03:00
Markus Lehtonen
acdc632935 helm: rename "manifests" subdir to "crds"
Rename the Helm subdir that contains CRD(s) to match the expected chart
directory structure.
2022-08-19 14:58:01 +03:00
jasine
76df597c19
helm: add namespace override for multi-namespace deployments
When used as other charts' dependency, helm will install manifests of this chart to parent chart's namespace, if subchart needs to install to another namespace, helm recommend to use namespaceOverride (helm/charts#15202)
2022-06-28 00:08:29 +08:00
Cyril Corbon
eeb1f0d5e5
helm: add annotations to daemonset and deployment
Signed-off-by: Cyril Corbon <cyril.corbon@dailymotion.com>
2022-03-24 12:13:29 +01:00
Mikko Ylinen
9bbb960d35 deployment/helm: add resourceLabels to master args
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2022-03-23 06:59:49 +02:00
Mac Chaffee
7ec13f0dc1
Add ServiceAccount for nfd-worker
Signed-off-by: Mac Chaffee <me@macchaffee.com>

This commit creates a separate ServiceAccount for the nfd-worker like the
other components.

Even though the nfd-worker doesn't need any special RBAC permissions, this
feature is useful for nvidia/gpu-operator (a downstream project) which
supports PodSecurityPolicies. But since nfd-worker doesn't have its own
ServiceAccount, they've bolted on this feature into their fork, which is
giving them issues.

PodSecurityPolicies are used to grant special permission to nfd-worker to
create hostPath volumes.
2022-02-28 16:17:16 -05:00
Kubernetes Prow Robot
ffb6a294e5
Merge pull request #699 from marquiz/devel/helm-featurerule-controller
deployment/helm: disable nfr controller for parallel instances
2022-01-05 06:08:34 -08:00
Markus Lehtonen
edb3e6824c deployment/helm: disable nfr controller for parallel instances
Change the helm chart so that the NodeFeatureRule controller will be
disabled for other than the default deployment (i.e. all deployments
where master.instance is non-empty), unless explicitly set to true. With
this we try to ensure that there is only on controller instance for the
CR, avoiding contention and conflicts.
2022-01-04 21:25:02 +02:00
Markus Lehtonen
812073a025 deployment/helm: refactor nfd-master rbac parameters
Move top-level serviceAccount and rbac fields under master, making the
Helm chart more coherent.

Also, drop unused rbac.serviceAccountName and
rbac.serviceAccountAnnotations from values.yaml.
2022-01-04 16:30:11 +02:00
Dave Baker
b0834d7862 Enable TLS and cert-manager created certs for helm chart 2022-01-04 12:27:02 +00:00
Markus Lehtonen
e8872462dc nfd-master: add -featurerules-controller flag
Add a new command line flag for disabling/enabling the controller for
NodeFeatureRule objects. In practice, disabling the controller disables
all labels generated from rules in NodeFeatureRule objects.
2021-11-22 16:57:42 +02:00
Markus Lehtonen
e6e32a88c3 nfd-master: implement controller for NodeFeatureRule CRs
Implement a simple controller stub that operates on NodeFeatureRule
objects. The controller does not yet have any functionality other than
logging changes in the (NodeFeatureRule) objecs it is watching.

Also update the documentation on the -no-publish flag to match the new
functionality.
2021-11-22 16:57:42 +02:00
Markus Lehtonen
c3e2315834 pkg/apis/nfd: specify CRD for custom labeling rules
Add a cluster-scoped Custom Resource Definition for specifying labeling
rules. Nodes (node features, node objects) are cluster-level objects and
thus the natural and encouraged setup is to only have one NFD deployment
per cluster - the set of underlying features of the node stays the same
independent of how many parallel NFD deployments you have. Our extension
points (hooks, feature files and now CRs) can be be used by multiple
actors (depending on us) simultaneously. Having the CRD cluster-scoped
hopefully drives deployments in this direction. It also should make
deployment of vendor-specific labeling rules easy as there is no need to
worry about the namespace.

This patch virtually replicates the source.custom.FeatureSpec in a CRD
API (located in the pkg/apis/nfd/v1alpha1 package) with the notable
exception that "MatchOn" legacy rules are not supported. Legacy rules
are left out in order to keep the CRD simple and clean.

The duplicate functionality in source/custom will be dropped by upcoming
patches.

This patch utilizes controller-gen (from sigs.k8s.io/controller-tools)
for generating the CRD and deepcopy methods. Code can be (re-)generated
with "make generate". Install controller-gen with:

  go install sigs.k8s.io/controller-tools/cmd/controller-gen@v0.7.0

Update kustomize and helm deployments to deploy the CRD.
2021-11-17 13:40:23 +02:00
Swati Sehgal
b444ef95a8 NFD-Topology-Updater: Bump NRT API to version v0.0.12
The NodeResourceTopology API has been made cluster
scoped as in the current context a CR corresponds to
a Node and since Node is a cluster scoped resource it
makes sense to make NRT cluster scoped as well.

Ref: https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/18
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-11-16 13:28:23 +00:00
Elias Koromilas
e22b937391 Implicitly generate the worker ConfigMap name
Signed-off-by: Elias Koromilas <elias.koromilas@gmail.com>
2021-11-03 11:21:58 +02:00
Wei Zhang
158a5590ab deployment: add topology updater helm chart
Signed-off-by: Wei Zhang <kweizh@gmail.com>
2021-10-26 10:52:40 +08:00
Elias Koromilas
c17a898c4c
deployment: Simplify NFD worker configuration in Helm (#627)
* Simplify NFD worker service configuration in Helm

Signed-off-by: Elias Koromilas <elias.koromilas@gmail.com>

* Update docs/get-started/deployment-and-usage.md

Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>

Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
2021-10-25 09:34:23 -07:00
Markus Lehtonen
890d9455f1 deployment/helm: don't force sleep-interval in worker cmdline flags
Drop --sleep-interval from the template. We really don't want to do that
as. First, it's the default value so no use repeating that in the
template. And more importantly, the commandline flag will override
anything that will be provided in the worker config file, making it
impossible for users to specify the sleep interval (other than by
editing the template directly).
2021-10-21 11:33:19 +03:00
Jorik Jonker
501ff37592 deployment: optional mount of /usr/src
This commit makes the mount of /usr/src optional in the Helm chart, and
removes it from the kustomization. Reason is that some systems do not
have a /usr/src (such as Talos) *and* have a R/O filesystem. Since
/usr/src is optional per FHS 3.0, NFD should not assume its presence.

Signed-off-by: Jorik Jonker <jorik@kippendief.biz>
2021-08-26 10:52:26 +02:00
Carlos Eduardo Arango Gutierrez
dece85b394
Add livenessProbe via grpc to nfd-master
Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>
2021-08-18 10:23:10 -05:00
Markus Lehtonen
0f2554abf1 helm: move files under deployment/helm 2021-08-16 14:44:26 +03:00