mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2025-03-05 08:17:04 +00:00
Commit graph

95 commits

Author SHA1 Message Date
Markus Lehtonen
237494463b nfd-worker: support creating NodeFeatures object
Support the new NodeFeatures object of the NFD CRD api. Add two new
command line options to nfd-worker:

 -kubeconfig               specifies the kubeconfig to use for
                           connecting to the k8s API (defaults to empty,
                           which implies in-cluster config)
 -enable-nodefeature-api   enable the NodeFeature CRD API for
                           communicating node features to nfd-master,
                           will also automatically disable gRPC
                           (defaults to false)

No config file option for selecting the API is available as there should
be no need for dynamically selecting between gRPC and CRD. The
nfd-master configuration must be changed in tandem, and it is safer (and
avoids awkward configuration races) to configure the whole NFD deployment
at once.

The default behavior of nfd-worker is unchanged, i.e. NodeFeatures object
creation is not enabled by default and must be enabled with the command
line flag.

The patch also updates the kustomize and Helm deployment, adding RBAC
rules for nfd-worker and updating the example worker configuration.
2022-12-14 07:31:28 +02:00
Markus Lehtonen
d1c91e129a apis/nfd: update auto-generated code 2022-12-14 07:31:28 +02:00
Markus Lehtonen
59ebff46c9 apis/nfd: add CRD for communicating node features
Add a new NodeFeature CRD to the nfd Kubernetes API to communicate node
features over K8s api objects instead of gRPC. The new resource is
namespaced which will help the management of multiple NodeFeature
objects per node. This aims at enabling 3rd party detectors for custom
features.

In addition to communicating raw features the NodeFeature object also
has a field for directly requesting labels that should be applied on the
node object.

Rename the crd deployment file to nfd-api-crds.yaml so that it matches
the new content of the file. Also, rename the Helm subdir for CRDs to
match the expected chart directory structure.
2022-12-14 07:31:28 +02:00
Kubernetes Prow Robot
776a8c335c
Merge pull request #980 from marquiz/devel/topology-updater
nfd-topology-updater: update NodeResourceTopology objects directly
2022-12-08 01:44:22 -08:00
Markus Lehtonen
f13ed2d91c nfd-topology-updater: update NodeResourceTopology objects directly
Drop the gRPC communication to nfd-master and connect to the Kubernetes
API server directly when updating NodeResourceTopology objects.
Topology-updater already has a connection to the API server for listing
Pods, so this is not a dramatic change. It also simplifies the code
a lot as there is no need for the NFD gRPC client and no need for
managing TLS certs/keys.

This change aligns nfd-topology-updater with the future direction of
nfd-worker where the gRPC API is being dropped and replaced by a
CRD-based API.

This patch also updates deployment files and documentation to reflect
this change.
2022-12-08 11:03:22 +02:00
Kubernetes Prow Robot
f0ca0ffb5d
Merge pull request #979 from marquiz/fixes/helm-topology-updater
helm: fix mount name of topology-updater config
2022-12-07 05:28:40 -08:00
Kubernetes Prow Robot
66a4ce9488
Merge pull request #981 from tariq1890/svc-selector
nfd-master svc should select only nfd-master pods
2022-12-07 04:10:37 -08:00
Kubernetes Prow Robot
9f68f6c93a
Merge pull request #910 from fmuyassarov/taint/feruz
Allow optionally setting node taints defined on the NodeFeatureRule CR
2022-12-06 07:28:37 -08:00
Tariq Ibrahim
153815fa56 nfd-master svc should select only nfd-master pods 2022-12-05 17:45:26 -08:00
Feruzjon Muyassarov
984a3de198 Document tainting feature
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
2022-12-02 17:29:10 +02:00
Feruzjon Muyassarov
532e1193ce Add taints field to NodeFeatureRule CR spec
Extend the NodeFeatureRule spec with a taints field that lets users
specify the list of taints to set on the node if the rule matches.

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
2022-12-02 17:25:00 +02:00
Markus Lehtonen
7840fe52e5 helm: fix mount name of topology-updater config 2022-12-02 17:18:57 +02:00
Markus Lehtonen
c1bdcd9511 helm: drop NodeFeatureRule CRD from templates
Helm 3 can manage CRDs in a more user-friendly way. In fact, keeping the
CRD in the templates now causes deployment failure as Helm automatically
tries to install the CRD from the "crds/" subdir, too.
2022-12-02 14:56:59 +02:00
Markus Lehtonen
37d51c96f1 deployment: drop stale nfd-api-crds.yaml
Remove a stale unused file that was accidentally committed from
experimental work.
2022-11-29 13:46:30 +02:00
Talor Itzhak
f832a7e4a8 helm: topology-updater: enable the configuration via helm
- Add a helm template with a config example for the exclude-list.
- Add mount for the topology-updater.conf file
- Update the templates Makefile target

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2022-11-21 21:31:37 +02:00
Talor Itzhak
8b5918a2e9 kustomize: topology-updater: enable the configuration via kustomization
Add a kustomization file with a config example for the exclude-list.

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2022-11-21 21:31:14 +02:00
Garrybest
3ec1b94020 get kubelet config from configz
Signed-off-by: Garrybest <garrybest@foxmail.com>
2022-11-08 23:52:35 +08:00
Markus Lehtonen
9ea787bc99 apis/nfd: update auto-generated code
Re-generate after the latest API change. Involves renaming the crd spec
files.
2022-10-18 18:41:53 +03:00
Feruzjon Muyassarov
60f270d40d Set shortName for NodeFeatureRule CRD
This patch adds a kubebuilder marker to add a short name nfr for
NodeFeatureRule CRD.

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
2022-09-28 12:18:49 +03:00
Kubernetes Prow Robot
8662d17530
Merge pull request #871 from fmuyassarov/disable-hook
Config option to disable hooks
2022-09-26 10:40:08 -07:00
Markus Lehtonen
98228d2069 Update auto-generated artefacts
Latest gofmt changes and update to go v1.19 induce some changes in the
generated files.
2022-09-08 12:45:20 +03:00
Feruzjon Muyassarov
56d5da2ce0 Add a config option to disable hooks of local feature
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
2022-09-01 10:58:31 +03:00
Kubernetes Prow Robot
a753d11e0b
Merge pull request #867 from stek29/worker-priority-class
helm: add priorityClassName to worker
2022-08-23 07:23:31 -07:00
Viktor Oreshkin
7498e49ba5 helm: add priorityClassName to worker
Signed-off-by: Viktor Oreshkin <imselfish@stek29.rocks>
2022-08-22 06:45:52 +03:00
Markus Lehtonen
acdc632935 helm: rename "manifests" subdir to "crds"
Rename the Helm subdir that contains CRD(s) to match the expected chart
directory structure.
2022-08-19 14:58:01 +03:00
Markus Lehtonen
38e763e36c Refresh auto-generated files 2022-08-10 14:24:33 +03:00
jasine
76df597c19
helm: add namespace override for multi-namespace deployments
When this chart is used as a dependency of another chart, Helm installs its manifests into the parent chart's namespace. If the subchart needs to be installed into another namespace, Helm recommends using namespaceOverride (helm/charts#15202).
2022-06-28 00:08:29 +08:00
Cyril Corbon
eeb1f0d5e5
helm: add annotations to daemonset and deployment
Signed-off-by: Cyril Corbon <cyril.corbon@dailymotion.com>
2022-03-24 12:13:29 +01:00
Mikko Ylinen
9bbb960d35 deployment/helm: add resourceLabels to master args
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2022-03-23 06:59:49 +02:00
Mac Chaffee
7ec13f0dc1
Add ServiceAccount for nfd-worker
Signed-off-by: Mac Chaffee <me@macchaffee.com>

This commit creates a separate ServiceAccount for the nfd-worker like the
other components.

Even though the nfd-worker doesn't need any special RBAC permissions, this
feature is useful for nvidia/gpu-operator (a downstream project) which
supports PodSecurityPolicies. But since nfd-worker doesn't have its own
ServiceAccount, they've bolted on this feature into their fork, which is
giving them issues.

PodSecurityPolicies are used to grant special permission to nfd-worker to
create hostPath volumes.
2022-02-28 16:17:16 -05:00
Kubernetes Prow Robot
885a061f12
Merge pull request #701 from marquiz/devel/deployment-custom-rule
deployment: use new custom rule format in sample configs
2022-01-05 09:53:48 -08:00
Kubernetes Prow Robot
ffb6a294e5
Merge pull request #699 from marquiz/devel/helm-featurerule-controller
deployment/helm: disable nfr controller for parallel instances
2022-01-05 06:08:34 -08:00
Markus Lehtonen
edb3e6824c deployment/helm: disable nfr controller for parallel instances
Change the helm chart so that the NodeFeatureRule controller will be
disabled for other than the default deployment (i.e. all deployments
where master.instance is non-empty), unless explicitly set to true. With
this we try to ensure that there is only one controller instance for the
CR, avoiding contention and conflicts.
2022-01-04 21:25:02 +02:00
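The default described in commit edb3e6824c can be sketched as a small decision function. This is only an illustration of the stated rule, not the chart's template logic: the function name and the `explicit` parameter are hypothetical, and treating an explicit "false" as disabling the controller for the default deployment is an assumption.

```python
from typing import Optional


def nfr_controller_enabled(instance: str, explicit: Optional[bool] = None) -> bool:
    """Decide whether the NodeFeatureRule controller should run.

    instance: value of master.instance ("" means the default deployment).
    explicit: hypothetical stand-in for an explicit user setting;
              None means the user did not set anything.
    """
    if explicit is not None:
        return explicit        # an explicit setting always wins (assumption)
    return instance == ""      # otherwise only the default deployment runs it


# Default deployment runs the controller, parallel instances do not.
default_on = nfr_controller_enabled("")
parallel_off = nfr_controller_enabled("gpu")
parallel_forced = nfr_controller_enabled("gpu", explicit=True)
```

Keeping the decision in one place makes it easy to see that at most one instance runs the controller unless a user deliberately opts in.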
Markus Lehtonen
812073a025 deployment/helm: refactor nfd-master rbac parameters
Move top-level serviceAccount and rbac fields under master, making the
Helm chart more coherent.

Also, drop unused rbac.serviceAccountName and
rbac.serviceAccountAnnotations from values.yaml.
2022-01-04 16:30:11 +02:00
Kubernetes Prow Robot
ec15f4f24c
Merge pull request #712 from dbaker-rh/helm-certs
Enable TLS and cert-manager created certs for helm chart
2022-01-04 06:24:52 -08:00
Dave Baker
3e6ae535c7 Fix kustomization template to work with cert-manager 2022-01-04 13:19:09 +00:00
Dave Baker
b0834d7862 Enable TLS and cert-manager created certs for helm chart 2022-01-04 12:27:02 +00:00
Markus Lehtonen
7e8f96e7e1 deployment: drop legacy custom rules from the worker conf sample
They are still supported but no need to advertise them.
2021-12-22 09:21:26 +02:00
Markus Lehtonen
468fa2b817 deployment: use new rule format in sample custom rule overlay 2021-12-22 09:21:26 +02:00
Markus Lehtonen
df6909ed5e nfd-worker: add core.featureSources config option
Add a configuration option for controlling the enabled "raw" feature
sources. This is useful e.g. in testing and development, plus it also
allows fully shutting down discovery of features that are not needed in
a deployment. Supplements core.labelSources which controls the
enablement of label sources.
2021-12-03 09:42:35 +02:00
Markus Lehtonen
ad9c7dfa1e nfd-worker: rename config option 'sources' to 'labelSources'
The goal is to make the name more descriptive. Also keeping in mind a
possible future addition of a 'featureSources' option (or similar) for
controlling the feature discovery.
2021-12-01 17:11:49 +02:00
Markus Lehtonen
b648d005e1 pkg/apis/nfd: support templating of "vars"
Support templating of var names in a similar manner as labels. Add
support for a new 'varsTemplate' field to the feature rule spec which is
treated similarly to the 'labelsTemplate' field. The value of the field
is processed through the golang "text/template" template engine and the
expanded value must contain variables in <key>=<value> format, separated
by newlines i.e.:

  - name: <rule-name>
    varsTemplate: |
      <label-1>=<value-1>
      <label-2>=<value-2>
      ...

Similar rules as for 'labelsTemplate' apply, i.e.

1. If matchAny is specified, the template is executed separately
   against each individual matchFeatures matcher.
2. 'vars' field has priority over 'varsTemplate'
2021-11-25 12:50:47 +02:00
Markus Lehtonen
f75303ce43 pkg/apis/nfd: add variables to rule spec and support backreferences
Support backreferencing of output values from previous rules. Enables
complex rule setups where custom features are further combined together
to form even more sophisticated higher level labels. The labels created
by preceding rules are available as a special 'rule.matched' feature
(for matchFeatures to use).

If referencing rules across multiple configs/CRDs, care must be taken
with the ordering. Processing order of rules in nfd-worker:

1. Static rules
2. Files from /etc/kubernetes/node-feature-discovery/custom.d/
   in alphabetical order. Subdirectories are processed by reading their
   files in alphabetical order.
3. Custom rules from main nfd-worker.conf

In nfd-master, NodeFeatureRule objects are processed in alphabetical
order (based on their metadata.name).

This patch also adds new 'vars' fields to the rule spec. Like 'labels',
it is a map of key-value pairs but no labels are generated from these.
The values specified in 'vars' are only added for backreferencing into
the 'rule.matched' feature. This may be desired in schemes where the
output of certain rules is only used as intermediate variables for other
rules and no labels out of these are wanted.

An example setup:

  - name: "kernel feature"
    labels:
      kernel-feature:
    matchFeatures:
      - feature: kernel.version
        matchExpressions:
          major: {op: Gt, value: ["4"]}

  - name: "intermediate var feature"
    vars:
      nolabel-feature: "true"
    matchFeatures:
      - feature: cpu.cpuid
        matchExpressions:
          AVX512F: {op: Exists}
      - feature: pci.device
        matchExpressions:
          vendor: {op: In, value: ["8086"]}
          device: {op: In, value: ["1234", "1235"]}

  - name: top-level-feature
    matchFeatures:
      - feature: rule.matched
        matchExpressions:
          kernel-feature: "true"
          nolabel-feature: "true"
2021-11-25 12:50:47 +02:00
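The backreferencing described in commit f75303ce43 can be sketched as follows. This is a simplified model under stated assumptions, not the NFD implementation: `matcher_ok`, `evaluate_rules`, `order_rule_objects` and the `require` key are hypothetical names, the matcher only supports "exists" (`None`) and exact equality instead of the full matchExpressions operators, and the order in which labels and vars are written into 'rule.matched' is an assumption. The label on the last rule is added only so the example produces visible output.

```python
from typing import Dict, List

Features = Dict[str, Dict[str, str]]


def matcher_ok(features: Features, matcher: dict) -> bool:
    """Simplified matcher: None means 'key exists', a string means equality."""
    values = features.get(matcher["feature"], {})
    for key, want in matcher["require"].items():
        if key not in values:
            return False
        if want is not None and values[key] != want:
            return False
    return True


def evaluate_rules(rules: List[dict], features: Features) -> Dict[str, str]:
    """Evaluate rules in order; outputs of matched rules feed 'rule.matched'."""
    matched: Dict[str, str] = {}
    visible = {**features, "rule.matched": matched}
    labels: Dict[str, str] = {}
    for rule in rules:
        if all(matcher_ok(visible, m) for m in rule.get("matchFeatures", [])):
            labels.update(rule.get("labels", {}))   # labels become node labels
            matched.update(rule.get("labels", {}))  # ...and can be backreferenced
            matched.update(rule.get("vars", {}))    # vars are backreference-only
    return labels


def order_rule_objects(objects: List[dict]) -> List[dict]:
    """Sort rule objects alphabetically by metadata.name."""
    return sorted(objects, key=lambda o: o["metadata"]["name"])


rules = [
    {"name": "kernel feature", "labels": {"kernel-feature": "true"},
     "matchFeatures": [{"feature": "kernel.version", "require": {"major": "5"}}]},
    {"name": "intermediate var feature", "vars": {"nolabel-feature": "true"},
     "matchFeatures": [{"feature": "cpu.cpuid", "require": {"AVX512F": None}}]},
    {"name": "top-level-feature", "labels": {"top-level-feature": "true"},
     "matchFeatures": [{"feature": "rule.matched",
                        "require": {"kernel-feature": "true",
                                    "nolabel-feature": "true"}}]},
]
features = {"kernel.version": {"major": "5"}, "cpu.cpuid": {"AVX512F": ""}}
labels = evaluate_rules(rules, features)
```

Because 'rule.matched' is filled in as rules are evaluated, reversing the rule list would leave the top-level rule unmatched, which is why the processing order matters.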
Kubernetes Prow Robot
da484b7bd3
Merge pull request #550 from marquiz/devel/custom-templating
Templating of custom label names
2021-11-23 12:02:51 -08:00
Markus Lehtonen
c8d73666d6 pkg/apis/nfd: support label name templating
Support templating of label names in feature rules. It is available both
in NodeFeatureRule CRs and in custom rule configuration of nfd-worker.

This patch adds a new 'labelsTemplate' field to the rule spec, making it
possible to dynamically generate multiple labels per rule based on the
matched features. The feature relies on the golang "text/template"
package.  When expanded, the template must contain labels in a raw
<key>[=<value>] format (where 'value' defaults to "true"), separated by
newlines i.e.:

  - name: <rule-name>
    labelsTemplate: |
      <label-1>[=<value-1>]
      <label-2>[=<value-2>]
      ...

All the matched features of 'matchFeatures' directives are available for
templating engine in a nested data structure that can be described in
yaml as:

.
  <domain-1>:
      <key-feature-1>:
        - Name: <matched-key>
        - ...

      <value-feature-1>:
        - Name: <matched-key>
          Value: <matched-value>
        - ...

      <instance-feature-1>:
        - <attribute-1-name>: <attribute-1-value>
          <attribute-2-name>: <attribute-2-value>
          ...
        - ...

  <domain-2>:
     ...

That is, the per-feature data available for matching depends on the type
of feature that was matched:

- "key features": only 'Name' is available
- "value features": 'Name' and 'Value' can be used
- "instance features": all attributes of the matched instance are
   available

NOTE: If matchAny is specified, the template is executed
separately against each individual matchFeatures matcher and the
eventual set of labels is a superset of all these expansions.  Consider
the following:

  - name: <name>
    labelsTemplate: <template>
    matchAny:
      - matchFeatures: <matcher#1>
      - matchFeatures: <matcher#2>
    matchFeatures: <matcher#3>

In the example above (assuming the overall result is a match) the
template would be executed on matcher#1 and/or matcher#2 (depending on
whether both or only one of them match), and finally on matcher#3, and
all the labels from these separate expansions would be created (i.e. the
end result would be a union of all the individual expansions).

NOTE 2: The 'labels' field has priority over 'labelsTemplate', i.e.
labels specified in the 'labels' field will override any labels
originating from the 'labelsTemplate' field.

A special case of an empty match expression set matches everything (i.e.
matches/returns all existing keys/values). This makes it simpler to
write templates that run over all values. It also makes it possible to
later implement support for templates that run over all _keys_ of a
feature.

Some example configurations:

  - name: "my-pci-template-features"
    labelsTemplate: |
      {{ range .pci.device }}intel-{{ .class }}-{{ .device }}=present
      {{ end }}
    matchFeatures:
      - feature: pci.device
        matchExpressions:
          class: {op: InRegexp, value: ["^06"]}
          vendor: ["8086"]

  - name: "my-system-template-features"
    labelsTemplate: |
      {{ range .system.osrelease }}system-{{ .Name }}={{ .Value }}
      {{ end }}
    matchFeatures:
      - feature: system.osrelease
        matchExpressions:
          ID: {op: Exists}
          VERSION_ID.major: {op: Exists}

Imaginative template pipelines are possible, of course, but care must be
taken in order to produce understandable and maintainable rule sets.
2021-11-23 21:03:22 +02:00
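The post-processing described in commit c8d73666d6 (parsing the expanded template, taking the union over matchers, and letting 'labels' override 'labelsTemplate') can be sketched as below. This covers only what happens after the template has been expanded; it does not run the Go "text/template" engine. The function names are hypothetical, and treating `key=` as an empty value is an assumption.

```python
from typing import Dict, List


def parse_expanded_labels(expanded: str) -> Dict[str, str]:
    """Parse template output made of <key>[=<value>] lines."""
    labels: Dict[str, str] = {}
    for line in expanded.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines left over from template whitespace
        key, sep, value = line.partition("=")
        labels[key] = value if sep else "true"  # value defaults to "true"
    return labels


def merge_rule_labels(static_labels: Dict[str, str],
                      expansions: List[str]) -> Dict[str, str]:
    """Union all per-matcher expansions, then let the 'labels' field win."""
    merged: Dict[str, str] = {}
    for expanded in expansions:
        merged.update(parse_expanded_labels(expanded))
    merged.update(static_labels)  # 'labels' has priority over 'labelsTemplate'
    return merged


# One expansion per matcher that matched (e.g. a matchAny entry and matchFeatures).
expansions = [
    "intel-0600-1234=present\nintel-0604-abcd=present\n",
    "has-bridge\n",
]
result = merge_rule_labels({"has-bridge": "yes"}, expansions)
```

Here `result` holds the two `intel-...` labels with the value "present" and `has-bridge` with the value "yes", since the statically defined label overrides the templated one.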
Markus Lehtonen
c3da439d21 source/memory: implement FeatureSource
Separate feature discovery and creation of feature labels.

Generalize the discovery of nvdimm devices so that they can be matched
in custom label rules in a similar fashion as pci and usb devices.
Available attributes for matching nvdimm devices are limited to:

- devtype
- mode

For numa we now detect the number of numa nodes which can be matched
against in custom label rules.

Labels created by the memory feature source are unchanged. The new
features being detected are available in custom rules only.

Example custom rule:

  - name: "my memory rule"
    labels:
      my-memory-feature: "true"
    matchFeatures:
      - feature: memory.numa
        matchExpressions:
          "node_count": {op: Gt, value: ["3"]}
      - feature: memory.nv
        matchExpressions:
          "devtype": {op: In, value: ["nd_dax"]}

Also, add minimalist unit test.
2021-11-23 15:08:15 +02:00
Markus Lehtonen
9a02b544a2 source/network: implement FeatureSource
Separate feature discovery and creation of feature labels. Generalize
the feature discovery so that network devices can be matched in custom
label rules in a similar fashion as pci and usb devices. Available
attributes for matching are:

- operstate
- speed
- sriov_numvfs
- sriov_totalvfs

Labels created by the network feature source are unchanged. The new
features being detected are available in custom rules only.

Example custom rule:

  - name: "my network rule"
    labels:
      my-network-feature: "true"
    matchFeatures:
      - feature: network.device
        matchExpressions:
          "operstate": { op: In, value: ["up"] }
          "sriov_numvfs": { op: Gt, value: ["9"] }

Also, add minimalist unit test.
2021-11-23 10:05:38 +02:00
Markus Lehtonen
0a96359f29 deployment: fix mistake in example worker config 2021-11-23 10:01:41 +02:00
Kubernetes Prow Robot
99d3251c42
Merge pull request #649 from marquiz/devel/storage-feature-source
source/storage: implement FeatureSource
2021-11-22 11:31:32 -08:00
Markus Lehtonen
e8872462dc nfd-master: add -featurerules-controller flag
Add a new command line flag for disabling/enabling the controller for
NodeFeatureRule objects. In practice, disabling the controller disables
all labels generated from rules in NodeFeatureRule objects.
2021-11-22 16:57:42 +02:00