1
0
Fork 0
mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2025-03-17 13:58:21 +00:00
Commit graph

19 commits

Author SHA1 Message Date
Markus Lehtonen
0a96359f29 deployment: fix mistake in example worker config 2021-11-23 10:01:41 +02:00
Kubernetes Prow Robot
99d3251c42
Merge pull request #649 from marquiz/devel/storage-feature-source
source/storage: implement FeatureSource
2021-11-22 11:31:32 -08:00
Markus Lehtonen
e8872462dc nfd-master: add -featurerules-controller flag
Add a new command line flag for disabling/enabling the controller for
NodeFeatureRule objects. In practice, disabling the controller disables
all labels generated from rules in NodeFeatureRule objects.
2021-11-22 16:57:42 +02:00
Markus Lehtonen
e6e32a88c3 nfd-master: implement controller for NodeFeatureRule CRs
Implement a simple controller stub that operates on NodeFeatureRule
objects. The controller does not yet have any functionality other than
logging changes in the (NodeFeatureRule) objecs it is watching.

Also update the documentation on the -no-publish flag to match the new
functionality.
2021-11-22 16:57:42 +02:00
Markus Lehtonen
999628418b source/storage: implement FeatureSource
Separate feature discovery and creation of feature labels. Generalize
the feature discovery so that block devices can be matched in custom
label rules in a similar fashion as pci and usb devices. This extends
the discovery to other block queue attributes than 'rotational': now we
also detect 'dax', 'nr_zones' and 'zoned'.

Labels created by the storage feature source are unchanged. The new
features being detected are available in custom rules only.

Example custom rules:

  - name: "my block rule 1"
    labels:
      my-block-feature-1: "true"
    matchFeatures:
      - feature: storage.block
          "rotational": {op: In, value: ["0"]}

  - name: "my block rule 2"
    labels:
      my-block-feature-2: "true"
    matchFeatures:
      - feature: storage.block
          "zoned": {op: In, value: [“host-aware”, “host-managed”]}

Also, add minimalist unit test.
2021-11-18 14:58:33 +02:00
Markus Lehtonen
b96b86bc6c pkg/apis/nfd: drop excess field from the CRD
Drop stale leftover "LabelsTemplate" field from the rule spec.
2021-11-17 16:40:28 +02:00
Markus Lehtonen
c3e2315834 pkg/apis/nfd: specify CRD for custom labeling rules
Add a cluster-scoped Custom Resource Definition for specifying labeling
rules. Nodes (node features, node objects) are cluster-level objects and
thus the natural and encouraged setup is to only have one NFD deployment
per cluster - the set of underlying features of the node stays the same
independent of how many parallel NFD deployments you have. Our extension
points (hooks, feature files and now CRs) can be be used by multiple
actors (depending on us) simultaneously. Having the CRD cluster-scoped
hopefully drives deployments in this direction. It also should make
deployment of vendor-specific labeling rules easy as there is no need to
worry about the namespace.

This patch virtually replicates the source.custom.FeatureSpec in a CRD
API (located in the pkg/apis/nfd/v1alpha1 package) with the notable
exception that "MatchOn" legacy rules are not supported. Legacy rules
are left out in order to keep the CRD simple and clean.

The duplicate functionality in source/custom will be dropped by upcoming
patches.

This patch utilizes controller-gen (from sigs.k8s.io/controller-tools)
for generating the CRD and deepcopy methods. Code can be (re-)generated
with "make generate". Install controller-gen with:

  go install sigs.k8s.io/controller-tools/cmd/controller-gen@v0.7.0

Update kustomize and helm deployments to deploy the CRD.
2021-11-17 13:40:23 +02:00
Kubernetes Prow Robot
c29ab3bb35
Merge pull request #652 from k8stopologyawareschedwg/consume-cluster-scoped-api
NFD-Topology-Updater: Bump NRT API to version v0.0.12
2021-11-16 07:13:27 -08:00
Swati Sehgal
b444ef95a8 NFD-Topology-Updater: Bump NRT API to version v0.0.12
The NodeResourceTopology API has been made cluster
scoped as in the current context a CR corresponds to
a Node and since Node is a cluster scoped resource it
makes sense to make NRT cluster scoped as well.

Ref: https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/18
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-11-16 13:28:23 +00:00
Markus Lehtonen
6cbed379df source/custom: implement matchAny directive
Implement a new 'matchAny' directive in the new rule format, building on
top of the previously implemented 'matchFeatures' matcher. MatchAny
applies a logical OR over multiple matchFeatures directives. That is, it
allows specifying multiple alternative matchers (at least one of which
must match) in a single label rule.

The configuration format for the new matchers is

  matchAny:
    - matchFeatures:
        - feature: <domain>.<feature>
          matchExpressions:
            <attribute>:
              op: <operator>
              value:
                - <list-of-values>
    - matchFeatures:
      ...

A configuration example. In order to require a cpu feature, kernel
module and one of two specific PCI devices (taking use of the shortform
notation):

  - name: multi-device-test
    labels:
      multi-device-feature: "true"
    matchFeatures:
      - feature: kernel.loadedmodule
        matchExpressions: [driver-module]
      - feature: cpu.cpuid
        matchExpressions: [AVX512F]
    matchAny:
      - matchFeatures:
          - feature; pci.device
            matchExpressions:
              vendor: "8086"
              device: "1234"
      - matchFeatures:
          - feature: pci.device
            matchExpressions:
              vendor: "8086"
              device: "abcd"
2021-11-12 16:51:30 +02:00
Markus Lehtonen
e206f0b86b source/custom: implement generic feature matching
Implement generic feature matchers that cover all feature sources (that
implement the FeatureSource interface). The implementation relies on the
unified data model provided by the FeatureSource interface as well as
the generic expression-based rule processing framework that was added to
the source/custom/expression package.

With this patch any new features added will be automatically available
for custom rules, without any additional work. Rule hierarchy follows
the source/feature hierarchy by design.

This patch introduces a new format for custom rule specifications,
dropping the 'value' field and introducing new 'labels' field which
makes it possible to specify multiple labels per rule. Also, in the new
format the 'name' field is just for reference and no matching label is
created. The new generic rules are available in this new rule format
under a 'matchFeatures. MatchFeatures implements a logical AND over
an array of per-feature matchers - i.e. a match for all of the matchers
is required. The goal of the new rule format is to make it better follow
K8s API design guidelines and make it extensible for future enhancements
(e.g. addition of templating, taints, annotations, extended resources
etc).

The old rule format (with cpuID, kConfig, loadedKMod, nodename, pciID,
usbID rules) is still supported. The rule format (new vs. old) is
determined at config parsing time based on the existence of the
'matchOn' field.

The new rule format and the configuration format for the new
matchFeatures field is

  - name: <rule-name>
    labels:
      <key>: <value>
      ...
    matchFeatures:
      - feature: <domain>.<feature>
        matchExpressions:
          <attribute>:
            op: <operator>
            value:
              - <list-of-values>
      - feature: <domain>.<feature>
        ...

Currently, "cpu", "kernel", "pci", "system", "usb" and "local" sources
are covered by the matshers/feature selectors. Thus, the following
features are available for matching with this patch:

  - cpu.cpuid:
      <cpuid-flag>: <exists/does-not-exist>
  - cpu.cstate:
      enabled: <bool>
  - cpu.pstate:
      status: <string>
      turbo: <bool>
      scaling_governor: <string>
  - cpu.rdt:
      <rdt-feature>: <exists/does-not-exist>
  - cpu.sst:
      bf.enabled: <bool>
  - cpu.topology:
      hardware_multithreading: <bool>
  - kernel.config:
      <flag-name>: <string>
  - kernel.loadedmodule:
      <module-name>: <exists/does-not-exist>
  - kernel.selinux:
      enabled: <bool>
  - kernel.version:
      major: <int>
      minor: <int>
      revision: <int>
      full: <string>
  - system.osrelease:
      <key-name>: <string>
      VERSION_ID.major: <int>
      VERSION_ID.minor: <int>
  - system.name:
      nodename: <string>
  - pci.device:
      <device-instance>:
        class: <string>
        vendor: <string>
        device: <string>
        subsystem_vendor: <string>
        susbystem_device: <string>
        sriov_totalvfs: <int>
  - usb.device:
      <device-instance>:
        class: <string>
        vendor: <string>
        device: <string>
        serial: <string>
  - local.label:
      <label-name>: <string>

The configuration also supports some "shortforms" for convenience:

   matchExpressions: [<attr-1>, <attr-2>=<val-2>]
   ---
   matchExpressions:
     <attr-3>:
     <attr-4>: <val-4>

is equal to:

   matchExpressions:
     <attr-1>: {op: Exists}
     <attr-2>: {op: In, value: [<val-2>]}
   ---
   matchExpressions:
     <attr-3>: {op: Exists}
     <attr-4>: {op: In, value: [<val-4>]}

In other words:

  - feature: kernel.config
    matchExpressions: ["X86", "INIT_ENV_ARG_LIMIT=32"]
  - feature: pci.device
    matchExpressions:
      vendor: "8086"

is the same as:

  - feature: kernel.config
    matchExpressions:
      X86: {op: Exists}
      INIT_ENV_ARG_LIMIT: {op: In, values: ["32"]}
  - feature: pci.device
    matchExpressions:
      vendor: {op: In, value: ["8086"]

Some configuration examples below. In order to match a CPUID feature the
following snippet can be used:

  - name: cpu-test-1
    labels:
      cpu-custom-feature: "true"
    matchFeatures:
      - feature: cpu.cpuid
        matchExpressions:
          AESNI: {op: Exists}
          AVX: {op: Exists}

In order to match against a loaded kernel module and OS version:

  - name: kernel-test-1
    labels:
      kernel-custom-feature: "true"
    matchFeatures:
      - feature: kernel.loadedmodule
        matchExpressions:
          e1000: {op: Exists}
      - feature: system.osrelease
        matchExpressions:
          NAME: {op: InRegexp, values: ["^openSUSE"]}
          VERSION_ID.major: {op: Gt, values: ["14"]}

In order to require a kernel module and both of two specific PCI devices:

  - name: multi-device-test
    labels:
      multi-device-feature: "true"
    matchFeatures:
      - feature: kernel.loadedmodule
        matchExpressions:
          driver-module: {op: Exists}
      - pci.device:
          vendor: "8086"
          device: "1234"
      - pci.device:
          vendor: "8086"
          device: "abcd"
2021-11-12 16:51:13 +02:00
Elias Koromilas
e22b937391 Implicitly generate the worker ConfigMap name
Signed-off-by: Elias Koromilas <elias.koromilas@gmail.com>
2021-11-03 11:21:58 +02:00
Wei Zhang
158a5590ab deployment: add topology updater helm chart
Signed-off-by: Wei Zhang <kweizh@gmail.com>
2021-10-26 10:52:40 +08:00
Elias Koromilas
c17a898c4c
deployment: Simplify NFD worker configuration in Helm (#627)
* Simplify NFD worker service configuration in Helm

Signed-off-by: Elias Koromilas <elias.koromilas@gmail.com>

* Update docs/get-started/deployment-and-usage.md

Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>

Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
2021-10-25 09:34:23 -07:00
Markus Lehtonen
890d9455f1 deployment/helm: don't force sleep-interval in worker cmdline flags
Drop --sleep-interval from the template. We really don't want to do that
as. First, it's the default value so no use repeating that in the
template. And more importantly, the commandline flag will override
anything that will be provided in the worker config file, making it
impossible for users to specify the sleep interval (other than by
editing the template directly).
2021-10-21 11:33:19 +03:00
Markus Lehtonen
3706de9308 deployment: fix formatting of the worker conf sample 2021-09-17 14:25:48 +03:00
Jorik Jonker
501ff37592 deployment: optional mount of /usr/src
This commit makes the mount of /usr/src optional in the Helm chart, and
removes it from the kustomization. Reason is that some systems do not
have a /usr/src (such as Talos) *and* have a R/O filesystem. Since
/usr/src is optional per FHS 3.0, NFD should not assume its presence.

Signed-off-by: Jorik Jonker <jorik@kippendief.biz>
2021-08-26 10:52:26 +02:00
Carlos Eduardo Arango Gutierrez
dece85b394
Add livenessProbe via grpc to nfd-master
Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>
2021-08-18 10:23:10 -05:00
Markus Lehtonen
0f2554abf1 helm: move files under deployment/helm 2021-08-16 14:44:26 +03:00