1
0
Fork 0
mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00
Commit graph

154 commits

Author SHA1 Message Date
Markus Lehtonen
bd74de992f source/local: log features per each hook and feature file
With `-v 4` user can now see the features originating per each file.
Makes debugging easier.
2021-12-09 19:14:17 +02:00
Markus Lehtonen
82e14300a4 source/fake: implement FeatureSource
Makes it possible to create fake features for custom rules, enabling
testing.
2021-12-07 10:34:41 +02:00
Markus Lehtonen
969151e1b8 source/fake: drop stale Configure() method
Also correct some comments.
2021-12-07 10:34:41 +02:00
Markus Lehtonen
a5e2d4ca16 source: make per-source unit tests stricter
Now the tests check that GetLabels() works even without calling
Discover() at all.
2021-12-03 10:26:26 +02:00
Markus Lehtonen
6f881a4822 source/kernel: unmangled kconfig values for custom rules
Stop converting "=y" and "=m" to "true" for the raw feature values used
in "kernel.config" custom rule processing.

In practice, this means that to check if a kernel config flag has been
set to "y" or "m", one needs to explicitly check for both of the values:

  matchFeatures:
    - feature: kernel.config
      matchExpressions:
        FOO: {op: In, value: ["y", "m"]}

instead of (how it used to be):

  matchFeatures:
    - feature: kernel.config
      matchExpressions:
        FOO: {op: IsTrue}

The legacy kconfig custom rule is unchanged as are the
kernel-config.<flag> feature labels.
2021-12-02 10:06:21 +02:00
Markus Lehtonen
0df88ac6f6 source/kernel: drop length check of kconfig values
Do not do length checking here. We do not need/want to limit the values
here because they could still be used in custom rules. Moreover, we do
more proper validation of label all label values in nfd-worker, anyway.
2021-12-01 17:21:53 +02:00
Kubernetes Prow Robot
5ae7b8bb7a
Merge pull request #683 from marquiz/devel/kconfig-regexp
source/kernel: ditch regexp in kconfig parsing
2021-12-01 06:53:15 -08:00
Kubernetes Prow Robot
a86318fb92
Merge pull request #675 from jschintag/s390x-new-cpu-flags
source/cpu: add additional IBM Z CPU Flags
2021-11-30 04:52:56 -08:00
Markus Lehtonen
d4efc2ce0b source/kernel: ditch regexp in kconfig parsing
Regexp is just a resource hog here and does not simplify the code.
2021-11-30 14:43:05 +02:00
Jan Schintag
c545cfb0dc source/cpu: add additional IBM Z CPU Flags
Add the newly supported s390x CPU Flags for the 5.15 Kernel

Signed-off-by: Jan Schintag <jan.schintag@de.ibm.com>
2021-11-25 13:41:57 +01:00
Jan Schintag
476756ab0c source/cpu: Fix compile error for non-amd64 arches
Signed-off-by: Jan Schintag <jan.schintag@de.ibm.com>
2021-11-25 13:41:57 +01:00
Markus Lehtonen
f75303ce43 pkg/apis/nfd: add variables to rule spec and support backreferences
Support backreferencing of output values from previous rules. Enables
complex rule setups where custom features are further combined together
to form even more sophisticated higher level labels. The labels created
by preceding rules are available as a special 'rule.matched' feature
(for matchFeatures to use).

If referencing rules accross multiple configs/CRDs care must be taken
with the ordering. Processing order of rules in nfd-worker:

1. Static rules
2. Files from /etc/kubernetes/node-feature-discovery/custom.d/
   in alphabetical order. Subdirectories are processed by reading their
   files in alphabetical order.
3. Custom rules from main nfd-worker.conf

In nfd-master, NodeFeatureRule objects are processed in alphabetical
order (based on their metadata.name).

This patch also adds new 'vars' fields to the rule spec. Like 'labels',
it is a map of key-value pairs but no labels are generated from these.
The values specified in 'vars' are only added for backreferencing into
the 'rules.matched' feature. This may by desired in schemes where the
output of certain rules is only used as intermediate variables for other
rules and no labels out of these are wanted.

An example setup:

  - name: "kernel feature"
    labels:
      kernel-feature:
    matchFeatures:
      - feature: kernel.version
        matchExpressions:
          major: {op: Gt, value: ["4"]}

  - name: "intermediate var feature"
    vars:
      nolabel-feature: "true"
    matchFeatures:
      - feature: cpu.cpuid
        matchExpressions:
          AVX512F: {op: Exists}
      - feature: pci.device
        matchExpressions:
          vendor: {op: In, value: ["8086"]}
          device: {op: In, value: ["1234", "1235"]}

  - name: top-level-feature
    matchFeatures:
      - feature: rule.matched
        matchExpressions:
          kernel-feature: "true"
          nolabel-feature: "true"
2021-11-25 12:50:47 +02:00
Kubernetes Prow Robot
b46a95f7ed
Merge pull request #666 from marquiz/fixes/memory
source/memory: fix memory.numa label
2021-11-24 06:34:20 -08:00
Kubernetes Prow Robot
7e4dd6c8d2
Merge pull request #665 from marquiz/fixes/selinux
source/kernel: don't advertise selinux.enabled=false
2021-11-24 06:26:20 -08:00
Markus Lehtonen
17b094d8d6 source/memory: fix memory.numa label 2021-11-23 20:35:17 +02:00
Markus Lehtonen
d60fcb98e4 source/kernel: don't advertise selinux.enabled=false
Commit 0945019161 changed the behavior so
that NFD started to advertise also "false" status of selinux.enabled
label. This patch reverts this behavior (i.e. we only have
selinux.enabled=true). The rationale behind is avoiding any excess
labels - selinux.enabled=false label would be pointless noise in most
deployments.
2021-11-23 20:12:20 +02:00
Mikko Ylinen
8a39434659 source/cpu: detect Intel SGX
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-11-23 15:57:31 +02:00
Markus Lehtonen
c3da439d21 source/memory: implement FeatureSource
Separate feature discovery and creation of feature labels.

Generalize the discovery of nvdimm devices so that they can be matched
in custom label rules in a similar fashion as pci and usb devices.
Available attributes for matching nvdimm devices are limited to:

- devtype
- mode

For numa we now detect the number of numa nodes which can be matched
agains in custom label rules.

Labels created by the memory feature source are unchanged. The new
features being detected are available in custom rules only.

Example custom rule:

  - name: "my memory rule"
    labels:
      my-memory-feature: "true"
    matchFeatures:
      - feature: memory.numa
        matchExpressions:
          "node_count": {op: Gt, value: ["3"]}
      - feature: memory.nv
        matchExpressions:
          "devtype" {op: In, value: ["nd_dax"]}

Also, add minimalist unit test.
2021-11-23 15:08:15 +02:00
Markus Lehtonen
9a02b544a2 source/network: implement FeatureSource
Separate feature discovery and creation of feature labels. Generalize
the feature discovery so that network devices can be matched in custom
label rules in a similar fashion as pci and usb devices. Available
attributes for matching are:

- operstate
- speed
- sriov_numvfs
- sriov_totalvfs

Labels created by the network feature source are unchanged. The new
features being detected are available in custom rules only.

Example custom rule:

  - name: "my network rule"
    labels:
      my-network-feature: "true"
    matchFeatures:
      - feature: network.device
        matchExpressions:
          "operstate": { op: In, value: ["up"] }
          "sriov_numvfs": { op: Gt, value: ["9"] }

Also, add minimalist unit test.
2021-11-23 10:05:38 +02:00
Markus Lehtonen
999628418b source/storage: implement FeatureSource
Separate feature discovery and creation of feature labels. Generalize
the feature discovery so that block devices can be matched in custom
label rules in a similar fashion as pci and usb devices. This extends
the discovery to other block queue attributes than 'rotational': now we
also detect 'dax', 'nr_zones' and 'zoned'.

Labels created by the storage feature source are unchanged. The new
features being detected are available in custom rules only.

Example custom rules:

  - name: "my block rule 1"
    labels:
      my-block-feature-1: "true"
    matchFeatures:
      - feature: storage.block
          "rotational": {op: In, value: ["0"]}

  - name: "my block rule 2"
    labels:
      my-block-feature-2: "true"
    matchFeatures:
      - feature: storage.block
          "zoned": {op: In, value: [“host-aware”, “host-managed”]}

Also, add minimalist unit test.
2021-11-18 14:58:33 +02:00
Markus Lehtonen
8b9df3cf31 source/custom: move rule matching to pkg/apis/nfd
Move the rule processing of matchFeatures and matchAny from
source/custom package over to pkg/apis/nfd, aiming for better integrity
and re-usability of the code. Does not change the CRD API as such, just
adds more supportive functions.
2021-11-17 14:02:00 +02:00
Markus Lehtonen
f3cc109f99 pkg/apis/nfd: work around issues with k8s deepcopy-gen
Without this hack the generated code does not compile.
2021-11-17 13:40:43 +02:00
Markus Lehtonen
0757248055 source/custom: move rule expressions to pkg/apis/nfd/v1alpha1
Create a new package pkg/apis/nfd/v1alpha1 and migrate the custom rule
expressions over there. This is the first step in creating a new CRD API
for custom rules.
2021-11-16 18:12:16 +02:00
Markus Lehtonen
52d9aa2244 source/custom: improved logging of expression matching
Print out the result of applying an expression. Also, truncate the
output to max 10 elements (of items matched against) unless '-v 4'
verbosity level is in use.
2021-11-12 16:51:30 +02:00
Markus Lehtonen
6cbed379df source/custom: implement matchAny directive
Implement a new 'matchAny' directive in the new rule format, building on
top of the previously implemented 'matchFeatures' matcher. MatchAny
applies a logical OR over multiple matchFeatures directives. That is, it
allows specifying multiple alternative matchers (at least one of which
must match) in a single label rule.

The configuration format for the new matchers is

  matchAny:
    - matchFeatures:
        - feature: <domain>.<feature>
          matchExpressions:
            <attribute>:
              op: <operator>
              value:
                - <list-of-values>
    - matchFeatures:
      ...

A configuration example. In order to require a cpu feature, kernel
module and one of two specific PCI devices (taking use of the shortform
notation):

  - name: multi-device-test
    labels:
      multi-device-feature: "true"
    matchFeatures:
      - feature: kernel.loadedmodule
        matchExpressions: [driver-module]
      - feature: cpu.cpuid
        matchExpressions: [AVX512F]
    matchAny:
      - matchFeatures:
          - feature; pci.device
            matchExpressions:
              vendor: "8086"
              device: "1234"
      - matchFeatures:
          - feature: pci.device
            matchExpressions:
              vendor: "8086"
              device: "abcd"
2021-11-12 16:51:30 +02:00
Markus Lehtonen
e206f0b86b source/custom: implement generic feature matching
Implement generic feature matchers that cover all feature sources (that
implement the FeatureSource interface). The implementation relies on the
unified data model provided by the FeatureSource interface as well as
the generic expression-based rule processing framework that was added to
the source/custom/expression package.

With this patch any new features added will be automatically available
for custom rules, without any additional work. Rule hierarchy follows
the source/feature hierarchy by design.

This patch introduces a new format for custom rule specifications,
dropping the 'value' field and introducing new 'labels' field which
makes it possible to specify multiple labels per rule. Also, in the new
format the 'name' field is just for reference and no matching label is
created. The new generic rules are available in this new rule format
under a 'matchFeatures. MatchFeatures implements a logical AND over
an array of per-feature matchers - i.e. a match for all of the matchers
is required. The goal of the new rule format is to make it better follow
K8s API design guidelines and make it extensible for future enhancements
(e.g. addition of templating, taints, annotations, extended resources
etc).

The old rule format (with cpuID, kConfig, loadedKMod, nodename, pciID,
usbID rules) is still supported. The rule format (new vs. old) is
determined at config parsing time based on the existence of the
'matchOn' field.

The new rule format and the configuration format for the new
matchFeatures field is

  - name: <rule-name>
    labels:
      <key>: <value>
      ...
    matchFeatures:
      - feature: <domain>.<feature>
        matchExpressions:
          <attribute>:
            op: <operator>
            value:
              - <list-of-values>
      - feature: <domain>.<feature>
        ...

Currently, "cpu", "kernel", "pci", "system", "usb" and "local" sources
are covered by the matshers/feature selectors. Thus, the following
features are available for matching with this patch:

  - cpu.cpuid:
      <cpuid-flag>: <exists/does-not-exist>
  - cpu.cstate:
      enabled: <bool>
  - cpu.pstate:
      status: <string>
      turbo: <bool>
      scaling_governor: <string>
  - cpu.rdt:
      <rdt-feature>: <exists/does-not-exist>
  - cpu.sst:
      bf.enabled: <bool>
  - cpu.topology:
      hardware_multithreading: <bool>
  - kernel.config:
      <flag-name>: <string>
  - kernel.loadedmodule:
      <module-name>: <exists/does-not-exist>
  - kernel.selinux:
      enabled: <bool>
  - kernel.version:
      major: <int>
      minor: <int>
      revision: <int>
      full: <string>
  - system.osrelease:
      <key-name>: <string>
      VERSION_ID.major: <int>
      VERSION_ID.minor: <int>
  - system.name:
      nodename: <string>
  - pci.device:
      <device-instance>:
        class: <string>
        vendor: <string>
        device: <string>
        subsystem_vendor: <string>
        susbystem_device: <string>
        sriov_totalvfs: <int>
  - usb.device:
      <device-instance>:
        class: <string>
        vendor: <string>
        device: <string>
        serial: <string>
  - local.label:
      <label-name>: <string>

The configuration also supports some "shortforms" for convenience:

   matchExpressions: [<attr-1>, <attr-2>=<val-2>]
   ---
   matchExpressions:
     <attr-3>:
     <attr-4>: <val-4>

is equal to:

   matchExpressions:
     <attr-1>: {op: Exists}
     <attr-2>: {op: In, value: [<val-2>]}
   ---
   matchExpressions:
     <attr-3>: {op: Exists}
     <attr-4>: {op: In, value: [<val-4>]}

In other words:

  - feature: kernel.config
    matchExpressions: ["X86", "INIT_ENV_ARG_LIMIT=32"]
  - feature: pci.device
    matchExpressions:
      vendor: "8086"

is the same as:

  - feature: kernel.config
    matchExpressions:
      X86: {op: Exists}
      INIT_ENV_ARG_LIMIT: {op: In, values: ["32"]}
  - feature: pci.device
    matchExpressions:
      vendor: {op: In, value: ["8086"]

Some configuration examples below. In order to match a CPUID feature the
following snippet can be used:

  - name: cpu-test-1
    labels:
      cpu-custom-feature: "true"
    matchFeatures:
      - feature: cpu.cpuid
        matchExpressions:
          AESNI: {op: Exists}
          AVX: {op: Exists}

In order to match against a loaded kernel module and OS version:

  - name: kernel-test-1
    labels:
      kernel-custom-feature: "true"
    matchFeatures:
      - feature: kernel.loadedmodule
        matchExpressions:
          e1000: {op: Exists}
      - feature: system.osrelease
        matchExpressions:
          NAME: {op: InRegexp, values: ["^openSUSE"]}
          VERSION_ID.major: {op: Gt, values: ["14"]}

In order to require a kernel module and both of two specific PCI devices:

  - name: multi-device-test
    labels:
      multi-device-feature: "true"
    matchFeatures:
      - feature: kernel.loadedmodule
        matchExpressions:
          driver-module: {op: Exists}
      - pci.device:
          vendor: "8086"
          device: "1234"
      - pci.device:
          vendor: "8086"
          device: "abcd"
2021-11-12 16:51:13 +02:00
Markus Lehtonen
689703be48 source/custom: implement 'GtLt' operator
A new operator for checking that an input (integer) is between two
values.
2021-11-11 19:59:34 +02:00
Markus Lehtonen
8b4314bbbb source/custom: expression based label rules
Implement a framework for more flexible rule configuration and matching,
mimicking the MatchExpressions pattern from K8s nodeselector.

The basic building block is MatchExpression which contains an operator
and a list of values. The operator specifies that "function" that is
applied when evaluating a given input agains the list of values.
Available operators are:

- MatchIn
- MatchNotIn
- MatchInRegexp
- MatchExists
- MatchDoesNotExist
- MatchGt
- MatchLt
- MatchIsTrue
- MatchIsFalse

Another building block of the framework is MatchExpressionSet which is a
map of string-MatchExpression pairs. It is a helper for specifying
multiple expressions that can be matched against a set of set of
features.

This patch converts all existing custom rules to utilize the new
expression-based framework.
2021-11-11 19:59:34 +02:00
Markus Lehtonen
a91f3325ba source/local: implement FeatureSource
Separate feature discovery (i.e. running hooks and reading feature
files) and creation of feature labels in the local source.

Also, add minimalist unit test.
2021-11-11 18:34:01 +02:00
Markus Lehtonen
e225f4aad0 source/system: implement FeatureSource
Separate feature discovery and creation of feature labels in the system
source.

Also, change the implementation of the nodeName custom rule to utilize
the FeatureSource interface of the system source.

Also, add minimalist unit test.
2021-11-11 18:33:58 +02:00
Markus Lehtonen
5cf25dc4e9 source/custom: move kernel module detection to source/kernel
Move the functionality responsible for detection of loeaded kernel
modules from source/custom over to the source/kernel package. Add a new
"loadedmodule" raw feature to the kernel source to store this
information.

Change loadedKmod custom rule to utilize kernel source.
2021-11-11 18:33:58 +02:00
Markus Lehtonen
df27327f14 source/usb: implement FeatureSource
Separate feature discovery and creation of feature labels in the usb
source.

Move usb_utils from source/internal to the source/usb package. Change
the implementation of the UsbID custom rule to utilize the FeatureSource
interface of the usb source.

Also, add minimalist unit test.
2021-11-11 18:33:53 +02:00
Markus Lehtonen
af0c683f60 source/pci: implement FeatureSource
Separate feature discovery and creation of feature labels in the pci
source.

Move pci_utils from source/internal to the source/pci package. Change
the implementation of the PciID custom rule to utilize the FeatureSource
interface of the pci source.

Also, add minimalist unit test.
2021-11-11 18:33:53 +02:00
Markus Lehtonen
03bf94a8ad source/cpu: implement FeatureSource
Convert the cpu source to do feature discovery and creation of feature
labels separately.

Move cpuidutils from source/internal to the source/cpu package. Change
the cpuid custom rule to utilize GetFeatures of the cpu source.

Also, add minimalist unit test.
2021-11-11 18:33:40 +02:00
Markus Lehtonen
0945019161 source/kernel: implement FeatureSource
Separate feature discovery and creation of feature labels in the kernel
source.

Move kernelutils from source/internal back to the source/kernel package.
Change the kconfig custom rule to rely on the FeatureSource interface
(GetFeatures()) of the kernel source.

Also, add minimalist unit test.
2021-11-11 18:33:40 +02:00
Markus Lehtonen
4cfb3203f6 source: fix gofmt errors
The tool got pickier with golang v1.17.
2021-10-22 12:01:31 +03:00
David Gray
3d9b18b087 Trim single quotes in parseOSRelease
Signed-off-by: David Gray <dagray@redhat.com>
2021-09-22 15:04:44 -04:00
Kubernetes Prow Robot
9cf732b64e
Merge pull request #602 from marquiz/devel/go-generate
Utilize go generate
2021-09-21 06:16:24 -07:00
Kubernetes Prow Robot
064391f310
Merge pull request #601 from marquiz/devel/feature-source-interface
source: introduce FeatureSource interface
2021-09-21 05:48:25 -07:00
Markus Lehtonen
51c0d70383 Update auto-generated code
Generated by running "make generate".
2021-09-21 13:37:36 +03:00
Markus Lehtonen
9487fbeb18 Utilize go generate
Use 'go generate' for auto-generating code. Drop the old 'mock' and
'apigen' makefile targets. Those are replaced with a single
  make generate

which (re-)generates everything.
2021-09-21 13:36:37 +03:00
Francesco Romani
b4c92e4eed topologyupdater: Bootstrap nfd-topology-updater in NFD
- This patch allows to expose Resource Hardware Topology information
  through CRDs in Node Feature Discovery.
- In order to do this we introduce another software component called
  nfd-topology-updater in addition to the already existing software
  components nfd-master and nfd-worker.
- nfd-master was enhanced to communicate with nfd-topology-updater
  over gRPC followed by creation of CRs corresponding to the nodes
  in the cluster exposing resource hardware topology information
  of that node.
- Pin kubernetes dependency to one that include pod resource implementation
- This code is responsible for obtaining hardware information from the system
  as well as pod resource information from the Pod Resource API in order to
  determine the allocatable resource information for each NUMA zone. This
  information along with Costs for NUMA zones (obtained by reading NUMA distances)
  is gathered by nfd-topology-updater running on all the nodes
  of the cluster and propagate NUMA zone costs to master in order to populate
  that information in the CRs corresponding to the nodes.
- We use GHW facilities for obtaining system information like CPUs, topology,
  NUMA distances etc.
- This also includes updates made to Makefile and Dockerfile and Manifests for
  deploying nfd-topology-updater.
- This patch includes unit tests
- As part of the Topology Aware Scheduling work, this patch captures
  the configured Topology manager scope in addition to the Topology manager policy.
  Based on the value of both attribues a single string will be populated to the CRD.
  The string value will be on of the following {SingleNUMANodeContainerLevel,
  SingleNUMANodePodLevel, BestEffort, Restricted, None}

Co-Authored-by: Artyom Lukianov <alukiano@redhat.com>
Co-Authored-by: Francesco Romani <fromani@redhat.com>
Co-Authored-by: Talor Itzhak <titzhak@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-09-21 10:47:39 +01:00
Markus Lehtonen
852cf4b61d source: introduce FeatureSource interface
Specify a new interface for managing "raw" feature data. This is the
first step to separate raw feature data from node labels. None of the
feature sources implement this interface, yet.

This patch unifies the data format of "raw" features by dividing them
into three different basic types.
- keys, a set of names without any associated values, e.g. CPUID flags
  or loaded kernel modules
- values, a map of key-value pairs, for features with a single value,
  e.g. kernel config flags or os version
- instances, a list of instances each of which has multiple attributes
  (key-value pairs of their own), e.g. PCI or USB devices

The new feature data types are defined in a new "pkg/api/feature"
package, catering decoupling and re-usability of code e.g. within future
extentions of the NFD gRPC API.

Rename the Discover() method of LabelSource interface to GetLabels().
2021-09-20 09:58:07 +03:00
Markus Lehtonen
81378a3235 source: make sources register themselves
Implement new registration infrastructure under the "source" package.
This change loosens the coupling between label sources and the
nfd-worker, making it easier to refactor and move the code around.

Also, create a separate interface (ConfigurableSource) for configurable
feature sources in order to eliminate boilerplate code.

Add safety checks to the sources that they actually implement the
interfaces they should.

In sake of consistency and predictability (of behavior) change all
methods of the sources to use pointer receivers.

Add simple unit tests for the new functionality and include source/...
into make test target.
2021-09-15 18:41:37 +03:00
Markus Lehtonen
befa7e9796 source: rename FeatureSource to LabelSource
Prepare for separating feature detection from label creation.
2021-09-13 22:48:33 +03:00
Markus Lehtonen
bd5ee9c616 source/network: silence annoying/useless log message
However, log an error if something unexpected happens, i.e. the file to
read maximum number of vfs exists (sriov_totalvfs) but read fails.
2021-09-13 09:40:06 +03:00
Kubernetes Prow Robot
e16d4c9b20
Merge pull request #543 from marquiz/devel/custom-kconfig-refactor
source/custom: refactor kconfig rule internal representation
2021-08-24 06:45:14 -07:00
Jan Schintag
ac0e5b1b52 cstate/pstate: Skip check on non intel arches
Intel driver is not available on other arches, skip check.

Signed-off-by: Jan Schintag <jan.schintag@de.ibm.com>
2021-08-19 16:37:54 +02:00
Markus Lehtonen
43e0f83940 source/cpu: better error reporting
Drop confusing errors in the log when intel pstate or cstate driver is
not enabled in the system. However, we still log an error if sysfs is
not available at all, in which case we're not able to detect these
correctly.
2021-08-13 09:16:03 +03:00
Markus Lehtonen
73704e2e11 source/kernel: better error reporting
Get rid of distracting error in the log in case selinux is not enabled
in the kernel. Still print an error only if sysfs/fs directory is not
available, though, which indicates that we're not able to correctly
detect the presence of selinux.
2021-08-13 09:13:19 +03:00