mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00
Commit graph

918 commits

Author SHA1 Message Date
Markus Lehtonen
6cbed379df source/custom: implement matchAny directive
Implement a new 'matchAny' directive in the new rule format, building on
top of the previously implemented 'matchFeatures' matcher. MatchAny
applies a logical OR over multiple matchFeatures directives. That is, it
allows specifying multiple alternative matchers (at least one of which
must match) in a single label rule.

The configuration format for the new matchers is

  matchAny:
    - matchFeatures:
        - feature: <domain>.<feature>
          matchExpressions:
            <attribute>:
              op: <operator>
              value:
                - <list-of-values>
    - matchFeatures:
      ...

A configuration example. In order to require a CPU feature, a kernel
module and one of two specific PCI devices (making use of the shortform
notation):

  - name: multi-device-test
    labels:
      multi-device-feature: "true"
    matchFeatures:
      - feature: kernel.loadedmodule
        matchExpressions: [driver-module]
      - feature: cpu.cpuid
        matchExpressions: [AVX512F]
    matchAny:
      - matchFeatures:
          - feature: pci.device
            matchExpressions:
              vendor: "8086"
              device: "1234"
      - matchFeatures:
          - feature: pci.device
            matchExpressions:
              vendor: "8086"
              device: "abcd"
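
The AND/OR semantics described above can be sketched in Go as follows. This is a minimal illustration, not the project's implementation: the `Features`, `FeatureMatcher` and `Rule` types, the helper names, and the convention that an empty expected value means "exists" are all assumptions made for the example. It also assumes that the top-level matchFeatures and matchAny are combined with a logical AND, as the example rule suggests.

```go
package main

import "fmt"

// Features is a simplified, hypothetical model of discovered features:
// feature name -> instances -> attribute name -> value.
type Features map[string][]map[string]string

// FeatureMatcher is a simplified matchFeatures entry. An empty expected
// value means "attribute exists"; any other value must compare equal.
type FeatureMatcher struct {
	Feature string
	Attrs   map[string]string
}

// matches reports whether at least one instance of the feature satisfies
// every attribute requirement of the matcher.
func (m FeatureMatcher) matches(f Features) bool {
	for _, inst := range f[m.Feature] {
		ok := true
		for attr, want := range m.Attrs {
			got, exists := inst[attr]
			if !exists || (want != "" && got != want) {
				ok = false
				break
			}
		}
		if ok {
			return true
		}
	}
	return false
}

// matchAll is the logical AND over one matchFeatures list.
func matchAll(ms []FeatureMatcher, f Features) bool {
	for _, m := range ms {
		if !m.matches(f) {
			return false
		}
	}
	return true
}

// Rule pairs a top-level matchFeatures list with matchAny alternatives.
type Rule struct {
	MatchFeatures []FeatureMatcher
	MatchAny      [][]FeatureMatcher
}

// Match requires the top-level matchFeatures to match and, if matchAny is
// non-empty, at least one of its alternatives to match as well.
func (r Rule) Match(f Features) bool {
	if !matchAll(r.MatchFeatures, f) {
		return false
	}
	if len(r.MatchAny) == 0 {
		return true
	}
	for _, alt := range r.MatchAny {
		if matchAll(alt, f) {
			return true
		}
	}
	return false
}

// exampleRule mirrors the multi-device-test rule from the commit message.
func exampleRule() Rule {
	pci := func(device string) []FeatureMatcher {
		return []FeatureMatcher{{
			Feature: "pci.device",
			Attrs:   map[string]string{"vendor": "8086", "device": device},
		}}
	}
	return Rule{
		MatchFeatures: []FeatureMatcher{
			{Feature: "kernel.loadedmodule", Attrs: map[string]string{"driver-module": ""}},
			{Feature: "cpu.cpuid", Attrs: map[string]string{"AVX512F": ""}},
		},
		MatchAny: [][]FeatureMatcher{pci("1234"), pci("abcd")},
	}
}

func main() {
	node := Features{
		"kernel.loadedmodule": {{"driver-module": ""}},
		"cpu.cpuid":           {{"AVX512F": ""}},
		"pci.device":          {{"vendor": "8086", "device": "abcd"}},
	}
	fmt.Println(exampleRule().Match(node))
}
```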
2021-11-12 16:51:30 +02:00
Markus Lehtonen
e206f0b86b source/custom: implement generic feature matching
Implement generic feature matchers that cover all feature sources (that
implement the FeatureSource interface). The implementation relies on the
unified data model provided by the FeatureSource interface as well as
the generic expression-based rule processing framework that was added to
the source/custom/expression package.

With this patch any new features added will be automatically available
for custom rules, without any additional work. Rule hierarchy follows
the source/feature hierarchy by design.

This patch introduces a new format for custom rule specifications,
dropping the 'value' field and introducing new 'labels' field which
makes it possible to specify multiple labels per rule. Also, in the new
format the 'name' field is just for reference and no matching label is
created. The new generic rules are available in this new rule format
under a 'matchFeatures' field. MatchFeatures implements a logical AND over
an array of per-feature matchers - i.e. a match for all of the matchers
is required. The goal of the new rule format is to make it better follow
K8s API design guidelines and make it extensible for future enhancements
(e.g. addition of templating, taints, annotations, extended resources
etc).

The old rule format (with cpuID, kConfig, loadedKMod, nodename, pciID,
usbID rules) is still supported. The rule format (new vs. old) is
determined at config parsing time based on the existence of the
'matchOn' field.

The new rule format and the configuration format for the new
matchFeatures field is

  - name: <rule-name>
    labels:
      <key>: <value>
      ...
    matchFeatures:
      - feature: <domain>.<feature>
        matchExpressions:
          <attribute>:
            op: <operator>
            value:
              - <list-of-values>
      - feature: <domain>.<feature>
        ...

Currently, "cpu", "kernel", "pci", "system", "usb" and "local" sources
are covered by the matchers/feature selectors. Thus, the following
features are available for matching with this patch:

  - cpu.cpuid:
      <cpuid-flag>: <exists/does-not-exist>
  - cpu.cstate:
      enabled: <bool>
  - cpu.pstate:
      status: <string>
      turbo: <bool>
      scaling_governor: <string>
  - cpu.rdt:
      <rdt-feature>: <exists/does-not-exist>
  - cpu.sst:
      bf.enabled: <bool>
  - cpu.topology:
      hardware_multithreading: <bool>
  - kernel.config:
      <flag-name>: <string>
  - kernel.loadedmodule:
      <module-name>: <exists/does-not-exist>
  - kernel.selinux:
      enabled: <bool>
  - kernel.version:
      major: <int>
      minor: <int>
      revision: <int>
      full: <string>
  - system.osrelease:
      <key-name>: <string>
      VERSION_ID.major: <int>
      VERSION_ID.minor: <int>
  - system.name:
      nodename: <string>
  - pci.device:
      <device-instance>:
        class: <string>
        vendor: <string>
        device: <string>
        subsystem_vendor: <string>
        subsystem_device: <string>
        sriov_totalvfs: <int>
  - usb.device:
      <device-instance>:
        class: <string>
        vendor: <string>
        device: <string>
        serial: <string>
  - local.label:
      <label-name>: <string>

The configuration also supports some "shortforms" for convenience:

   matchExpressions: [<attr-1>, <attr-2>=<val-2>]
   ---
   matchExpressions:
     <attr-3>:
     <attr-4>: <val-4>

is equal to:

   matchExpressions:
     <attr-1>: {op: Exists}
     <attr-2>: {op: In, value: [<val-2>]}
   ---
   matchExpressions:
     <attr-3>: {op: Exists}
     <attr-4>: {op: In, value: [<val-4>]}

In other words:

  - feature: kernel.config
    matchExpressions: ["X86", "INIT_ENV_ARG_LIMIT=32"]
  - feature: pci.device
    matchExpressions:
      vendor: "8086"

is the same as:

  - feature: kernel.config
    matchExpressions:
      X86: {op: Exists}
      INIT_ENV_ARG_LIMIT: {op: In, value: ["32"]}
  - feature: pci.device
    matchExpressions:
      vendor: {op: In, value: ["8086"]}
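
The list-style shortform expansion described above could look roughly like this in Go. It is only a sketch: the `Expression` type and the `expandShortform` helper are hypothetical names chosen for the example, and the splitting on the first `=` is an assumption about how `<attr>=<val>` entries are parsed.

```go
package main

import (
	"fmt"
	"strings"
)

// Expression mirrors the op/value pair of the long-form syntax.
// (Hypothetical type, for illustration only.)
type Expression struct {
	Op    string
	Value []string
}

// expandShortform converts list-style shortform entries ("attr" or
// "attr=val") into long-form expressions keyed by attribute name.
func expandShortform(items []string) map[string]Expression {
	out := make(map[string]Expression, len(items))
	for _, item := range items {
		// Split on the first '=' only, so values may contain '='.
		parts := strings.SplitN(item, "=", 2)
		if len(parts) == 2 {
			out[parts[0]] = Expression{Op: "In", Value: []string{parts[1]}}
		} else {
			out[parts[0]] = Expression{Op: "Exists"}
		}
	}
	return out
}

func main() {
	exprs := expandShortform([]string{"X86", "INIT_ENV_ARG_LIMIT=32"})
	for name, e := range exprs {
		fmt.Printf("%s: {op: %s, value: %v}\n", name, e.Op, e.Value)
	}
}
```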

Some configuration examples below. In order to match a CPUID feature the
following snippet can be used:

  - name: cpu-test-1
    labels:
      cpu-custom-feature: "true"
    matchFeatures:
      - feature: cpu.cpuid
        matchExpressions:
          AESNI: {op: Exists}
          AVX: {op: Exists}

In order to match against a loaded kernel module and OS version:

  - name: kernel-test-1
    labels:
      kernel-custom-feature: "true"
    matchFeatures:
      - feature: kernel.loadedmodule
        matchExpressions:
          e1000: {op: Exists}
      - feature: system.osrelease
        matchExpressions:
          NAME: {op: InRegexp, value: ["^openSUSE"]}
          VERSION_ID.major: {op: Gt, value: ["14"]}

In order to require a kernel module and both of two specific PCI devices:

  - name: multi-device-test
    labels:
      multi-device-feature: "true"
    matchFeatures:
      - feature: kernel.loadedmodule
        matchExpressions:
          driver-module: {op: Exists}
      - feature: pci.device
        matchExpressions:
          vendor: "8086"
          device: "1234"
      - feature: pci.device
        matchExpressions:
          vendor: "8086"
          device: "abcd"
2021-11-12 16:51:13 +02:00
Kubernetes Prow Robot
cfc1c82746
Merge pull request #639 from marquiz/devel/matchexpression-rules
source/custom: expression based label rules
2021-11-12 06:20:28 -08:00
Kubernetes Prow Robot
52ba742d69
Merge pull request #648 from uniemimu/yatypofix
More topology updater documentation typo fixes
2021-11-12 04:32:27 -08:00
Ukri Niemimuukko
90598d3b5a More topology updater documentation typo fixes
Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2021-11-12 14:25:32 +02:00
Kubernetes Prow Robot
b26a12cc17
Merge pull request #640 from eliaskoromilas/worker-config
deployment: Implicitly generate the worker ConfigMap name
2021-11-12 03:38:28 -08:00
Kubernetes Prow Robot
8351887465
Merge pull request #645 from uniemimu/typofix
Topology-updater introduction typo fix
2021-11-12 02:44:28 -08:00
Ukri Niemimuukko
0a2e3bb18d Topology-updater introduction typo fix
Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2021-11-12 12:10:33 +02:00
Markus Lehtonen
689703be48 source/custom: implement 'GtLt' operator
A new operator for checking that an input (integer) is between two
values.
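
A range check of this kind can be sketched as below. The function name `matchGtLt` and its signature are hypothetical, and treating both bounds as exclusive (greater than the first value, less than the second) is an assumption of this sketch rather than something the commit message states.

```go
package main

import (
	"fmt"
	"strconv"
)

// matchGtLt reports whether input, parsed as an integer, lies strictly
// between lo and hi. Exclusive bounds are an assumption of this sketch.
func matchGtLt(input string, lo, hi int) (bool, error) {
	if lo >= hi {
		return false, fmt.Errorf("invalid range: %d is not less than %d", lo, hi)
	}
	v, err := strconv.Atoi(input)
	if err != nil {
		return false, fmt.Errorf("input %q is not an integer: %w", input, err)
	}
	return v > lo && v < hi, nil
}

func main() {
	ok, err := matchGtLt("5", 1, 10)
	fmt.Println(ok, err)
}
```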
2021-11-11 19:59:34 +02:00
Markus Lehtonen
8b4314bbbb source/custom: expression based label rules
Implement a framework for more flexible rule configuration and matching,
mimicking the MatchExpressions pattern from K8s nodeselector.

The basic building block is MatchExpression which contains an operator
and a list of values. The operator specifies the "function" that is
applied when evaluating a given input against the list of values.
Available operators are:

- MatchIn
- MatchNotIn
- MatchInRegexp
- MatchExists
- MatchDoesNotExist
- MatchGt
- MatchLt
- MatchIsTrue
- MatchIsFalse

Another building block of the framework is MatchExpressionSet which is a
map of string-MatchExpression pairs. It is a helper for specifying
multiple expressions that can be matched against a set of
features.

This patch converts all existing custom rules to utilize the new
expression-based framework.
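
To illustrate the building block, here is a minimal Go sketch of an expression with an operator and a list of values. The constant names follow the operator list above, but their string values, the `Match` signature, and the handling of absent inputs are assumptions made for this example and may differ from the actual source/custom/expression package.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Operator names follow the list above; the string values are assumed to
// be the spellings used in configuration (e.g. "op: In").
const (
	MatchIn           = "In"
	MatchNotIn        = "NotIn"
	MatchInRegexp     = "InRegexp"
	MatchExists       = "Exists"
	MatchDoesNotExist = "DoesNotExist"
	MatchGt           = "Gt"
	MatchLt           = "Lt"
	MatchIsTrue       = "IsTrue"
	MatchIsFalse      = "IsFalse"
)

// MatchExpression holds an operator and the values it is evaluated against.
type MatchExpression struct {
	Op    string
	Value []string
}

// Match evaluates the expression against one input. valid tells whether
// the input attribute was present at all. In this sketch an absent input
// only ever matches DoesNotExist.
func (m MatchExpression) Match(valid bool, input string) (bool, error) {
	switch m.Op {
	case MatchExists:
		return valid, nil
	case MatchDoesNotExist:
		return !valid, nil
	}
	if !valid {
		return false, nil
	}
	switch m.Op {
	case MatchIn, MatchNotIn:
		found := false
		for _, v := range m.Value {
			if v == input {
				found = true
				break
			}
		}
		return found == (m.Op == MatchIn), nil
	case MatchInRegexp:
		for _, pattern := range m.Value {
			ok, err := regexp.MatchString(pattern, input)
			if err != nil {
				return false, err
			}
			if ok {
				return true, nil
			}
		}
		return false, nil
	case MatchGt, MatchLt:
		if len(m.Value) != 1 {
			return false, fmt.Errorf("%s expects exactly one value", m.Op)
		}
		in, err := strconv.Atoi(input)
		if err != nil {
			return false, err
		}
		ref, err := strconv.Atoi(m.Value[0])
		if err != nil {
			return false, err
		}
		if m.Op == MatchGt {
			return in > ref, nil
		}
		return in < ref, nil
	case MatchIsTrue:
		return input == "true", nil
	case MatchIsFalse:
		return input == "false", nil
	}
	return false, fmt.Errorf("unsupported operator %q", m.Op)
}

func main() {
	e := MatchExpression{Op: MatchInRegexp, Value: []string{"^openSUSE"}}
	fmt.Println(e.Match(true, "openSUSE Leap"))
}
```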
2021-11-11 19:59:34 +02:00
Kubernetes Prow Robot
5299ca2ab4
Merge pull request #604 from marquiz/devel/feature-source-conversion
source: implement FeatureSource interface
2021-11-11 09:20:08 -08:00
Markus Lehtonen
a91f3325ba source/local: implement FeatureSource
Separate feature discovery (i.e. running hooks and reading feature
files) and creation of feature labels in the local source.

Also, add minimalist unit test.
2021-11-11 18:34:01 +02:00
Markus Lehtonen
e225f4aad0 source/system: implement FeatureSource
Separate feature discovery and creation of feature labels in the system
source.

Also, change the implementation of the nodeName custom rule to utilize
the FeatureSource interface of the system source.

Also, add minimalist unit test.
2021-11-11 18:33:58 +02:00
Markus Lehtonen
5cf25dc4e9 source/custom: move kernel module detection to source/kernel
Move the functionality responsible for detection of loaded kernel
modules from source/custom over to the source/kernel package. Add a new
"loadedmodule" raw feature to the kernel source to store this
information.

Change loadedKmod custom rule to utilize kernel source.
2021-11-11 18:33:58 +02:00
Markus Lehtonen
df27327f14 source/usb: implement FeatureSource
Separate feature discovery and creation of feature labels in the usb
source.

Move usb_utils from source/internal to the source/usb package. Change
the implementation of the UsbID custom rule to utilize the FeatureSource
interface of the usb source.

Also, add minimalist unit test.
2021-11-11 18:33:53 +02:00
Markus Lehtonen
af0c683f60 source/pci: implement FeatureSource
Separate feature discovery and creation of feature labels in the pci
source.

Move pci_utils from source/internal to the source/pci package. Change
the implementation of the PciID custom rule to utilize the FeatureSource
interface of the pci source.

Also, add minimalist unit test.
2021-11-11 18:33:53 +02:00
Markus Lehtonen
03bf94a8ad source/cpu: implement FeatureSource
Convert the cpu source to do feature discovery and creation of feature
labels separately.

Move cpuidutils from source/internal to the source/cpu package. Change
the cpuid custom rule to utilize GetFeatures of the cpu source.

Also, add minimalist unit test.
2021-11-11 18:33:40 +02:00
Markus Lehtonen
0945019161 source/kernel: implement FeatureSource
Separate feature discovery and creation of feature labels in the kernel
source.

Move kernelutils from source/internal back to the source/kernel package.
Change the kconfig custom rule to rely on the FeatureSource interface
(GetFeatures()) of the kernel source.

Also, add minimalist unit test.
2021-11-11 18:33:40 +02:00
Markus Lehtonen
dd92c9a9ce pkg/api/feature: revert back to structs instead of pointers
Less error prone, as no chance for a nil pointer dereference.
2021-11-11 17:56:55 +02:00
Kubernetes Prow Robot
67330e1441
Merge pull request #644 from marquiz/devel/e2e-boot-mount
test/e2e: drop /boot mount
2021-11-10 11:13:27 -08:00
Kubernetes Prow Robot
54b5e43b2c
Merge pull request #643 from marquiz/devel/e2e-single-node
test/e2e: make e2e tests run on single-node cluster
2021-11-10 11:07:27 -08:00
Markus Lehtonen
261ab113bf test/e2e: drop /boot mount
This is not currently needed by end-to-end tests. Dropping it enables
testing in restricted environments that don't have /boot directory.
2021-11-10 20:58:25 +02:00
Markus Lehtonen
0161bd5ca4 test/e2e: make e2e tests run on single-node cluster
Lift the restriction to run custom rule tests on a non-master node. Try to
find one but do not fail if that fails. Makes the end-to-end tests
runnable on single-node clusters such as simple minikube deployments.
2021-11-10 20:33:55 +02:00
Kubernetes Prow Robot
d957a9e4fb
Merge pull request #642 from marquiz/devel/feature-api
pkg/api/feature: small improvements
2021-11-09 05:39:47 -08:00
Markus Lehtonen
9bff4b3185 pkg/api/feature: generator functions with initial values
Flavor the generator helper functions with arguments for specifying the
set of features to put into the generated objects.
2021-11-09 13:40:35 +02:00
Markus Lehtonen
5de4d8857c pkg/api/feature: use pointers of structs
Make it easier to mutate the feature sets.
2021-11-09 12:15:38 +02:00
Kubernetes Prow Robot
30f641847e
Merge pull request #641 from marquiz/fixes/typo
pkg/resourcemonitor: fix typo in comment
2021-11-05 08:07:53 -07:00
Markus Lehtonen
25711799f3 pkg/resourcemonitor: fix typo in comment 2021-11-05 16:42:49 +02:00
Kubernetes Prow Robot
376174ff41
Merge pull request #593 from cynepco3hahue/from_swatisehgal_add_memory_information_under_resource
resourcemonitor: aggregate and provide the memory and hugepages information
2021-11-05 07:27:52 -07:00
Artyom Lukianov
45062754fd resourcemonitor: aggregate and provide the memory and hugepages information
The Kubernetes pod resource API now exposes the memory and hugepages information
for guaranteed pods. We can use this information to update NodeResourceTopology
resource with memory and hugepages data.

Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2021-11-04 10:17:10 +02:00
Artyom Lukianov
a93b660f7c utils: add methods to fetch NUMA nodes hugepages and memory capacity
The methods are used during calculation of reserved memory for system workloads.
The calculation is `resourceCapacity - resourceAllocatable`.
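
As a rough illustration of that subtraction applied per NUMA node, consider the sketch below. The `reservedMemory` function, its map-based signature and the byte values in `main` are hypothetical and are not taken from the utils package touched by this commit.

```go
package main

import "fmt"

// reservedMemory returns capacity - allocatable for every NUMA node ID
// found in capacity. (Hypothetical helper, for illustration only.)
func reservedMemory(capacity, allocatable map[int]uint64) (map[int]uint64, error) {
	reserved := make(map[int]uint64, len(capacity))
	for node, c := range capacity {
		a, ok := allocatable[node]
		if !ok {
			return nil, fmt.Errorf("no allocatable value for NUMA node %d", node)
		}
		// Guard against unsigned underflow on inconsistent inputs.
		if a > c {
			return nil, fmt.Errorf("NUMA node %d: allocatable %d exceeds capacity %d", node, a, c)
		}
		reserved[node] = c - a
	}
	return reserved, nil
}

func main() {
	// Made-up example values: 16 GiB capacity per node, 1 GiB reserved on node 0.
	capacity := map[int]uint64{0: 16 << 30, 1: 16 << 30}
	allocatable := map[int]uint64{0: 15 << 30, 1: 16 << 30}
	reserved, err := reservedMemory(capacity, allocatable)
	fmt.Println(reserved, err)
}
```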

Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2021-11-04 10:14:51 +02:00
Elias Koromilas
e22b937391 Implicitly generate the worker ConfigMap name
Signed-off-by: Elias Koromilas <elias.koromilas@gmail.com>
2021-11-03 11:21:58 +02:00
Kubernetes Prow Robot
347b16daea
Merge pull request #526 from k8stopologyawareschedwg/topology-updater-documentation
Documentation capturing enablement of NFD-Topology-Updater in NFD
2021-10-29 04:54:50 -07:00
Swati Sehgal
ab62172a8d Documentation capturing enablement of NFD-Topology-Updater in NFD
Prior to this feature, NFD consisted of only two software components, namely
nfd-master and nfd-worker. We have introduced another software component
called nfd-topology-updater.

NFD-Topology-Updater is a daemon responsible for examining allocated resources
on a worker node to account for allocatable resources on a per-zone basis (where
a zone can be a NUMA node). It then communicates the information to nfd-master
which does the CRD creation corresponding to all the nodes in the cluster. One
instance of nfd-topology-updater is supposed to be running on each node of the
cluster.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-10-29 10:14:38 +01:00
Kubernetes Prow Robot
3e9c13a858
Merge pull request #635 from marquiz/documentation/kubectl
docs: mention minimum required kubectl version
2021-10-26 08:27:30 -07:00
Markus Lehtonen
9e9ff951b2 docs: mention minimum required kubectl version
Kubectl prior to v1.21 contains too old a version of kustomize for our
(kustomize-based) deployment to work.
2021-10-26 18:01:25 +03:00
Kubernetes Prow Robot
661d326458
Merge pull request #623 from zwpaper/master
deployment: add topology updater helm chart
2021-10-25 23:39:32 -07:00
Wei Zhang
158a5590ab deployment: add topology updater helm chart
Signed-off-by: Wei Zhang <kweizh@gmail.com>
2021-10-26 10:52:40 +08:00
Elias Koromilas
c17a898c4c
deployment: Simplify NFD worker configuration in Helm (#627)
* Simplify NFD worker service configuration in Helm

Signed-off-by: Elias Koromilas <elias.koromilas@gmail.com>

* Update docs/get-started/deployment-and-usage.md

Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>

Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
2021-10-25 09:34:23 -07:00
Kubernetes Prow Robot
83f69e3f92
Merge pull request #632 from marquiz/devel/gofmt
Makefile: let gofmt-verify write changes back to files
2021-10-22 06:14:37 -07:00
Kubernetes Prow Robot
d3d98d86ef
Merge pull request #631 from marquiz/fixes/gofmt
source: fix gofmt errors
2021-10-22 04:34:37 -07:00
Markus Lehtonen
5c0d98e07a Makefile: let gofmt-verify write changes back to files
Let gofmt write the result (suggested changes) back to the source files,
instead of just printing out the diff. Reduces manual work.
2021-10-22 12:03:24 +03:00
Markus Lehtonen
4cfb3203f6 source: fix gofmt errors
The tool got pickier with golang v1.17.
2021-10-22 12:01:31 +03:00
Kubernetes Prow Robot
bf8a1a217a
Merge pull request #629 from marquiz/devel/go-117
Bump to golang v1.17
2021-10-21 04:32:10 -07:00
Markus Lehtonen
9d1eea243b Bump to golang v1.17 2021-10-21 14:16:55 +03:00
Kubernetes Prow Robot
f46b4e0e03
Merge pull request #628 from marquiz/devel/helm-fix
deployment/helm: don't force sleep-interval in worker cmdline flags
2021-10-21 01:54:09 -07:00
Markus Lehtonen
890d9455f1 deployment/helm: don't force sleep-interval in worker cmdline flags
Drop --sleep-interval from the template. We really don't want to force
it. First, it's the default value so no use repeating that in the
template. And more importantly, the commandline flag will override
anything that will be provided in the worker config file, making it
impossible for users to specify the sleep interval (other than by
editing the template directly).
2021-10-21 11:33:19 +03:00
Kubernetes Prow Robot
93a0a9f14a
Merge pull request #624 from marquiz/docs/jekyll-theme
docs: update dependencies
2021-10-19 07:11:03 -07:00
Kubernetes Prow Robot
6f0948efc5
Merge pull request #625 from Tal-or/fix_klog
topology-updater:fix klog initialization
2021-10-12 03:19:47 -07:00
Talor Itzhak
674720e922 topology-updater:fix klog initialization
We should use the same flag set for both program and klog arguments.
Otherwise we won't be able to provide klog flags properly

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2021-10-11 21:36:54 +03:00