Separate feature discovery and creation of feature labels.
Generalize the discovery of nvdimm devices so that they can be matched
in custom label rules in a similar fashion as pci and usb devices.
Available attributes for matching nvdimm devices are limited to:
- devtype
- mode
For numa we now detect the number of numa nodes which can be matched
agains in custom label rules.
Labels created by the memory feature source are unchanged. The new
features being detected are available in custom rules only.
Example custom rule:
- name: "my memory rule"
labels:
my-memory-feature: "true"
matchFeatures:
- feature: memory.numa
matchExpressions:
"node_count": {op: Gt, value: ["3"]}
- feature: memory.nv
matchExpressions:
"devtype" {op: In, value: ["nd_dax"]}
Also, add minimalist unit test.
Separate feature discovery and creation of feature labels. Generalize
the feature discovery so that network devices can be matched in custom
label rules in a similar fashion as pci and usb devices. Available
attributes for matching are:
- operstate
- speed
- sriov_numvfs
- sriov_totalvfs
Labels created by the network feature source are unchanged. The new
features being detected are available in custom rules only.
Example custom rule:
- name: "my network rule"
labels:
my-network-feature: "true"
matchFeatures:
- feature: network.device
matchExpressions:
"operstate": { op: In, value: ["up"] }
"sriov_numvfs": { op: Gt, value: ["9"] }
Also, add minimalist unit test.
Enable Custom Resource based label creation in nfd-master. This extends
the previously implemented controller stub for watching NodeFeatureRule
objects. NFD-master watches NodeFeatureRule objects in the cluster and
processes the rules on every incoming labeling request from workers.
The functionality relies on the "raw features" (identical to how
nfd-worker handles custom rules) submitted by nfd-worker, making it
independent of the label source configuration of the worker. This means
that the labeling functions as expected even if all sources in the
worker are disabled.
NOTE: nfd-master is stateless and re-labeling only happens on the
reception of SetLabelsRequest from the worker – i.e. on intervals
specified by the core.sleepInterval configuration option (or
-sleep-interval cmdline flag) of each nfd-worker instance. This means
that modification/creation of NodeFeatureRule objects does not
automatically update the node labels. Instead, the changes only come
visible when workers send their labeling requests.
Add a new command line flag for disabling/enabling the controller for
NodeFeatureRule objects. In practice, disabling the controller disables
all labels generated from rules in NodeFeatureRule objects.
Implement a simple controller stub that operates on NodeFeatureRule
objects. The controller does not yet have any functionality other than
logging changes in the (NodeFeatureRule) objecs it is watching.
Also update the documentation on the -no-publish flag to match the new
functionality.
Separate feature discovery and creation of feature labels. Generalize
the feature discovery so that block devices can be matched in custom
label rules in a similar fashion as pci and usb devices. This extends
the discovery to other block queue attributes than 'rotational': now we
also detect 'dax', 'nr_zones' and 'zoned'.
Labels created by the storage feature source are unchanged. The new
features being detected are available in custom rules only.
Example custom rules:
- name: "my block rule 1"
labels:
my-block-feature-1: "true"
matchFeatures:
- feature: storage.block
"rotational": {op: In, value: ["0"]}
- name: "my block rule 2"
labels:
my-block-feature-2: "true"
matchFeatures:
- feature: storage.block
"zoned": {op: In, value: [“host-aware”, “host-managed”]}
Also, add minimalist unit test.
Add auto-generated code for interfacing our CRD API. On top of this, a
CR controller can be implemented. This patch uses k8s/code-generator
for code generation. Run "make generate" in order to (re-)generate
everything. Path to the code-generator repository may need to be
specified:
K8S_CODE_GENERATOR=path/to/code-generator make apigen
Code-generator version 0.20.7 was used to create this patch. Install
k8s code-generator tools and clone the repo with:
git clone https://github.com/kubernetes/code-generator -b v0.20.7 <path/to/code-generator>
go install k8s.io/code-generator/cmd/...(at)v0.20.7
Move the rule processing of matchFeatures and matchAny from
source/custom package over to pkg/apis/nfd, aiming for better integrity
and re-usability of the code. Does not change the CRD API as such, just
adds more supportive functions.
Having a dedicated type makes it possible to specify deepcopy functions
for it. We need to do this manually as deepcopy-gen doesn't know how to
create copies of regexps.
Add a cluster-scoped Custom Resource Definition for specifying labeling
rules. Nodes (node features, node objects) are cluster-level objects and
thus the natural and encouraged setup is to only have one NFD deployment
per cluster - the set of underlying features of the node stays the same
independent of how many parallel NFD deployments you have. Our extension
points (hooks, feature files and now CRs) can be be used by multiple
actors (depending on us) simultaneously. Having the CRD cluster-scoped
hopefully drives deployments in this direction. It also should make
deployment of vendor-specific labeling rules easy as there is no need to
worry about the namespace.
This patch virtually replicates the source.custom.FeatureSpec in a CRD
API (located in the pkg/apis/nfd/v1alpha1 package) with the notable
exception that "MatchOn" legacy rules are not supported. Legacy rules
are left out in order to keep the CRD simple and clean.
The duplicate functionality in source/custom will be dropped by upcoming
patches.
This patch utilizes controller-gen (from sigs.k8s.io/controller-tools)
for generating the CRD and deepcopy methods. Code can be (re-)generated
with "make generate". Install controller-gen with:
go install sigs.k8s.io/controller-tools/cmd/controller-gen@v0.7.0
Update kustomize and helm deployments to deploy the CRD.
Create a new package pkg/apis/nfd/v1alpha1 and migrate the custom rule
expressions over there. This is the first step in creating a new CRD API
for custom rules.
Enable transmitting the discovered "raw" features over the gRPC API.
Extend pkg/api/feature with protobuf and gRPC code. In this, utilize
go-to-protobuf from k8s code-generator for auto-generating the gRPC
interface from golang code. The tool can be Installed with:
go install k8s.io/code-generator/cmd/go-to-protobuf@v0.20.7
The auto-generated code is (re-)generated/updated with "make apigen".
The NodeResourceTopology API has been made cluster
scoped as in the current context a CR corresponds to
a Node and since Node is a cluster scoped resource it
makes sense to make NRT cluster scoped as well.
Ref: https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/18
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Print out the result of applying an expression. Also, truncate the
output to max 10 elements (of items matched against) unless '-v 4'
verbosity level is in use.
Implement a new 'matchAny' directive in the new rule format, building on
top of the previously implemented 'matchFeatures' matcher. MatchAny
applies a logical OR over multiple matchFeatures directives. That is, it
allows specifying multiple alternative matchers (at least one of which
must match) in a single label rule.
The configuration format for the new matchers is
matchAny:
- matchFeatures:
- feature: <domain>.<feature>
matchExpressions:
<attribute>:
op: <operator>
value:
- <list-of-values>
- matchFeatures:
...
A configuration example. In order to require a cpu feature, kernel
module and one of two specific PCI devices (taking use of the shortform
notation):
- name: multi-device-test
labels:
multi-device-feature: "true"
matchFeatures:
- feature: kernel.loadedmodule
matchExpressions: [driver-module]
- feature: cpu.cpuid
matchExpressions: [AVX512F]
matchAny:
- matchFeatures:
- feature; pci.device
matchExpressions:
vendor: "8086"
device: "1234"
- matchFeatures:
- feature: pci.device
matchExpressions:
vendor: "8086"
device: "abcd"
Implement generic feature matchers that cover all feature sources (that
implement the FeatureSource interface). The implementation relies on the
unified data model provided by the FeatureSource interface as well as
the generic expression-based rule processing framework that was added to
the source/custom/expression package.
With this patch any new features added will be automatically available
for custom rules, without any additional work. Rule hierarchy follows
the source/feature hierarchy by design.
This patch introduces a new format for custom rule specifications,
dropping the 'value' field and introducing new 'labels' field which
makes it possible to specify multiple labels per rule. Also, in the new
format the 'name' field is just for reference and no matching label is
created. The new generic rules are available in this new rule format
under a 'matchFeatures. MatchFeatures implements a logical AND over
an array of per-feature matchers - i.e. a match for all of the matchers
is required. The goal of the new rule format is to make it better follow
K8s API design guidelines and make it extensible for future enhancements
(e.g. addition of templating, taints, annotations, extended resources
etc).
The old rule format (with cpuID, kConfig, loadedKMod, nodename, pciID,
usbID rules) is still supported. The rule format (new vs. old) is
determined at config parsing time based on the existence of the
'matchOn' field.
The new rule format and the configuration format for the new
matchFeatures field is
- name: <rule-name>
labels:
<key>: <value>
...
matchFeatures:
- feature: <domain>.<feature>
matchExpressions:
<attribute>:
op: <operator>
value:
- <list-of-values>
- feature: <domain>.<feature>
...
Currently, "cpu", "kernel", "pci", "system", "usb" and "local" sources
are covered by the matshers/feature selectors. Thus, the following
features are available for matching with this patch:
- cpu.cpuid:
<cpuid-flag>: <exists/does-not-exist>
- cpu.cstate:
enabled: <bool>
- cpu.pstate:
status: <string>
turbo: <bool>
scaling_governor: <string>
- cpu.rdt:
<rdt-feature>: <exists/does-not-exist>
- cpu.sst:
bf.enabled: <bool>
- cpu.topology:
hardware_multithreading: <bool>
- kernel.config:
<flag-name>: <string>
- kernel.loadedmodule:
<module-name>: <exists/does-not-exist>
- kernel.selinux:
enabled: <bool>
- kernel.version:
major: <int>
minor: <int>
revision: <int>
full: <string>
- system.osrelease:
<key-name>: <string>
VERSION_ID.major: <int>
VERSION_ID.minor: <int>
- system.name:
nodename: <string>
- pci.device:
<device-instance>:
class: <string>
vendor: <string>
device: <string>
subsystem_vendor: <string>
susbystem_device: <string>
sriov_totalvfs: <int>
- usb.device:
<device-instance>:
class: <string>
vendor: <string>
device: <string>
serial: <string>
- local.label:
<label-name>: <string>
The configuration also supports some "shortforms" for convenience:
matchExpressions: [<attr-1>, <attr-2>=<val-2>]
---
matchExpressions:
<attr-3>:
<attr-4>: <val-4>
is equal to:
matchExpressions:
<attr-1>: {op: Exists}
<attr-2>: {op: In, value: [<val-2>]}
---
matchExpressions:
<attr-3>: {op: Exists}
<attr-4>: {op: In, value: [<val-4>]}
In other words:
- feature: kernel.config
matchExpressions: ["X86", "INIT_ENV_ARG_LIMIT=32"]
- feature: pci.device
matchExpressions:
vendor: "8086"
is the same as:
- feature: kernel.config
matchExpressions:
X86: {op: Exists}
INIT_ENV_ARG_LIMIT: {op: In, values: ["32"]}
- feature: pci.device
matchExpressions:
vendor: {op: In, value: ["8086"]
Some configuration examples below. In order to match a CPUID feature the
following snippet can be used:
- name: cpu-test-1
labels:
cpu-custom-feature: "true"
matchFeatures:
- feature: cpu.cpuid
matchExpressions:
AESNI: {op: Exists}
AVX: {op: Exists}
In order to match against a loaded kernel module and OS version:
- name: kernel-test-1
labels:
kernel-custom-feature: "true"
matchFeatures:
- feature: kernel.loadedmodule
matchExpressions:
e1000: {op: Exists}
- feature: system.osrelease
matchExpressions:
NAME: {op: InRegexp, values: ["^openSUSE"]}
VERSION_ID.major: {op: Gt, values: ["14"]}
In order to require a kernel module and both of two specific PCI devices:
- name: multi-device-test
labels:
multi-device-feature: "true"
matchFeatures:
- feature: kernel.loadedmodule
matchExpressions:
driver-module: {op: Exists}
- pci.device:
vendor: "8086"
device: "1234"
- pci.device:
vendor: "8086"
device: "abcd"
Implement a framework for more flexible rule configuration and matching,
mimicking the MatchExpressions pattern from K8s nodeselector.
The basic building block is MatchExpression which contains an operator
and a list of values. The operator specifies that "function" that is
applied when evaluating a given input agains the list of values.
Available operators are:
- MatchIn
- MatchNotIn
- MatchInRegexp
- MatchExists
- MatchDoesNotExist
- MatchGt
- MatchLt
- MatchIsTrue
- MatchIsFalse
Another building block of the framework is MatchExpressionSet which is a
map of string-MatchExpression pairs. It is a helper for specifying
multiple expressions that can be matched against a set of set of
features.
This patch converts all existing custom rules to utilize the new
expression-based framework.