Simplify the code and reduce possible error scenarios by dropping
fsnotify-based reconfiguration from nfd-master and nfd-worker. Also
eliminates repeated re-configuration in scenarios where kubelet
continuosly touches the (every minute) mounted file (configmap) on the
filesystem.
Also modifies the Helm and kustomize deployments so that nfd-master,
nfd-worker and nfd-topology-updater pods are restarted on configmap
updates. In kustomize, the slght downside of this is the name of the
config map(s) depends on the content, so every time a user customizes
the config data, the old unused configmap will be left and must be
garbage-collected manually.
The feature gate is locked to true. That is, it is not possible to revert
back to the gPRC-based communication which makes the gRPC API ready for
removal.
Disable AVX10 as unnecessary as AVX10_LEVEL is better suited for
checking AVX10 compatibility. There is not yet any hardware with the
feature so disabling it shouldn't cause problems for users.
Add new cpuid label "feature.node.kubernetes.io/cpu-cpuid.AVX10_VERSION"
that advertises the supported version of AVX10 vector ISA.
Correspondingly, the patch adds AVX10_VERSION to the "cpu.cpuid" feature
for NodeFeatureRules to consume.
This makes cpu.cpuid on amd64 architecture a "multi-type" feature in
that it contains "flags" and potentially also "attributes" (the only
cpuid attribute so far is the AVX10_VERSION).
This patch changes the handling of NodeFeatureRules so that one feature
name (say "cpu.cpuid") can hold different types of features (flags,
attributes and/or instances). Requiring features to choose one single
type has not been a limitation of the API itself (and there has been no
validation on this) but an implementation decision.
The new evalutation logic of match expressions is such that "flags" and
"attributes" are basically evaluated as an union - they are both maps
but "flags" just don't have any value associated with the key. However,
"instances" are handled separately as that is basically an array of
maps and needs to be evaluated in a different way (loop over the array
of instances and evaluate expressions against the attributes of each).
Because of this difference care must be taken if mixing "instance"
features with "flag" and/or "attribute" features.
Note that the API types or their validation is not changed - just the
implementation of how the NodeFeatureRules are evaluated.
The NodeFeatureGroup is an NFD-specific custom resource that is designed for
grouping nodes based on their features. NFD-Master watches for NodeFeatureGroup
objects in the cluster and updates the status of the NodeFeatureGroup object
with the list of nodes that match the feature group rules. The NodeFeatureGroup
rules follow the same syntax as the NodeFeatureRule rules.
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Extend the format of feature matcher terms (the elements of the
arrayspecified under under matchFeatures field) with new matchName
field. The value of this field is an expression that is evaluated
against the names of feature elements instead of their values (values
are matched with the matchExpressions field, instead).
The matchName field is useful e.g. in template rules for creating
per-feature-element labels based on feature names (instead of values)
and in non-template rules for checking if (at least) one of certain
feature element names are present.
If both matchExpressions and matchName for certain feature matcher term
is specified, they both must match in order to get an overall match.
Also, in this case the list of matched features (used in templating) is
the union of the results from matchExpressions and matchName.
An example of creating an "avx512" label if any AVX512* CPUID feature is
present:
- name: "avx wildcard rule"
labels:
avx512: "true"
matchFeatures:
- feature: cpu.cpuid
matchName: {op: InRegexp, value: ["^AVX512"]}
An example of a template rule creating a dynamic set of labels based on
the existence of certain kconfig options.
- name: "kconfig template rule"
labelsTemplate: |
{{ range .kernel.config }}kconfig-{{ .Name }}={{ .Value }}
{{ end }}
matchFeatures:
- feature: kernel.config
matchName: {op: In, value: ["SWAP", "X86", "ARM"]}
NOTE: this patch changes the corner case of nil/null match expressions
with instance features (i.e. "matchExpressions: null"). Previously, we
returned all instances for templating but now a nil match expression is
not evaluated and no instances for templating are returned.