Implement a framework for more flexible rule configuration and matching,
mimicking the MatchExpressions pattern from K8s nodeselector.
The basic building block is MatchExpression which contains an operator
and a list of values. The operator specifies that "function" that is
applied when evaluating a given input agains the list of values.
Available operators are:
- MatchIn
- MatchNotIn
- MatchInRegexp
- MatchExists
- MatchDoesNotExist
- MatchGt
- MatchLt
- MatchIsTrue
- MatchIsFalse
Another building block of the framework is MatchExpressionSet which is a
map of string-MatchExpression pairs. It is a helper for specifying
multiple expressions that can be matched against a set of set of
features.
This patch converts all existing custom rules to utilize the new
expression-based framework.
Separate feature discovery (i.e. running hooks and reading feature
files) and creation of feature labels in the local source.
Also, add minimalist unit test.
Separate feature discovery and creation of feature labels in the system
source.
Also, change the implementation of the nodeName custom rule to utilize
the FeatureSource interface of the system source.
Also, add minimalist unit test.
Move the functionality responsible for detection of loeaded kernel
modules from source/custom over to the source/kernel package. Add a new
"loadedmodule" raw feature to the kernel source to store this
information.
Change loadedKmod custom rule to utilize kernel source.
Separate feature discovery and creation of feature labels in the usb
source.
Move usb_utils from source/internal to the source/usb package. Change
the implementation of the UsbID custom rule to utilize the FeatureSource
interface of the usb source.
Also, add minimalist unit test.
Separate feature discovery and creation of feature labels in the pci
source.
Move pci_utils from source/internal to the source/pci package. Change
the implementation of the PciID custom rule to utilize the FeatureSource
interface of the pci source.
Also, add minimalist unit test.
Convert the cpu source to do feature discovery and creation of feature
labels separately.
Move cpuidutils from source/internal to the source/cpu package. Change
the cpuid custom rule to utilize GetFeatures of the cpu source.
Also, add minimalist unit test.
Separate feature discovery and creation of feature labels in the kernel
source.
Move kernelutils from source/internal back to the source/kernel package.
Change the kconfig custom rule to rely on the FeatureSource interface
(GetFeatures()) of the kernel source.
Also, add minimalist unit test.
Use 'go generate' for auto-generating code. Drop the old 'mock' and
'apigen' makefile targets. Those are replaced with a single
make generate
which (re-)generates everything.
- This patch allows to expose Resource Hardware Topology information
through CRDs in Node Feature Discovery.
- In order to do this we introduce another software component called
nfd-topology-updater in addition to the already existing software
components nfd-master and nfd-worker.
- nfd-master was enhanced to communicate with nfd-topology-updater
over gRPC followed by creation of CRs corresponding to the nodes
in the cluster exposing resource hardware topology information
of that node.
- Pin kubernetes dependency to one that include pod resource implementation
- This code is responsible for obtaining hardware information from the system
as well as pod resource information from the Pod Resource API in order to
determine the allocatable resource information for each NUMA zone. This
information along with Costs for NUMA zones (obtained by reading NUMA distances)
is gathered by nfd-topology-updater running on all the nodes
of the cluster and propagate NUMA zone costs to master in order to populate
that information in the CRs corresponding to the nodes.
- We use GHW facilities for obtaining system information like CPUs, topology,
NUMA distances etc.
- This also includes updates made to Makefile and Dockerfile and Manifests for
deploying nfd-topology-updater.
- This patch includes unit tests
- As part of the Topology Aware Scheduling work, this patch captures
the configured Topology manager scope in addition to the Topology manager policy.
Based on the value of both attribues a single string will be populated to the CRD.
The string value will be on of the following {SingleNUMANodeContainerLevel,
SingleNUMANodePodLevel, BestEffort, Restricted, None}
Co-Authored-by: Artyom Lukianov <alukiano@redhat.com>
Co-Authored-by: Francesco Romani <fromani@redhat.com>
Co-Authored-by: Talor Itzhak <titzhak@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Specify a new interface for managing "raw" feature data. This is the
first step to separate raw feature data from node labels. None of the
feature sources implement this interface, yet.
This patch unifies the data format of "raw" features by dividing them
into three different basic types.
- keys, a set of names without any associated values, e.g. CPUID flags
or loaded kernel modules
- values, a map of key-value pairs, for features with a single value,
e.g. kernel config flags or os version
- instances, a list of instances each of which has multiple attributes
(key-value pairs of their own), e.g. PCI or USB devices
The new feature data types are defined in a new "pkg/api/feature"
package, catering decoupling and re-usability of code e.g. within future
extentions of the NFD gRPC API.
Rename the Discover() method of LabelSource interface to GetLabels().
Implement new registration infrastructure under the "source" package.
This change loosens the coupling between label sources and the
nfd-worker, making it easier to refactor and move the code around.
Also, create a separate interface (ConfigurableSource) for configurable
feature sources in order to eliminate boilerplate code.
Add safety checks to the sources that they actually implement the
interfaces they should.
In sake of consistency and predictability (of behavior) change all
methods of the sources to use pointer receivers.
Add simple unit tests for the new functionality and include source/...
into make test target.
Drop confusing errors in the log when intel pstate or cstate driver is
not enabled in the system. However, we still log an error if sysfs is
not available at all, in which case we're not able to detect these
correctly.
Get rid of distracting error in the log in case selinux is not enabled
in the kernel. Still print an error only if sysfs/fs directory is not
available, though, which indicates that we're not able to correctly
detect the presence of selinux.
The naming was changed in when with cpuid v2
(github.com/klauspost/cpuid/v2) and we didn't catch this in NFD. No
issue reports of the inadvertent naming change so let's just adapt to
the updated naming in NFD configuration. The SSE4* labels are disabled
by default so they're not widely used, if at all.
In my homelab, I have different FTDI serial converters connected to
several utility meters. They all have identical vendor/device, but
different serials.
In order to detect a specific FTDI unit (eg. the one connected to my
electricity meter), I'd like feature labels triggered by a specific USB
serial.
Signed-off-by: Jorik Jonker <jorik@kippendief.biz>
Mount /usr/lib and /usr/src as /host-usr/lib and /host-usr/src inside the pod
to allow NFD to search for the kernel configuration file inside /usr.
This solves the problem of the kernel config file not being present in /boot
on s390x RHCOS.
Signed-off-by: Jan Schintag <jan.schintag@de.ibm.com>
This adds additional cpu features:
- pstate status from status of intel_pstate driver
- pstate scaling settings from scaling_governor
- cstate enable from max_cstates in intel_idle driver
The code should be stable enough. If there are fatal bugs causing the
discovery to panic/segfault that should be made visible instead of
semi-siently hiding it. Also, this caused one (negative) test case to
fail undetected which is now fixed.
There are cases when the only available metadata for discovering
features is the node's name. The "nodename" rule extends the custom
source and matches when the node's name matches one of the given
nodename regexp patterns.
It is also possible now to set an optional "value" on custom rules,
which overrides the default "true" label value in case the rule matches.
In order to allow more dynamic configurations without having to modify
the complete worker configuration, custom rules are additionally read
from a "custom.d" directory now. Typically that directory will be filled
by mounting one or more ConfigMaps.
Signed-off-by: Marc Sluiter <msluiter@redhat.com>
Trim illegal characters from the beginning and end of the kernel version
string. Label values must start and end with an alphanumeric and we want
to have some 'version.full' label, even if a sanitized one.
Skip to the next rule instead of returning immediately on
rule errors. For instance, if the static rules failed, user
provided rules would never be processed.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
This provides support for 32-bit ARM cpuid capabilities based on
the HWCAP flags, and enables the build of NFD on the 32-bit ARM
userland - notably, this also applies to ARM64 systems that are
running userspace in Aarch32 mode, which is where this problem
was first encountered.
Signed-off-by: Paul Mundt <paul.mundt@adaptant.io>
Instead of relying on golang "net" package, use the configured host
sysfs for all discovery. No need to use hostNetwork after that so drop
it from the worker deployment templates.
Extend the FeatureSource interface with new methods for configuration
handling. This enables easier on-the fly reconfiguration of the
feature sources. Further, it simplifies adding config support to feature
sources in the future. Stub methods are added to sources that do not
currently have any configurability.
The patch fixes some (corner) cases with the overrides (--options)
handling, too:
- Overrides were not applied if config file was missing or its parsing
failed
- Overrides for a certain source did not have effect if an empty config
for the source was specified in the config file. This was caused by
the first pass of parsing (config file) setting a nil pointer to the
source-specific config, effectively detaching it from the main config.
The second pass would then create a new instance of the source
specific config, but, this was not visible in the feature source, of
course.
This builds on the PCI support to enable the discovery of USB devices.
This is primarily intended to be used for the discovery of Edge-based
heterogeneous accelerators that are connected via USB, such as the Coral
USB Accelerator and the Intel NCS2 - our main motivation for adding this
capability to NFD, and as part of our work in the SODALITE H2020
project.
USB devices may define their base class at either the device or
interface levels. In the case where no device class is set, the
per-device interfaces are enumerated instead. USB devices may
furthermore have multiple interfaces, which may or may not use the
identical class across each interface. We therefore report device
existence for each unique class definition to enable more fine-grained
labelling and node selection.
The default labelling format includes the class, vendor and device
(product) IDs, as follows:
feature.node.kubernetes.io/usb-fe_1a6e_089a.present=true
As with PCI, a subset of device classes are whitelisted for matching.
By default, there are only a subset of device classes under which
accelerators tend to be mapped, which is used as the basis for
the whitelist. These are:
- Video
- Miscellaneous
- Application Specific
- Vendor Specific
For those interested in matching other classes, this may be extended
by using the UsbId rule provided through the custom source. A full
list of class codes is provided by the USB-IF at:
https://www.usb.org/defined-class-codes
For the moment, owing to a lack of a demonstrable use case, neither
the subclass nor the protocol information are exposed. If this
becomes necessary, support for these attributes can be trivially
added.
Signed-off-by: Paul Mundt <paul.mundt@adaptant.io>
Some Kernel versions include symbols such as "+".
Yocto L4T kernel is an example of this behaviour.
To fix this error all unknown symbols are replaced by an underscore.
Signed-off-by: Pablo Rodriguez <paroque28@gmail.com>
- Implement the 'custom' feature source utilizing the
match rules implemented in previous commit.
- Add a static custom feature list for:
1. rdma.capable - marks a node where devices that support
RDMA are present.
2. rdma.enabled - marks a node where rdma modules have
been loaded.
A user may extend these features with additional match rules via
NFD configuration file.
- Add a Rule interface to help describe the contract
between a match rule and the Custom source that uses it.
- Add PciIdRule - a rule that matches on the PCI attributes:
class, vendor, device. Each is provided as a list of elements(strings).
Match operation: OR will be performed per element and AND will be
performed per attribute.
An empty attribute will not be included in the matching process.
Example:
{
"class": ["0200"]
"vendor": ["15b3"]
"device": ["1014", "1016"]
}
- Add LoadedKmodRule - a rule that matches a list of kernel
modules with the kernel modules currently loaded in the node.
Example:
{
["rdma_cm", "ib_core"]
}
This will enable code reuse across sources while preventing
packages which are not under 'source' to import it.
subsequent commits will introduce the 'custom' source which
will use the logic.
SR-IOV is a PCI attribute and also non-NIC PCI devices can have it. Therefore,
it is useful to label all PCI devices with that capability.
After this commit the following labels for Intel NICs are overlapping:
feature.node.kubernetes.io/pci-0200_8086.sriov.capable=true
feature.node.kubernetes.io/network-sriov.capable=true
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Some workloads may benefit from Intel Turbo Boost technology being
disabled. This patch sets the
'feature.node.kubernetes.io/cpu-pstate.turbo' label to 'false' if we can
detect that it has been disabled. If detection fails no label is
published.
Extend NVDIMM (non-volatile DIMM) discovery by adding detection of DAX
mode, i.e. detection of regions in DAX/AppDirect mode.
The new label is:
feature.node.kubernetes.io/memory-nv.dax: true
Add 'cpuid/attributeBlacklist' and 'cpuid/attributeWhitelist' config
options for the cpu feature source. These can be used to filter the set
of cpuid capabilities that get published. The intention is to reduce
clutter in the NFD label space, getting rid of "obvious" or misleading
cpuid labels. Whitelisting has higher priority, i.e. only whitelist
takes effect if both attributeWhitelist and attributeBlacklist are
specified.
Remove 'cpuid', 'pstate' and 'rdt' feature sources and move their
functionality under the 'cpu' source. The goal is to have a more
systematic organization of feature sources and labels. After this change
we now basically have one source per type of hw, one for kernel and one
for userspace sw.
Related feature labels are changed, correspondingly, new labels being:
feature.node.k8s.io/cpu-cpuid.<cpuid flag>
feature.node.k8s.io/cpu-pstate.turbo
feature.node.k8s.io/cpu-rdt.<rdt feature>
Detect of the Intel SST-BF (Speed Select Technology - Base Frequency)
has been enabled.
Adds one new feature label:
feature.node.kubernetes.io/cpu-power.sst_bf.enabled=true
Based on a patch from kuralamudhan.ramakrishnan@intel.com
Add a new Makefile target for regenerating these files. Also, add a
note that the files are auto-generated, including instructions how to
re-generate them.
Renames the mock files, using the defaults provided by the mockery tool,
in order to make their generation easier.
The aim here is to add another way to specify labels using the local
source by reading files in a specific directory. That avoids us to
execute a hook when we just need to get the content of a file.
See https://github.com/kubernetes-sigs/node-feature-discovery/issues/226
Signed-off-by: Jordan Jacobelli <jjacobelli@nvidia.com>
Get rid of the dependency on intel-cmt-cat library and rdt helper
binaries written in C. Significantly simplifies the build procedure.
Implements minimal support (in assembler) for getting the raw data from
the CPUID instruction. Also, implement a stub so that the code works on
other architectures than amd64, too.
Discover other than bool or tristate kconfig options, too. For bool and
tristate the node label is still binary (i.e. set to "true" if the
kconfig option has been enabled). For other kconfig types (e.g. string
or int) the value of the label directly corresponds to the value of the
kconfig flag, e.g. "32", "elf64-x86-64" etc.
Add two new attributes 'VERSION_ID.major' and 'VERSION_ID.minor' to the
os_release feature. These represent the first two components of
the OS version (version components are assumed to be separated by a
dot). E.g. if VERSION_ID would be 1.2.rc3 major and minor versions would
be 1 and 2, respectively:
feature.node.kubernetes.io/system-os_release.VERSION_ID=1.2.rc3
feature.node.kubernetes.io/system-os_release.VERSION_ID.major=1
feature.node.kubernetes.io/system-os_release.VERSION_ID.minor=2
The version components must be purely numerical in order for them to be
advertised. This way they can be fully (and reliably) utilized in
nodeAffinity, including relative (Gt and Lt) operators.
Remove the 'selinux' feature source and move the functionality under the
'kernel' feature source. The selinux feature label is changed to
feature.node.kubernetes.io/selinux.enabled
The selinux feature source was rather narrow in scope, and, the sole
feature it advertised naturally falls under the kernel feature source.
Currently, it only detects one feature, i.e. hardware multithreading
(such as Intel hyper-threading technology). The corresponding feature
label is:
feature.node.kubernetes.io/cpu-hardware_multithreading=true
However, this (architecture/platform dependent) feature is not detected
directly, and, the heuristics can be mislead. Detection works by
checking the thread siblings of each logical (and online) cpu in the
system. If any cpu has any thread siblings the feature label is set to
true. Thus, hardware multithreading could be effectively disabled e.g.
by putting all sibling cpus offline (even if the technology would be
enabled in hardware).
Implement new 'system' feature source. It now detects OS release
information from the os-release file, assumed to be available at
/host-etc/os-release. It currently creates two labels (assuming that the
corresponding fields are found in the os-release file), with example
values:
feature.node.kubernetes.io/system-os_release.ID=opensuse
feature.node.kubernetes.io/system-os_release.VERSION_ID=42.3
Also, update the template spec to mount /etc/os-release file from the
host inside the container.
This implementation only detects kconfig options ("NO_HZ", "NO_HZ_IDLE",
"NO_HZ_FULL" and "PREEMPT"). The corresponding node labels will be
node.alpha.kubernetes-incubator.io/nfd-kernel-config.<option name>
Currently, only bool and tristate (i.e. '=y' or '=m') kernel config
options are supported. Other kconfig types (e.g. string or int) are
simply ignored. If the kconfig flag is set to '=y' or '=m', the
corresponding node label will be present and it's value will be 'true'.
Make it possible for the hooks to fully define the label name to be used
(i.e. without the '<hook name>-' prefix) by prefixing the printed
feature names with a slash ('/'). This makes it possible to e.g.
override labels create by other sources.
For example having the following output from a hook:
/override_source-override_bool
/override_source-override_value=my value
will translate into the following feature labels:
feature.node.kubernetes.io/override_source-override_bool = true
feature.node.kubernetes.io/override_source-override_value = my value
Make the feature detector hooks, run by the 'local' feature source,
support non-binary label values. Hooks can advertise non-binary value by
using <name>=<value> format.
For example, /etc/kubernetes/node-feature-discovery/source.d/myhook
having the following stdout:
LABEL_1
LABEL_2=foobar
Would translate into the following labels:
feature.node.kubernetes.io/myhook-LABEL_1 = true
feature.node.kubernetes.io/myhook-LABEL_2 = foobar
Implement a new feature source named 'local' whose only purpose is to
run feature source hooks found under
/etc/kubernetes/node-feature-discovery/source.d/ It tries to execute all
files found under the directory, in alphabetical order.
This feature source provides users a mechanism to implement custom
feature sources in a pluggable way, without modifying nfd source code or
Docker images.
The hooks are supposed to print all discovered features in stdout, one
feature per line. The output in stdout is used in the node label as is.
Full node label name will have the following format:
feature.node.kubernetes.io/<hook name>-<feature name>
Stderr from the hooks is propagated to nfd log.
Add a new 'kernel' feature source, detecting the kernel version. The
kernel version is split into multiple labels in order to make this more
usable in label selectors. Kernel version in the format X.Y.Z-patch will
be presented as
node.alpha.kubernetes-incubator.io/nfd-kernel-version.full=X.Y.Z-patch
node.alpha.kubernetes-incubator.io/nfd-kernel-version.major=X
node.alpha.kubernetes-incubator.io/nfd-kernel-version.minor=Y
node.alpha.kubernetes-incubator.io/nfd-kernel-version.revision=Z
The '.full' label will always be avaiable. The other labels if these
components can be parsed from the kernel version number.
Add new config option for specifying the device label, i.e. the
<device-label> part in
node.alpha.kubernetes-incubator.io/nfd-pci-<device label>.present
The option is a list of field names:
"class" PCI device class
"vendor" Vendor ID
"device" Device ID
"subsystem_vendor" Subsystem vendor ID
"subsystem_device" Subsystem device ID
E.g. the following command line flag can be used to use all of the
above:
--options='{"sources": {"pci": {"deviceLabelFields": ["class", "vendor", "device", "subsystem_vendor", "subsystem_device"] } } }'
User can now configure the list of device classes to detect, either via
a configuration file or by using the --options command line flag.
An example of a command line flag to detect all network controllers and
("main class 0x02) and VGA display controllers ("main" class 0x03 and
subclass 0x00) would be:
--options='{"sources": {"pci": {"deviceClassWhitelist": ["02", "0300"] } } }'
This feature source detects the presence of PCI devices. At the moment,
it only advertises GPUs and accelerator cards, i.e. device classes 0x03,
0x0b40 and 0x12.
The label format is:
node.alpha.kubernetes-incubator.io/nfd-pci-<device label>.present
where <device label> is composed of raw PCI IDs:
<class id>_<vendor id>
Introduce a new scheme where features may have logical sub-components.
Rename sriov labels from the network source according to the new
pattern:
sriov -> sriov.capable
sriov-configured -> sriov.configured
Also, document this new labeling scheme in the README.
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
* Arrange feature sources alphabetically
Just a cosmetic change, but, a small readability improvement.
* Implement detection of IOMMU (#136)
Add a new feature source, i.e. 'iommu', which detects if an IOMMU is
present and enabled in the kernel. The new node label is
node.alpha.kubernetes-incubator.io/nfd-iommu-present
This commit:
- enables node-feature-dicovery to advertise selinux status
on the node by adding a label.
- update the template to mount /sys into the container, this is
needed to know about the selinux status on the host
- adds selinux source for unit tests
This change adds feature source "storage".
Add label if any non-rotational block device is present in the node.
The label will be: nfd-storage-nonrotationaldisk=true
* Make rdt-discovery buildable outside hardcoded path
Do not assume that nfd sources always reside under hardcoded directory
"/go/src/github.com/kubernetes-incubator/node-feature-discovery/". This
makes it possible e.g. to build nfd locally outside the Docker
container.
* Do not hardcode the path for RDT helper binaries
Utilize the standard PATH env variable, instead.
* Format main,fake,network go source to improve GoReportCard
Order of import changed in main.go and network.go,
missing ending newline added in fake.go
* Fixed simple golint suggestions in multiple files
Mostly about missing or incomplete comments
* Adding SR-IOV capability discovery to node-feature-discovery
* SR-IOV capability discovery in NFD : code update after PR review
- using hostnetwork instead of volume mount in file node-feature-discovery-job.json.template
- iterating through network interfaces that are "up" in sources.go
- inserting logs in sources.go
- change in feature source name from "netid" to "network" in sources.go, README.md and main.go
* Added code for labels sriov=true (sriov_totalvfs > 0) and sriov-configured=true (sriov_numvfs > 0)
* Code Refactored: Added Network package and network.go