1
0
Fork 0
mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00
Commit graph

75 commits

Author SHA1 Message Date
Markus Lehtonen
6c6249a599 nfd-worker: don't log labels returned by sources by default
Reduce default log verbosity. Only print out labels if log verbosity is
1 or higher ('core.klog.v: 1' config file option or '-v 1' on command
line). Also, dump the labels in a reproducible (sorted) format.
2021-03-15 21:42:33 +02:00
Markus Lehtonen
8d67fc1122 pkg/utils: add dump functions
A simple functions for pretty-printing and logging json-marshallable objects.
2021-03-11 07:12:22 +02:00
Markus Lehtonen
8af3a40ca7 logging: set grpc to use klog for logging 2021-03-05 14:44:44 +02:00
Markus Lehtonen
38d493aa67 pkg/utils: fix possible segfault in RegexpVal.Set 2021-03-02 22:46:34 +02:00
Markus Lehtonen
dd7691c486 nfd-worker: improve log messages of config handling 2021-03-02 18:49:58 +02:00
Carlos Eduardo Arango Gutierrez
389a8f87cf
logging: start log messages with lower case
Standarize logs to be lower case.

Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>
2021-03-01 10:07:21 -05:00
Markus Lehtonen
5e6f0779e9 nfd-worker: stop masking crashes in feature discovery
The code should be stable enough. If there are fatal bugs causing the
discovery to panic/segfault that should be made visible instead of
semi-siently hiding it. Also, this caused one (negative) test case to
fail undetected which is now fixed.
2021-03-01 09:14:19 +02:00
Markus Lehtonen
3f18e880b4 nfd-worker: dynamic configuration of klog
Make it possible to dynamically (at run-time) alter most of the logging
configuration from the config file.
2021-02-25 16:10:43 +02:00
Markus Lehtonen
7da7fde8f6 nfd-worker: switch to klog
Greatly expands logging capabilities and flexibility with verbosity
options, among other things.
2021-02-25 16:10:43 +02:00
Markus Lehtonen
3ffb7b8fc5 nfd-master: switch to klog 2021-02-25 07:50:37 +02:00
Markus Lehtonen
3fd61eacdb nfd-worker: switch to flag in command line parsing 2021-02-24 12:06:16 +02:00
Markus Lehtonen
47033db9c1 nfd-master: use flag for command line parsing 2021-02-24 12:06:16 +02:00
Markus Lehtonen
6b744d4179 nfd-worker: extend unit test coverage of config handling
Add test cases for verifying the core config.

Also, add asynchronous tests for basic verification of dynamic config
file updates.
2021-02-17 21:52:25 +02:00
Markus Lehtonen
2b24ed2c18 nfd-worker: implement Stop() method 2021-02-17 21:50:58 +02:00
Markus Lehtonen
278ccdb997 source/fake: make the fake source configurable
Enables more flexible testing.
2021-02-17 21:50:58 +02:00
Markus Lehtonen
c2c9dff724 nfd-worker: bail out on invalid config file
Changes the behaviour so that if the specified configuration file exists
it must be valid. Error out at startup if the config is invalid.
Similarly, exit with an error at runtime if the config file becomes
invalid. Bailing out, instead of just printing an error, was a
deliberate choice in order to make configuration mistakes evident.

Having no configuration file is tolerated, however. If the specified
configuration file does not exists nfd-worker resorts to default
settings.
2021-02-17 21:42:50 +02:00
Markus Lehtonen
7e88f00e05 nfd-worker: add core.sources config option
Add a config file option for controlling the enabled feature sources,
aimed at replacing the --sources command line flag which is now marked
as deprecated. The command line flag takes precedence over the config
file option.
2021-02-17 21:36:20 +02:00
Markus Lehtonen
ed177350fc nfd-worker: add core.labelWhiteList config option
Add a config file option for label whitelisting. Deprecate the
--label-whitelist command line flag. Note that the command line flag has
higher priority than the config file option.
2021-02-17 21:35:44 +02:00
Markus Lehtonen
d1d8de944e nfd-worker: add core.sleepInterval config option
Add a new config file option for (dynamically) controlling the sleep
interval. At the same time, deprecate the --sleep-interval command line
flag. The command line flag takes precedence over the config file option.
2021-02-17 21:35:13 +02:00
Markus Lehtonen
e6bdc17d8c nfd-worker: add core config
Allows dynamic (re-)configuration of most nfd-worker options. The goal
is to have most configuration parameters specified in the configuration
file and deprecate most of the command line flags. The priority is
intended to be such that command line flags override whatever is
specified in the configuration file. Thus, specifying something on the
command line effectively disables dynamic configurability of that
parameter.

This patch adds core.noPublish config file option to demonstrate how the
new mechanism is supposed to work. The --no-publish command line flag
takes precedence over this config file option.
2021-02-17 21:35:12 +02:00
Kubernetes Prow Robot
85bde7f749
Merge pull request #431 from marquiz/devel/master-instance-flag
nfd-master: implement --instance flag
2021-02-11 02:40:15 -08:00
Markus Lehtonen
29910464a0 nfd-worker: always re-label after a re-config event
Always do re-discovery and re-labeling after a configuration file
change. his way the new config comes into effect immediately, even if
the sleep interval is long (or infinite) # Please enter the commit
message for your changes. Lines starting
2021-02-10 22:09:27 +02:00
Markus Lehtonen
b6ff514853 nfd-worker: use fsnotify for watching for config file changes
Add support for detecting configuration file changes via file system
notifications (fsnotify). Watches are added for the whole directory
chain (up to root directory) so that all changes (even directory
renames) affecting the given configuration file path are captured.

Previously dynamic (re-)configuration of nfd-worker was implemented by
(re-)reading the configuration file on every labeling pass. This was
simple and effective, even if a bit wasteful. However, it didn't provide
asynchronous configuration updates that will be required for e.g.
controlling the "sleep-interval" parameter dynamically which will be
implemented by later patches.
2021-02-10 22:09:27 +02:00
Markus Lehtonen
6958a6677f nfd-worker: use timer channel for sleep interval 2021-02-10 22:09:27 +02:00
Markus Lehtonen
e52ec3480f nfd-master: implement --instance flag
This can be used to help running multiple parallel NFD deployments in
the same cluster. The flag changes the node annotation namespace to
<instance>.nfd.node.kubernetes.io allowing different nfd-master intances
to store metadata in separate annotations.
2021-02-10 13:48:31 +02:00
Markus Lehtonen
705687192d nfd-master: make updateNodeFeatures a method of nfdMaster 2021-02-10 13:46:59 +02:00
Markus Lehtonen
cdca6d667a nfd-master: make nodeName non-global 2021-02-10 13:46:59 +02:00
Markus Lehtonen
b146508e64 nfd-master: drop separate labelerServer type
Simplify code by changing nfdMaster to implement LabelerServer interface
by itself.
2021-02-10 13:46:59 +02:00
Markus Lehtonen
76b95b6c55 Replace improper usage of filepath.Join with path.Join
In JSON and kubernetes API object names we want to use slashes instead
of the OS dependent file path separator.
2021-02-10 12:54:31 +02:00
Markus Lehtonen
19b8f2cd3d nfd-master: more detailed unit testing of extended resources 2020-11-24 12:45:06 +02:00
Markus Lehtonen
d17743a0b9 nfd-master: handle label annotations in the same func
Handle both creation and parsing of the "feature-labels" and
"extended-resources" annotations in the function. I think this is more
logical to keep them together.
2020-11-24 12:45:06 +02:00
Markus Lehtonen
95ff300d74 nfd-master: patch node object instead of rewriting it
When updating node labels and annotations use JSON patches instead of
doing a read-modify-write on the whole node object. Patching is already
being used in managing extended resources so some of the existing code
was re-usable.

This patch should mitigate the problem of node update failures caused by
race conditions (a change in the node object between our read and write)
resulting e.g. in errors/restarts in nfd worker pods.
2020-11-24 12:45:06 +02:00
Markus Lehtonen
1ea301d272 nfd-master: change statusOp to a more generalized JSON patch
Generalize and rename 'statusOp' type to a more flexible 'JsonPatch'.
Move it to the apihelper package.
2020-11-24 12:45:06 +02:00
Markus Lehtonen
bb1e4c60fb nfd-master: use namespaced label and annotation names internally
For historical reasons the labels in the default nfd namespace have been
internally represented without the namespace part. I.e. instead of
"feature.node.kubernetes.io/foo" we just use "foo". NFD worker uses this
representation, too, both internally and over the gRPC requests. The
same scheme has been used for annotations.

This patch changes NFD master to use fully namespaced label and
annotation names internally. This hopefully makes the code a bit more
understandable. It also addresses some corner cases making the handling
of label names consistent, making it possible to use both "truncated"
and fully namespaced names over the gRPC interface (and in the
annotations).
2020-11-24 12:45:06 +02:00
Markus Lehtonen
29cbb2429c nfd-worker: add special handling for --sources=all
A new special value 'all' is a shortcut for enabling all feature
sources. It should be the only name specified -- if any other names are
specified 'all' does not take effect, but, we only enable the listed
feature sources. E.g.
  --sources=all enables all sources, but
  --sources=all,cpu only enables the cpu source

Also, print a warning if unknown sources are specified.
2020-11-20 16:23:53 +02:00
Artyom Lukianov
f363ba0e92 Update e2e test to work with updated dependencies
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2020-11-18 13:09:13 +02:00
Markus Lehtonen
458dd8dc58 nfd-master: add --kubeconfig flag
Useful with --prune and for development purposes.
2020-09-07 07:51:42 +03:00
Markus Lehtonen
4669770020 nfd-master: implement --prune flag
A new sub-command like flag for cleaning up a cluster. When --prune is
specified nfd-master removes all NFD related labels, annotations and
extended resources from all nodes of the cluster and exits.

This should help undeployment of NFD and be useful for development.
2020-09-07 07:51:42 +03:00
Markus Lehtonen
6869a99ceb nfd-master: fix one docstring 2020-09-07 07:51:42 +03:00
Markus Lehtonen
9e813a559c nfd-worker: reload config on each re-discovery pass
Dumb re-read/re-parse of the configuration file on every round of
discoery. Probably not the most elegant solution to watch for config
file changes, but, it works and doesn't cost much overhead.
2020-05-21 00:59:39 +03:00
Markus Lehtonen
a2b9df5cd3 nfd-worker: rework configuration handling
Extend the FeatureSource interface with new methods for configuration
handling. This enables easier on-the fly reconfiguration of the
feature sources. Further, it simplifies adding config support to feature
sources in the future. Stub methods are added to sources that do not
currently have any configurability.

The patch fixes some (corner) cases with the overrides (--options)
handling, too:
- Overrides were not applied if config file was missing or its parsing
  failed
- Overrides for a certain source did not have effect if an empty config
  for the source was specified in the config file. This was caused by
  the first pass of parsing (config file) setting a nil pointer to the
  source-specific config, effectively detaching it from the main config.
  The second pass would then create a new instance of the source
  specific config, but, this was not visible in the feature source, of
  course.
2020-05-21 00:59:37 +03:00
Markus Lehtonen
c95ad3198c nfd-worker: refactor handling of enabled sources and labels
Make the list of enabled sources and the label whitelist regexp members
of the nfdWorker instance. Get rid of the not-that-well-defined
configureParameters() function.
2020-05-21 00:48:21 +03:00
Markus Lehtonen
818fc4cc70 nfd-worker: fix --label-whitelist
Unify handling of --label-whitelist in nfd-worker and nfd-master. That is,
in nfd-worker, apply the regexp filter on non-namespaced part of the
label name.

Brief history:
1. Originally the whitelist regexp was applied on the full namespaced
   label name (that would be e.g.
   'feature.node.kubernetes.io/cpu-cpuid.AVX' in the current nfd version)

2. Commit 81752b2d changed the behavior so that the regexp was applied
   on the non-namespaced part (that would be `cpu-cpuid.AVX`)

3. Commit 40918827 added support for custom label namespaces. With this
   change, the label whitelist handling diverged between nfd-worker and
   nfd-master. In nfd-master the whitelist regexp is always applied on
   the non-namespaced label name. However, in nfd-worker the whitelist
   handling is two-fold (and inconsistent): for labels in the standard
   nfd namespace regexp is applied on the non-namespaced part (e.g.
   `cpu-cpuid.AVX`, but, for labels in custom namespaces the regexp is
   applied on the full name (e.g. `example.com/my-feature`).

This patch changes nfd-worker to behave similarly to nfd-master. The
namespace part is now always omitted, which should be easier for the
users to comprehend.

Also, fixes a bug in the label name prefixing so that the name of the
feature source is not prefixed into labels with custom label namespace
(effectively mangling the intended namespace). For example, previously a
'example.com/feature' label from the 'custom' feature source would be
prefixed with the source name, mangling it to
'custom-example.com/feature'.
2020-05-20 23:07:13 +03:00
Markus Lehtonen
a65d05bd9c source/panic_fake: rename module to make lint happy 2020-05-20 21:48:06 +03:00
Markus Lehtonen
853609f721 nfd-master: lint fixes 2020-05-20 21:48:06 +03:00
Markus Lehtonen
523aa894a3 pkg/cpuid: lint fixes 2020-05-20 21:48:06 +03:00
Markus Lehtonen
c7b1d67b6b nfd-worker: drop deprecated grpc.WithTimeout 2020-05-20 21:48:06 +03:00
Markus Lehtonen
91f3ddcc45 nfd-worker: lint fixes 2020-05-20 21:48:06 +03:00
Paul Mundt
c0ea69411b usb: Add support for USB device discovery
This builds on the PCI support to enable the discovery of USB devices.

This is primarily intended to be used for the discovery of Edge-based
heterogeneous accelerators that are connected via USB, such as the Coral
USB Accelerator and the Intel NCS2 - our main motivation for adding this
capability to NFD, and as part of our work in the SODALITE H2020
project.

USB devices may define their base class at either the device or
interface levels. In the case where no device class is set, the
per-device interfaces are enumerated instead. USB devices may
furthermore have multiple interfaces, which may or may not use the
identical class across each interface. We therefore report device
existence for each unique class definition to enable more fine-grained
labelling and node selection.

The default labelling format includes the class, vendor and device
(product) IDs, as follows:

	feature.node.kubernetes.io/usb-fe_1a6e_089a.present=true

As with PCI, a subset of device classes are whitelisted for matching.
By default, there are only a subset of device classes under which
accelerators tend to be mapped, which is used as the basis for
the whitelist. These are:

	- Video
	- Miscellaneous
	- Application Specific
	- Vendor Specific

For those interested in matching other classes, this may be extended
by using the UsbId rule provided through the custom source. A full
list of class codes is provided by the USB-IF at:

	https://www.usb.org/defined-class-codes

For the moment, owing to a lack of a demonstrable use case, neither
the subclass nor the protocol information are exposed. If this
becomes necessary, support for these attributes can be trivially
added.

Signed-off-by: Paul Mundt <paul.mundt@adaptant.io>
2020-05-20 16:18:39 +02:00
Markus Lehtonen
409dc11389 Switch to sigs.k8s.io/yaml
Replace github.com/ghodss/yaml.
2020-04-23 16:54:14 +03:00