1
0
Fork 0
mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00
Commit graph

106 commits

Author SHA1 Message Date
Markus Lehtonen
eb8e29c80a nfd-worker: drop deprecated command line flags
Drop the following flags that were deprecated already in v0.8.0:

-sleep-interval  (replaced by core.sleepInterval config file option)
-label-whitelist (replaced by core.labelWhiteList config file option)
-sources         (replaced by -label-sources flag)
2022-11-23 22:33:51 +02:00
Talor Itzhak
5b0788ced4 topology-updater: introduce exclude-list
The exclude-list allows to filter specific resource accounting
from NRT's objects per node basis.

The CRs created by the topology-updater are used by the scheduler-plugin
as a source of truth for making scheduling decisions.
As such, this feature allows to hide specific information
from the scheduler, which in turn
will affect the scheduling decision.
A common use case is when user would like to perform scheduling
decisions which are based on a specific resource.
In that case, we can exclude all the other resources
which we don't want the scheduler to exemine.

The exclude-list is provided to the topology-updater via a ConfigMap.
Resource type's names specified in the list should match the names
as shown here: https://pkg.go.dev/k8s.io/api/core/v1#ResourceName

This is a resurrection of an old work started here:
https://github.com/kubernetes-sigs/node-feature-discovery/pull/545

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2022-11-21 14:08:25 +02:00
Garrybest
3ec1b94020 get kubelet config from configz
Signed-off-by: Garrybest <garrybest@foxmail.com>
2022-11-08 23:52:35 +08:00
Markus Lehtonen
a00cdc2b61 pkg/utils: move hostpath helpers from source to utils
Refactor the code, moving the hostpath helper functionality to new
"pkg/utils/hostpath" package. This breaks odd-ish dependency
"pkg/utils" -> "source".
2022-10-06 14:28:24 +03:00
Markus Lehtonen
443ed48ea8 nfd-worker: stop using deprecated strings.Title
Make golangci-lint happy.
2022-04-13 10:34:29 +03:00
Markus Lehtonen
58e1461d90 nfd-worker: add -feature-sources command line flag
Allows controlling (enable/disable) the "raw" feature detection.
Especially useful for development and testing.
2021-12-03 09:42:35 +02:00
Markus Lehtonen
8cd58af613 nfd-worker: disable sources more easily
Make it easier to disable single sources by prefixing the source name
with a dash ('-') in the core.sources config option (or -sources cmdline
flag).
2021-12-02 10:36:51 +02:00
Markus Lehtonen
77a6b27583 nfd-worker: introduce -label-sources cmdline flag
Useful for development, testing and debugging.
2021-12-01 17:11:49 +02:00
Markus Lehtonen
ad9c7dfa1e nfd-worker: rename config option 'sources' to 'labelSources'
The goal is to make the name more descriptive. Also keeping in mind a
possible future addition a 'featureSources' option (or similar) for
controlling the feature discovery.
2021-12-01 17:11:49 +02:00
Markus Lehtonen
a57a25f63c Use single-dash format of cmdline flags
Use the single-dash (i.e. '-option' instead of '--option') format
consistently accross log messages and documentation. This is the format
that was mostly used, already, and shown by command line help of the
binaries, for example.
2021-11-25 18:03:54 +02:00
Markus Lehtonen
e8872462dc nfd-master: add -featurerules-controller flag
Add a new command line flag for disabling/enabling the controller for
NodeFeatureRule objects. In practice, disabling the controller disables
all labels generated from rules in NodeFeatureRule objects.
2021-11-22 16:57:42 +02:00
NHM Tanveer Hossain Khan
856dfdd8b4 Remove fatal logging to error based on the feedback 2021-11-19 16:57:21 -05:00
Swati Sehgal
b444ef95a8 NFD-Topology-Updater: Bump NRT API to version v0.0.12
The NodeResourceTopology API has been made cluster
scoped as in the current context a CR corresponds to
a Node and since Node is a cluster scoped resource it
makes sense to make NRT cluster scoped as well.

Ref: https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/18
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-11-16 13:28:23 +00:00
Talor Itzhak
674720e922 topology-updater:fix klog initialization
We should use the same flag set for both program and klog arguments.
Otherwise we won't be able to provide klog flags properly

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2021-10-11 21:36:54 +03:00
Swati Sehgal
aa7ae9265c topologyupdater: watch/consider only guaranteed pods for accounting
- Files obtained after running make mock
- Run `go get github.com/vektra/mockery` and make sure that
  mockery is in your $PATH
- run `make mock`

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-09-21 10:48:10 +01:00
Francesco Romani
b4c92e4eed topologyupdater: Bootstrap nfd-topology-updater in NFD
- This patch allows to expose Resource Hardware Topology information
  through CRDs in Node Feature Discovery.
- In order to do this we introduce another software component called
  nfd-topology-updater in addition to the already existing software
  components nfd-master and nfd-worker.
- nfd-master was enhanced to communicate with nfd-topology-updater
  over gRPC followed by creation of CRs corresponding to the nodes
  in the cluster exposing resource hardware topology information
  of that node.
- Pin kubernetes dependency to one that include pod resource implementation
- This code is responsible for obtaining hardware information from the system
  as well as pod resource information from the Pod Resource API in order to
  determine the allocatable resource information for each NUMA zone. This
  information along with Costs for NUMA zones (obtained by reading NUMA distances)
  is gathered by nfd-topology-updater running on all the nodes
  of the cluster and propagate NUMA zone costs to master in order to populate
  that information in the CRs corresponding to the nodes.
- We use GHW facilities for obtaining system information like CPUs, topology,
  NUMA distances etc.
- This also includes updates made to Makefile and Dockerfile and Manifests for
  deploying nfd-topology-updater.
- This patch includes unit tests
- As part of the Topology Aware Scheduling work, this patch captures
  the configured Topology manager scope in addition to the Topology manager policy.
  Based on the value of both attribues a single string will be populated to the CRD.
  The string value will be on of the following {SingleNUMANodeContainerLevel,
  SingleNUMANodePodLevel, BestEffort, Restricted, None}

Co-Authored-by: Artyom Lukianov <alukiano@redhat.com>
Co-Authored-by: Francesco Romani <fromani@redhat.com>
Co-Authored-by: Talor Itzhak <titzhak@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-09-21 10:47:39 +01:00
Markus Lehtonen
112744bc50 nfd-worker: split out gRPC connection handling
Refactor the worker code and split out gRPC client connection handling
into a separate base type. The intent is to promote re-usability of code
for other NFD clients, too.
2021-08-20 15:29:27 +03:00
robertdavidsmith
77bd4e4cf6
Accept client certs based on SAN, not just CN (#514)
* first attempt at SAN-based VerifyNodeName

* Update docs on verify-node-name
2021-04-20 01:44:32 -07:00
Markus Lehtonen
8af3a40ca7 logging: set grpc to use klog for logging 2021-03-05 14:44:44 +02:00
Carlos Eduardo Arango Gutierrez
389a8f87cf
logging: start log messages with lower case
Standarize logs to be lower case.

Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>
2021-03-01 10:07:21 -05:00
Markus Lehtonen
ed722d5527 nfd-worker: change Fatal* log messages to Exit*
We don't want to always print backtraces.
2021-02-25 18:23:47 +02:00
Markus Lehtonen
73bea99c34 nfd-master: drop last remnants of log package 2021-02-25 18:23:12 +02:00
Markus Lehtonen
3f18e880b4 nfd-worker: dynamic configuration of klog
Make it possible to dynamically (at run-time) alter most of the logging
configuration from the config file.
2021-02-25 16:10:43 +02:00
Markus Lehtonen
7da7fde8f6 nfd-worker: switch to klog
Greatly expands logging capabilities and flexibility with verbosity
options, among other things.
2021-02-25 16:10:43 +02:00
Markus Lehtonen
3ffb7b8fc5 nfd-master: switch to klog 2021-02-25 07:50:37 +02:00
Markus Lehtonen
3fd61eacdb nfd-worker: switch to flag in command line parsing 2021-02-24 12:06:16 +02:00
Markus Lehtonen
47033db9c1 nfd-master: use flag for command line parsing 2021-02-24 12:06:16 +02:00
Markus Lehtonen
7e88f00e05 nfd-worker: add core.sources config option
Add a config file option for controlling the enabled feature sources,
aimed at replacing the --sources command line flag which is now marked
as deprecated. The command line flag takes precedence over the config
file option.
2021-02-17 21:36:20 +02:00
Markus Lehtonen
ed177350fc nfd-worker: add core.labelWhiteList config option
Add a config file option for label whitelisting. Deprecate the
--label-whitelist command line flag. Note that the command line flag has
higher priority than the config file option.
2021-02-17 21:35:44 +02:00
Markus Lehtonen
d1d8de944e nfd-worker: add core.sleepInterval config option
Add a new config file option for (dynamically) controlling the sleep
interval. At the same time, deprecate the --sleep-interval command line
flag. The command line flag takes precedence over the config file option.
2021-02-17 21:35:13 +02:00
Markus Lehtonen
e6bdc17d8c nfd-worker: add core config
Allows dynamic (re-)configuration of most nfd-worker options. The goal
is to have most configuration parameters specified in the configuration
file and deprecate most of the command line flags. The priority is
intended to be such that command line flags override whatever is
specified in the configuration file. Thus, specifying something on the
command line effectively disables dynamic configurability of that
parameter.

This patch adds core.noPublish config file option to demonstrate how the
new mechanism is supposed to work. The --no-publish command line flag
takes precedence over this config file option.
2021-02-17 21:35:12 +02:00
Markus Lehtonen
e52ec3480f nfd-master: implement --instance flag
This can be used to help running multiple parallel NFD deployments in
the same cluster. The flag changes the node annotation namespace to
<instance>.nfd.node.kubernetes.io allowing different nfd-master intances
to store metadata in separate annotations.
2021-02-10 13:48:31 +02:00
Markus Lehtonen
bb1e4c60fb nfd-master: use namespaced label and annotation names internally
For historical reasons the labels in the default nfd namespace have been
internally represented without the namespace part. I.e. instead of
"feature.node.kubernetes.io/foo" we just use "foo". NFD worker uses this
representation, too, both internally and over the gRPC requests. The
same scheme has been used for annotations.

This patch changes NFD master to use fully namespaced label and
annotation names internally. This hopefully makes the code a bit more
understandable. It also addresses some corner cases making the handling
of label names consistent, making it possible to use both "truncated"
and fully namespaced names over the gRPC interface (and in the
annotations).
2020-11-24 12:45:06 +02:00
Markus Lehtonen
29cbb2429c nfd-worker: add special handling for --sources=all
A new special value 'all' is a shortcut for enabling all feature
sources. It should be the only name specified -- if any other names are
specified 'all' does not take effect, but, we only enable the listed
feature sources. E.g.
  --sources=all enables all sources, but
  --sources=all,cpu only enables the cpu source

Also, print a warning if unknown sources are specified.
2020-11-20 16:23:53 +02:00
Markus Lehtonen
458dd8dc58 nfd-master: add --kubeconfig flag
Useful with --prune and for development purposes.
2020-09-07 07:51:42 +03:00
Markus Lehtonen
4669770020 nfd-master: implement --prune flag
A new sub-command like flag for cleaning up a cluster. When --prune is
specified nfd-master removes all NFD related labels, annotations and
extended resources from all nodes of the cluster and exits.

This should help undeployment of NFD and be useful for development.
2020-09-07 07:51:42 +03:00
Markus Lehtonen
c24885840c Better document the --label-whitelist flag 2020-05-20 23:19:09 +03:00
Markus Lehtonen
705d17b9f1 cmd: replace deprecated docopt.Parse with ParseArgs 2020-05-20 21:48:06 +03:00
Paul Mundt
c0ea69411b usb: Add support for USB device discovery
This builds on the PCI support to enable the discovery of USB devices.

This is primarily intended to be used for the discovery of Edge-based
heterogeneous accelerators that are connected via USB, such as the Coral
USB Accelerator and the Intel NCS2 - our main motivation for adding this
capability to NFD, and as part of our work in the SODALITE H2020
project.

USB devices may define their base class at either the device or
interface levels. In the case where no device class is set, the
per-device interfaces are enumerated instead. USB devices may
furthermore have multiple interfaces, which may or may not use the
identical class across each interface. We therefore report device
existence for each unique class definition to enable more fine-grained
labelling and node selection.

The default labelling format includes the class, vendor and device
(product) IDs, as follows:

	feature.node.kubernetes.io/usb-fe_1a6e_089a.present=true

As with PCI, a subset of device classes are whitelisted for matching.
By default, there are only a subset of device classes under which
accelerators tend to be mapped, which is used as the basis for
the whitelist. These are:

	- Video
	- Miscellaneous
	- Application Specific
	- Vendor Specific

For those interested in matching other classes, this may be extended
by using the UsbId rule provided through the custom source. A full
list of class codes is provided by the USB-IF at:

	https://www.usb.org/defined-class-codes

For the moment, owing to a lack of a demonstrable use case, neither
the subclass nor the protocol information are exposed. If this
becomes necessary, support for these attributes can be trivially
added.

Signed-off-by: Paul Mundt <paul.mundt@adaptant.io>
2020-05-20 16:18:39 +02:00
Kubernetes Prow Robot
6d1aa73ca1
Merge pull request #298 from marquiz/devel/version
version: allow undefined version
2020-03-24 09:46:48 -07:00
Kubernetes Prow Robot
7c4ff52a3c
Merge pull request #290 from adrianchiris/custom_features
Support custom features
2020-03-24 08:26:48 -07:00
Markus Lehtonen
8c964b9daf version: allow undefined version
Just print a warning instead of exiting with an error if no version has
been specified at build-time. This was pointless and just annoying at
development time when doing builds with go directly.
2020-03-20 07:21:43 +02:00
Ukri Niemimuukko
903a939836 nfd-master: add extended resource support
This adds support for making selected labels extended resources.

Labels which have integer values, can be promoted to Kubernetes extended
resources by listing them to the added command line flag
`--resource-labels`. These labels won't then show in the node label
section, they will appear only as extended resources.

Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2020-03-19 13:19:22 +02:00
Adrian Chiris
192b3d7bdd Add 'custom' feature Source to nfd-worker 2020-03-19 09:32:07 +02:00
Kubernetes Prow Robot
8996be7e09
Merge pull request #231 from Ethyling/change-label-prefix
Allow to change labels namespace
2019-05-14 07:43:13 -07:00
Jordan Jacobelli
40918827f6
Allow to change labels namespace
The aim here is to allow to override the default namespace
of NFD. The allowed namespaces are whitelisted.
See https://github.com/kubernetes-sigs/node-feature-discovery/issues/227

Signed-off-by: Jordan Jacobelli <jjacobelli@nvidia.com>
2019-05-09 13:17:52 -07:00
Markus Lehtonen
655f5c5555 sources: move all cpu related features under the cpu source
Remove 'cpuid', 'pstate' and 'rdt' feature sources and move their
functionality under the 'cpu' source. The goal is to have a more
systematic organization of feature sources and labels. After this change
we now basically have one source per type of hw, one for kernel and one
for userspace sw.

Related feature labels are changed, correspondingly, new labels being:
    feature.node.k8s.io/cpu-cpuid.<cpuid flag>
    feature.node.k8s.io/cpu-pstate.turbo
    feature.node.k8s.io/cpu-rdt.<rdt feature>
2019-05-09 20:18:36 +03:00
Markus Lehtonen
2de0a019a3 Move most of functionality in cmd/ to pkg/
Move most of the code under cmd/nfd-master and cmd/nfd-worker into new
packages pkg/nfd-master and pk/nfd-worker, respectively. Makes extending
unit tests to "main" functions easier.
2019-05-06 16:26:41 +03:00
Markus Lehtonen
c54551f599 Only read NodeName from env once, at startup
Simplifies the code a bit. Also, log NodeName at startup.
2019-04-04 22:40:24 +03:00
Markus Lehtonen
6f73106d01 FIX split 2019-04-04 22:40:24 +03:00
Markus Lehtonen
4c1e892d88 nfd-master: implement --verify-node-name
Make NodeName based authorization of the workers optional (off by
default). This makes it possible for all nfd-worker pods in the cluster
to use one shared secret, making NFD deployment much easier. However,
this also opens a way for nfd-workers to label other nodes (than what it
is running on), too.
2019-04-04 22:40:24 +03:00
Markus Lehtonen
40061e6a78 nfd-worker: add --server-name-override
Command line option for overriding the Common Name (CN) expected from
the nfd-master TLS certificate. This can be especially handy in
testing/development.
2019-04-04 22:40:24 +03:00
Markus Lehtonen
5253d25d99 Add worker (client) authentication
Implement TLS client certificate authentication. It is enabled by
specifying --ca-file, --key-file and --cert-file, on both the nfd-master
and nfd-worker side. When enabled, nfd-master verifies that the client
(worker) presents a valid certificate signed by the root certificate
(--ca-file). In addition, nfd-master does authorization based on the Common Name
(CN) of the client certificate: CN must match the node name specified in
the labeling request. This ensures (assuming that the worker
certificates are correctly deployed) that nfd-worker is only able to label
the node it is running on, i.e. prevents it from labeling other nodes.
2019-04-04 22:40:24 +03:00
Markus Lehtonen
bca194f6e6 Implement TLS server authentication
Add support for TLS authentication. When enabled, nfd-worker verifies
that nfd-master has a valid certificate, i.e. signed by the given root
certificate and its Common Name (CN) matches the DNS name of the
nfd-master service being used. TLS authentication is enabled by
specifying --key-file and --cert-file on nfd-master, and, --ca-file on
nfd-worker.
2019-04-04 22:40:24 +03:00
Markus Lehtonen
f8bc07952f Fix unit tests after master-worker split
Refactor old tests and add tests for new functions. Add 'test' target in
Makefile.
2019-04-04 22:40:24 +03:00
Markus Lehtonen
39be798472 Split NFD into client and server
Refactor NFD into a simple server-client system. Labeling is now done by
a separate 'nfd-master' server. It is a simple service with small
codebase, designed for easy isolation. The feature discovery part is
implemented in a 'nfd-worker' client which sends labeling requests to
nfd-server, thus, requiring no access/permissions to the Kubernetes API
itself.

Client-server communication is implemented by using gRPC. The protocol
currently consists of only one request, i.e. the labeling request.

The spec templates are converted to the new scheme. The nfd-master
server can be deployed using the nfd-master.yaml.template which now also
contains the necessary RBAC configuration. NFD workers can be deployed
by using the nfd-worker-daemonset.yaml.template or
nfd-worker-job.yaml.template (most easily used with the label-nodes.sh
script).

Only nfd-worker currently support config file or options. The (default)
NFD config file is renamed to nfd-worker.conf.
2019-04-04 22:40:24 +03:00