1
0
Fork 0
mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00
Commit graph

19 commits

Author SHA1 Message Date
pprokop
6d98b6150b Fix Topology Manager policy and scope not being updated properly
NFD is only detecting policy and scope of Topology Manager when NRT object doesn't exist.
This means that topologyManagerScope and topologyManagerPolicy attributes won't be updated
even if kubelet config was changed to use other TopologyManager policy and scope.

Signed-off-by: pprokop <pprokop@nvidia.com>
2023-07-20 16:31:12 +02:00
Markus Lehtonen
6e3b181ab4 topology-updater: migrate to structured logging 2023-05-31 14:43:08 +03:00
Markus Lehtonen
1200fd05c5 topology-updater: use node IP in the default configz URI
Use a separate NODE_ADDRESS environment variable in the default value of
-kubelet-config-uri (instead of NODE_NAME that was previously used).
Also change the kustomize and Helm deployments to set this variable to
node IP address. This should make the default deployment more robust,
making it work in scenarios where node name does not resolve to the node
ip, e.g. nodename != hostname.
2023-05-05 13:29:51 +03:00
Talor Itzhak
8924213d14 topology-updater: make it possible to disable sleep-interval
Especially convenient for testing porpuses and
completely harmless

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2023-03-12 12:43:17 +02:00
Talor Itzhak
7b248ecae2 topology-updater: update CRs when notified
When a message received via the channel,
the main loop updates the `NodeResourceTopology` objects.

The notifier will send a message via the channel if:
1. It reached the sleep timeout.
2. It detected a change in Kubelet state files

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2023-03-12 12:37:24 +02:00
Talor Itzhak
175e0c81aa topology-updater: add kubelet-state-dir flag
On different Kubernetes flavors like OpenShift for exmaple,
the Kubelet state directory path is different. make it configurable
for maximum flexability.

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2023-03-12 12:37:24 +02:00
Jose Luis Ojosnegros Manchón
b340d112a8 topology-updater:compute pod set fingerprint
Add an option to compute the fingerprint of the current pod set on each
node.

Report this new fingerprint using an attribute in NRT object.
2023-02-22 10:22:50 +01:00
pprokop
5484babcb1 Advertise TopologyManger policy and scope as Attributes
Signed-off-by: pprokop <pprokop@nvidia.com>
2023-02-10 12:03:11 +01:00
Markus Lehtonen
0283f68702 topology-updater: move code
Move and rename the Go package. It has nothing to do with NFD gRPC
client anymore so move it out of the nfd-client package.
2022-12-23 11:37:46 +02:00
Markus Lehtonen
aa97105854 Add common utility function for getting node name 2022-12-23 09:50:15 +02:00
Markus Lehtonen
f13ed2d91c nfd-topology-updater: update NodeResourceTopology objects directly
Drop the gRPC communication to nfd-master and connect to the Kubernetes
API server directly when updating NodeResourceTopology objects.
Topology-updater already has connection to the API server for listing
Pods so this is not that dramatic change. It also simplifies the code
a lot as there is no need for the NFD gRPC client and no need for
managing TLS certs/keys.

This change aligns nfd-topology-updater with the future direction of
nfd-worker where the gRPC API is being dropped and replaced by a
CRD-based API.

This patch also update deployment files and documentation to reflect
this change.
2022-12-08 11:03:22 +02:00
Talor Itzhak
5b0788ced4 topology-updater: introduce exclude-list
The exclude-list allows to filter specific resource accounting
from NRT's objects per node basis.

The CRs created by the topology-updater are used by the scheduler-plugin
as a source of truth for making scheduling decisions.
As such, this feature allows to hide specific information
from the scheduler, which in turn
will affect the scheduling decision.
A common use case is when user would like to perform scheduling
decisions which are based on a specific resource.
In that case, we can exclude all the other resources
which we don't want the scheduler to exemine.

The exclude-list is provided to the topology-updater via a ConfigMap.
Resource type's names specified in the list should match the names
as shown here: https://pkg.go.dev/k8s.io/api/core/v1#ResourceName

This is a resurrection of an old work started here:
https://github.com/kubernetes-sigs/node-feature-discovery/pull/545

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2022-11-21 14:08:25 +02:00
Garrybest
3ec1b94020 get kubelet config from configz
Signed-off-by: Garrybest <garrybest@foxmail.com>
2022-11-08 23:52:35 +08:00
Markus Lehtonen
a00cdc2b61 pkg/utils: move hostpath helpers from source to utils
Refactor the code, moving the hostpath helper functionality to new
"pkg/utils/hostpath" package. This breaks odd-ish dependency
"pkg/utils" -> "source".
2022-10-06 14:28:24 +03:00
Markus Lehtonen
a57a25f63c Use single-dash format of cmdline flags
Use the single-dash (i.e. '-option' instead of '--option') format
consistently accross log messages and documentation. This is the format
that was mostly used, already, and shown by command line help of the
binaries, for example.
2021-11-25 18:03:54 +02:00
NHM Tanveer Hossain Khan
856dfdd8b4 Remove fatal logging to error based on the feedback 2021-11-19 16:57:21 -05:00
Talor Itzhak
674720e922 topology-updater:fix klog initialization
We should use the same flag set for both program and klog arguments.
Otherwise we won't be able to provide klog flags properly

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2021-10-11 21:36:54 +03:00
Swati Sehgal
aa7ae9265c topologyupdater: watch/consider only guaranteed pods for accounting
- Files obtained after running make mock
- Run `go get github.com/vektra/mockery` and make sure that
  mockery is in your $PATH
- run `make mock`

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-09-21 10:48:10 +01:00
Francesco Romani
b4c92e4eed topologyupdater: Bootstrap nfd-topology-updater in NFD
- This patch allows to expose Resource Hardware Topology information
  through CRDs in Node Feature Discovery.
- In order to do this we introduce another software component called
  nfd-topology-updater in addition to the already existing software
  components nfd-master and nfd-worker.
- nfd-master was enhanced to communicate with nfd-topology-updater
  over gRPC followed by creation of CRs corresponding to the nodes
  in the cluster exposing resource hardware topology information
  of that node.
- Pin kubernetes dependency to one that include pod resource implementation
- This code is responsible for obtaining hardware information from the system
  as well as pod resource information from the Pod Resource API in order to
  determine the allocatable resource information for each NUMA zone. This
  information along with Costs for NUMA zones (obtained by reading NUMA distances)
  is gathered by nfd-topology-updater running on all the nodes
  of the cluster and propagate NUMA zone costs to master in order to populate
  that information in the CRs corresponding to the nodes.
- We use GHW facilities for obtaining system information like CPUs, topology,
  NUMA distances etc.
- This also includes updates made to Makefile and Dockerfile and Manifests for
  deploying nfd-topology-updater.
- This patch includes unit tests
- As part of the Topology Aware Scheduling work, this patch captures
  the configured Topology manager scope in addition to the Topology manager policy.
  Based on the value of both attribues a single string will be populated to the CRD.
  The string value will be on of the following {SingleNUMANodeContainerLevel,
  SingleNUMANodePodLevel, BestEffort, Restricted, None}

Co-Authored-by: Artyom Lukianov <alukiano@redhat.com>
Co-Authored-by: Francesco Romani <fromani@redhat.com>
Co-Authored-by: Talor Itzhak <titzhak@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-09-21 10:47:39 +01:00