Prior to this feature, NFD consisted of only two software components,
namely nfd-master and nfd-worker. We have introduced another software
component called nfd-topology-updater.
NFD-Topology-Updater is a daemon responsible for examining allocated resources
on a worker node to account for allocatable resources on a per-zone basis (where
a zone can be a NUMA node). It then communicates the information to nfd-master
which does the CRD creation corresponding to all the nodes in the cluster. One
instance of nfd-topology-updater is supposed to be running on each node of the
cluster.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
* Simplify NFD worker service configuration in Helm
Signed-off-by: Elias Koromilas <elias.koromilas@gmail.com>
* Update docs/get-started/deployment-and-usage.md
Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
Drop --sleep-interval from the template. We really don't want to set it
there. First, it's the default value so there is no use repeating it in
the template. More importantly, the command-line flag overrides anything
provided in the worker config file, making it impossible for users to
specify the sleep interval (other than by editing the template
directly).
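The precedence problem described above can be sketched with the standard
flag package. This is a minimal illustration, not the nfd-worker code:
the function name effectiveSleepInterval and the 60s default are
assumptions made for the example.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// effectiveSleepInterval is a hypothetical helper: a value given on the
// command line always wins, and the config file value is used only when
// the flag was not set explicitly.
func effectiveSleepInterval(args []string, configValue time.Duration) (time.Duration, error) {
	fs := flag.NewFlagSet("worker", flag.ContinueOnError)
	// The 60s default is an assumption for this sketch.
	sleep := fs.Duration("sleep-interval", 60*time.Second, "time between feature discovery passes")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	explicitlySet := false
	// Visit only walks flags that were actually set on the command line.
	fs.Visit(func(f *flag.Flag) {
		if f.Name == "sleep-interval" {
			explicitlySet = true
		}
	})
	if explicitlySet {
		return *sleep, nil
	}
	return configValue, nil
}

func main() {
	cfg := 30 * time.Second
	// A template that hard-codes the flag hides the config file value.
	d, _ := effectiveSleepInterval([]string{"--sleep-interval=60s"}, cfg)
	fmt.Println(d) // 1m0s
	// Without the flag, the config file value takes effect.
	d, _ = effectiveSleepInterval(nil, cfg)
	fmt.Println(d) // 30s
}
```

With the flag hard-coded in the template, the first call is the only
path users ever hit, which is why it is dropped.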
We should use the same flag set for both program and klog arguments.
Otherwise we won't be able to provide klog flags properly.
Signed-off-by: Talor Itzhak <titzhak@redhat.com>
The base should really have the very bare minimum. Remove all redundant
(at default-value) args and move the others to the specific
topologyupdater kustomize component. This also makes these settings
re-usable in user-specific overlays (that are not based on
topologyupdater-daemonset).
Make table of contents in the pages cleaner and more readable by
dropping the main heading (H1 level) from TOCs. This was the original
intention with the usage of "no_toc" kramdown magic, which was broken,
however. The kramdown class magic needs to be specified on the line
immediately following the headings, otherwise it has no effect. We need
to disable MD022 rule of mdlint as it does not understand this magic.
Align "topologyupdater" overlay with "topologyupdater-job". Both should
deploy topologyupdater as a standalone application. Previously the
topologyupdater overlay did not deploy nfd-master at all (but deployed
nfd-worker instead), causing the pods to end up in CrashLoopBackOff as
there was no master to communicate with.
Use 'go generate' for auto-generating code. Drop the old 'mock' and
'apigen' makefile targets. Those are replaced with a single
make generate
which (re-)generates everything.
- create an overlay for deployment of all components
- create an overlay for just topologyupdater deployment (to be deployed in
conjunction with the default overlay)
- create a separate overlay for deployment of master and topologyupdater-job
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
There have been recent changes to the noderesourcetopology API: it now
stores the proto file generated using the go-to-protobuf tool. This code
imports that generated proto in topology-updater.proto.
The PRs corresponding to the changes are as follows:
https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/9
https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/13
Commands used to generate topology-updater.pb.go file:
go install github.com/golang/protobuf/protoc-gen-go@v1.4.3
go mod vendor
protoc --go_opt=paths=source_relative --go_out=plugins=grpc:. pkg/topologyupdater/topology-updater.proto -I. -Ivendor
As part of the implementation of this patch, reserved (non-allocatable)
CPUs are evaluated by taking the difference between all the CPUs on the
system (determined by using ghw) and the allocatable CPUs (determined by
querying the GetAllocatableResources podResource API endpoint).
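The difference described above can be sketched as a plain set
subtraction. This is an illustration only: reservedCPUs is a
hypothetical helper, and the integer slices stand in for the CPU IDs
that would come from ghw and from GetAllocatableResources.

```go
package main

import (
	"fmt"
	"sort"
)

// reservedCPUs returns the CPU IDs that are present in all but absent
// from allocatable, sorted in ascending order. Both arguments are
// hypothetical stand-ins for the real ghw and podresources data.
func reservedCPUs(all, allocatable []int) []int {
	allocSet := make(map[int]struct{}, len(allocatable))
	for _, id := range allocatable {
		allocSet[id] = struct{}{}
	}
	reserved := []int{}
	for _, id := range all {
		if _, ok := allocSet[id]; !ok {
			reserved = append(reserved, id)
		}
	}
	sort.Ints(reserved)
	return reserved
}

func main() {
	all := []int{0, 1, 2, 3, 4, 5, 6, 7}
	allocatable := []int{2, 3, 4, 5, 6, 7}
	fmt.Println(reservedCPUs(all, allocatable)) // [0 1]
}
```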
When the aggregator creates the NUMA zones, it skips zone creation if
there are no allocatable resources. In this update we create those missing
zones with zero allocatable/available resources so we won't have holes in
the array of reported zones.
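A minimal sketch of the gap filling, under stated assumptions: the zone
type, the fillMissingZones helper and the "node-N" naming are all
hypothetical and only illustrate the idea of reporting every NUMA node.

```go
package main

import "fmt"

// zone is a hypothetical, simplified view of a reported NUMA zone.
type zone struct {
	Name        string
	Allocatable int64
	Available   int64
}

// fillMissingZones returns one entry per NUMA node ID in [0, numNodes).
// Zones present in found are kept as they are; absent ones are added
// with zero allocatable/available resources so the result has no holes.
func fillMissingZones(found map[int]zone, numNodes int) []zone {
	zones := make([]zone, 0, numNodes)
	for id := 0; id < numNodes; id++ {
		z, ok := found[id]
		if !ok {
			// The "node-N" name is an assumption of this sketch.
			z = zone{Name: fmt.Sprintf("node-%d", id)}
		}
		zones = append(zones, z)
	}
	return zones
}

func main() {
	// Only NUMA node 1 has allocatable resources in this example.
	found := map[int]zone{1: {Name: "node-1", Allocatable: 4, Available: 2}}
	for _, z := range fillMissingZones(found, 2) {
		fmt.Printf("%s allocatable=%d available=%d\n", z.Name, z.Allocatable, z.Available)
	}
}
```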
Co-Authored-by: Talor Itzhak <titzhak@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
For accounting we should consider all guaranteed pods with
integral CPU requests and all the pods with device requests.
This patch ensures that only such pods are considered for
accounting, disregarding non-guaranteed pods without any
device request.
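The selection rule can be sketched as a small predicate. The pod and
container types below are hypothetical simplifications, and checking
the CPU request per container (rather than per pod) is an assumption of
this sketch, not something the patch description states.

```go
package main

import "fmt"

// container and pod are hypothetical, simplified stand-ins for the
// real Kubernetes objects.
type container struct {
	CPUMilli int64 // CPU request in millicores
	Devices  int   // number of requested devices
}

type pod struct {
	Guaranteed bool // true when the pod has Guaranteed QoS
	Containers []container
}

// shouldAccount reports whether a pod is considered for accounting:
// either it requests at least one device, or it is guaranteed and has
// an integral (whole-core) CPU request.
func shouldAccount(p pod) bool {
	for _, c := range p.Containers {
		if c.Devices > 0 {
			return true
		}
		if p.Guaranteed && c.CPUMilli > 0 && c.CPUMilli%1000 == 0 {
			return true
		}
	}
	return false
}

func main() {
	guaranteedIntegral := pod{Guaranteed: true, Containers: []container{{CPUMilli: 2000}}}
	burstableNoDevice := pod{Guaranteed: false, Containers: []container{{CPUMilli: 2000}}}
	burstableWithDevice := pod{Guaranteed: false, Containers: []container{{CPUMilli: 500, Devices: 1}}}
	fmt.Println(shouldAccount(guaranteedIntegral))  // true
	fmt.Println(shouldAccount(burstableNoDevice))   // false
	fmt.Println(shouldAccount(burstableWithDevice)) // true
}
```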
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
- Files obtained after running make mock
- Run `go get github.com/vektra/mockery` and make sure that
mockery is in your $PATH
- run `make mock`
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
- This patch allows exposing resource hardware topology information
through CRDs in Node Feature Discovery.
- In order to do this we introduce another software component called
nfd-topology-updater in addition to the already existing software
components nfd-master and nfd-worker.
- nfd-master was enhanced to communicate with nfd-topology-updater
over gRPC followed by creation of CRs corresponding to the nodes
in the cluster exposing resource hardware topology information
of that node.
- Pin kubernetes dependency to one that includes the pod resource implementation
- This code is responsible for obtaining hardware information from the system
as well as pod resource information from the Pod Resource API in order to
determine the allocatable resource information for each NUMA zone. This
information, along with costs for NUMA zones (obtained by reading NUMA
distances), is gathered by nfd-topology-updater running on all the nodes
of the cluster and propagated to master in order to populate the CRs
corresponding to the nodes.
- We use GHW facilities for obtaining system information like CPUs, topology,
NUMA distances etc.
- This also includes updates made to Makefile and Dockerfile and Manifests for
deploying nfd-topology-updater.
- This patch includes unit tests
- As part of the Topology Aware Scheduling work, this patch captures
the configured Topology manager scope in addition to the Topology manager policy.
Based on the value of both attributes a single string will be populated to the CRD.
The string value will be one of the following: {SingleNUMANodeContainerLevel,
SingleNUMANodePodLevel, BestEffort, Restricted, None}.
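How the policy and scope could be folded into a single string is
sketched below. The function name policyString is hypothetical; the
input values are the kubelet Topology Manager policy and scope names,
and falling back to None for unrecognized policies is an assumption of
this sketch.

```go
package main

import "fmt"

// policyString is a hypothetical helper that combines the Topology
// Manager policy and scope into one of the strings listed above.
// The scope only matters for the single-numa-node policy.
func policyString(policy, scope string) string {
	switch policy {
	case "single-numa-node":
		if scope == "pod" {
			return "SingleNUMANodePodLevel"
		}
		// "container" is the kubelet default scope.
		return "SingleNUMANodeContainerLevel"
	case "best-effort":
		return "BestEffort"
	case "restricted":
		return "Restricted"
	default:
		// Assumption: anything else is reported as None.
		return "None"
	}
}

func main() {
	fmt.Println(policyString("single-numa-node", "pod"))       // SingleNUMANodePodLevel
	fmt.Println(policyString("single-numa-node", "container")) // SingleNUMANodeContainerLevel
	fmt.Println(policyString("best-effort", "container"))      // BestEffort
	fmt.Println(policyString("none", "container"))             // None
}
```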
Co-Authored-by: Artyom Lukianov <alukiano@redhat.com>
Co-Authored-by: Francesco Romani <fromani@redhat.com>
Co-Authored-by: Talor Itzhak <titzhak@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Setup the topologyupdater API for gRPC communication of
nfd-topology-updater with master.
We generate pb.go file to reflect latest dependency changes
using github.com/golang/protobuf/protoc-gen-go and generate
grpc files via:
`protoc pkg/topologyupdater/topology-updater.proto --go_out=plugins=grpc:.`
Please refer to: https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/blob/master/pkg/apis/topology/v1alpha1/types.go
Co-Authored-by: Artyom Lukianov <alukiano@redhat.com>
Co-Authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>