The NodeFeatureGroup is an NFD-specific custom resource that is designed for
grouping nodes based on their features. NFD-Master watches for NodeFeatureGroup
objects in the cluster and updates the status of the NodeFeatureGroup object
with the list of nodes that match the feature group rules. The NodeFeatureGroup
rules follow the same syntax as the NodeFeatureRule rules.
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
The nfd-topology-updater has the state-directories notification mechanism
enabled by default.
In theory, we can have only timer-based updates, but if the option
to disable the state-directories event source is given, all
update mechanisms are mistakenly disabled, including the
timer-based updates.
The two update mechanisms should be decoupled,
so this PR makes it possible to enable only
the timer-based updates.
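
A minimal sketch of the decoupling idea, not the actual nfd-topology-updater
code: `updateLoop`, `runTimerOnly` and the channel names are hypothetical. It
relies on the Go rule that receiving from a nil channel blocks forever, so
each event source can be switched off independently of the other.

```go
package main

import (
	"fmt"
	"time"
)

// updateLoop is a hypothetical loop that reacts to two independent event
// sources. A nil channel never becomes ready in a select, so passing nil
// for one source disables it without affecting the other.
func updateLoop(timerC <-chan time.Time, notifyC <-chan struct{}, stop <-chan struct{}, update func(reason string)) {
	for {
		select {
		case <-timerC:
			update("timer")
		case <-notifyC:
			update("notification")
		case <-stop:
			return
		}
	}
}

// runTimerOnly drives the loop with the notification source disabled (nil)
// and returns the reasons of all updates that were triggered.
func runTimerOnly(ticks int) []string {
	timerC := make(chan time.Time)
	stop := make(chan struct{})
	done := make(chan struct{})
	var reasons []string

	go func() {
		defer close(done)
		updateLoop(timerC, nil, stop, func(r string) { reasons = append(reasons, r) })
	}()

	// Simulate timer ticks; each send completes once the loop receives it.
	for i := 0; i < ticks; i++ {
		timerC <- time.Now()
	}
	close(stop)
	<-done
	return reasons
}

func main() {
	fmt.Println(runTimerOnly(3))
}
```

In a real daemon the timer channel would typically come from a
`time.Ticker`; it is fed by hand here only to keep the sketch deterministic.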
Signed-off-by: Francesco Romani <fromani@redhat.com>
Eliminate all context.TODO() from the e2e tests and use ginkgo context
instead. This ensures that calls involving context are properly
cancelled and return fast in case the tests get aborted.
Add support for management of Extended Resources via the
NodeFeatureRule CRD API.
There are usage scenarios where users want to advertise features
as extended resources instead of labels (or annotations).
This patch enables the discovery of extended resources through the
NodeFeatureRule API, via annotation and patching of
node.status.capacity and node.status.allocatable.
Co-authored-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Wait for the deletion of NFD CRDs to complete before trying to re-create
them. Prevents errors in case CRDs already exist on the cluster when
e2e-tests are launched.
Similar to nfd-worker, this PR adds support for dynamic run-time
configurability of nfd-master through a config file.
We use a JSON or YAML configuration file along with fsnotify to
watch for changes in the config file. As a result, we
allow dynamic control of logging params, allowed namespaces,
extended resources, label whitelisting, and denied namespaces.
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
Use the "single-dash" version of nfd command line flags in deployment
files and e2e-tests. No impact on functionality, just aligns with
documentation and other parts of the codebase.
Add an initial test set for the NodeFeature API. This is done simply by
running a second pass of the tests but with -enable-nodefeature-api
(i.e. NodeFeature API enabled and gRPC disabled). This should give basic
confidence that the API actually works and form a basis for further
improvements on testing the new CRD API.
Use RuntimeDefault seccomp profile in nfd worker and topology
updater pod spec similar to nfd master.
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
After introducing NodeFeatureRule we packed two CRD definitions in one
yaml file. Our e2e-tests were not prepared for that and the file itself
was also renamed so it couldn't even be read by the test suite.
With this change the e2e-tests start to create the NodeFeature CRD in the
test cluster, preparing for the addition of e2e-tests for NodeFeature
API.
Drop the gRPC communication to nfd-master and connect to the Kubernetes
API server directly when updating NodeResourceTopology objects.
Topology-updater already has a connection to the API server for listing
Pods, so this is not that dramatic a change. It also simplifies the code
a lot as there is no need for the NFD gRPC client and no need for
managing TLS certs/keys.
This change aligns nfd-topology-updater with the future direction of
nfd-worker where the gRPC API is being dropped and replaced by a
CRD-based API.
This patch also updates deployment files and documentation to reflect
this change.
Fixes e2e test failures caused by a stricter API check on the daemonset
pod spec. RestartPolicyNever, which we previously set (by default),
isn't compatible with DaemonSets.
The new package should provide pod-related utilities,
hence let's move all the daemonset-related utilities
to their own package as well.
Signed-off-by: Talor Itzhak <titzhak@redhat.com>
By moving those utils into a separate package,
we can make the function names shorter and clearer.
For example, instead of:
```
testutils.NFDWorkerPod(opts...)
testutils.NFDMasterPod(opts...)
testutils.SpecWithContainerImage(...)
```
we'll have:
```
testpod.NFDWorker(opts...)
testpod.NFDMaster(opts...)
testpod.SpecWithContainerImage(...)
```
It will also make the package more isolated and portable.
Signed-off-by: Talor Itzhak <titzhak@redhat.com>
The master pod needs these `SecurityContext` configurations
in order to run inside a namespace with a restricted policy.
Signed-off-by: Talor Itzhak <titzhak@redhat.com>
Change the pod spec generator functions to accept parameterization in
the form of more generic "mutator functions". This makes the addition of
new test specific pod spec customizations a lot cleaner. Plus, hopefully
makes the code a bit more readable as well.
Also, slightly simplify the SpecWithConfigMap() by dropping one
redundant argument.
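
As an illustration of the "mutator function" pattern described above, here is
a minimal sketch. It does not reproduce the real test utilities: `podSpec` is
a simplified stand-in for the Kubernetes pod spec type, and `withImage`,
`withConfigMap`, `withRunAsNonRoot` and `newPodSpec` are hypothetical names
with assumed signatures.

```go
package main

import "fmt"

// podSpec is a simplified, hypothetical stand-in for a Kubernetes pod spec.
type podSpec struct {
	Image        string
	ConfigMap    string
	RunAsNonRoot bool
}

// specOption is a mutator: it customizes a pod spec in place.
type specOption func(*podSpec)

// withImage returns a mutator that sets the container image.
func withImage(image string) specOption {
	return func(s *podSpec) { s.Image = image }
}

// withConfigMap returns a mutator that attaches a config map by name.
func withConfigMap(name string) specOption {
	return func(s *podSpec) { s.ConfigMap = name }
}

// withRunAsNonRoot returns a mutator that tightens the security settings.
func withRunAsNonRoot() specOption {
	return func(s *podSpec) { s.RunAsNonRoot = true }
}

// newPodSpec builds a spec with defaults and applies the mutators in order,
// so later mutators override earlier ones.
func newPodSpec(opts ...specOption) *podSpec {
	spec := &podSpec{Image: "example.invalid/nfd:latest"}
	for _, opt := range opts {
		opt(spec)
	}
	return spec
}

func main() {
	spec := newPodSpec(withImage("example.invalid/nfd:test"), withRunAsNonRoot())
	fmt.Printf("%+v\n", *spec)
}
```

With this shape, a new test-specific customization is just one more small
function, and the generator's signature does not need to change.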
Inspired by latest contributions by Talor Itzhak (titzhak@redhat.com).
It might take time for the CRD to get deleted,
which might cause some flakiness in the tests.
Now, before we create the CRD, we make sure to delete
the old object, wait for its deletion to complete
and only then create a new CRD object.
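
The delete, wait, create sequence can be sketched as follows. This is an
assumption-laden illustration rather than the e2e code itself: `crdClient`,
`recreateCRD` and `fakeClient` are hypothetical, and a real implementation
would talk to the Kubernetes API instead of an in-memory fake.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// crdClient is a hypothetical, minimal view of the operations needed.
type crdClient interface {
	Delete(name string) error
	Exists(name string) bool
	Create(name string) error
}

// recreateCRD deletes the old object, polls until it is gone (or the
// context is cancelled) and only then creates the new object.
func recreateCRD(ctx context.Context, c crdClient, name string, interval time.Duration) error {
	if err := c.Delete(name); err != nil {
		return err
	}
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for c.Exists(name) {
		select {
		case <-ctx.Done():
			return fmt.Errorf("waiting for deletion of %q: %w", name, ctx.Err())
		case <-ticker.C:
		}
	}
	return c.Create(name)
}

// fakeClient simulates asynchronous deletion: the object stays visible for
// pollsLeft calls to Exists before it disappears.
type fakeClient struct {
	present   bool
	pollsLeft int
	created   bool
}

func (f *fakeClient) Delete(string) error { return nil }

func (f *fakeClient) Exists(string) bool {
	if f.pollsLeft > 0 {
		f.pollsLeft--
		return true
	}
	f.present = false
	return false
}

func (f *fakeClient) Create(string) error {
	if f.present {
		return errors.New("already exists")
	}
	f.present, f.created = true, true
	return nil
}

// demo runs the happy path: deletion completes after two polls.
func demo() (bool, error) {
	c := &fakeClient{present: true, pollsLeft: 2}
	err := recreateCRD(context.Background(), c, "example-crd", time.Millisecond)
	return c.created, err
}

// demoCancelled shows that an aborted context ends the wait early.
func demoCancelled() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	c := &fakeClient{present: true, pollsLeft: 1000}
	return recreateCRD(ctx, c, "example-crd", time.Hour)
}

func main() {
	fmt.Println(demo())
	fmt.Println(demoCancelled())
}
```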
Signed-off-by: Talor Itzhak <titzhak@redhat.com>
The tested pods have a somewhat lax security spec,
hence a restricted podSecurity namespace won't allow running those pods.
In topology-updater tests, the topology-updater pod
needs to run the container as root,
so change the namespace podSecurity from restricted to privileged.
In node-feature-discovery tests, we don't need root access,
so add the required security context configuration.
Signed-off-by: Talor Itzhak <titzhak@redhat.com>
Don't capitalize error strings (ST1005) and remove redundant
type names from array, slice or map composite literals.
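
For illustration, a small sketch of both cleanups. `labelRule`, `errParse`
and `rules` are hypothetical names, not taken from the NFD code base; the
"before" forms are shown only in comments.

```go
package main

import (
	"errors"
	"fmt"
)

// labelRule is a hypothetical type used only for this example.
type labelRule struct {
	Name  string
	Value string
}

// Before: errors.New("Failed to parse rule")
// After:  lower-case first word, because error strings are often wrapped
// into longer messages (staticcheck ST1005).
var errParse = errors.New("failed to parse rule")

// Before: []labelRule{labelRule{Name: "a", Value: "1"}, labelRule{...}}
// After:  the element type is implied by the slice type and can be omitted
// (the simplification that gofmt -s applies).
var rules = []labelRule{
	{Name: "a", Value: "1"},
	{Name: "b", Value: "2"},
}

func main() {
	// The lower-case error string reads naturally when wrapped.
	fmt.Println(fmt.Errorf("loading %d rules: %w", len(rules), errParse))
}
```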
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>