mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00
Commit graph

136 commits

Author SHA1 Message Date
Markus Lehtonen
a9849f20ff nfd-master: fix retry of node updates
This patch addresses issues with slow node status (extended resources)
updates. Previously we did just a few retries in quick succession which
could result in the node update failing, just because node status was
updated slower than our retry window. The patch mitigates the issue by
increasing the number of tries to 15. In addition, it creates a
ratelimiter with a longer per-item (per-node) base delay.

The patch also fixes the e2e-tests to expose the issue.
2023-10-20 17:24:01 +03:00
Kubernetes Prow Robot
b6231b60fc
Merge pull request #1418 from ArangoGutierrez/test-utils-deplo
Fix pkg name for test/utils/deployment
2023-10-20 13:44:32 +02:00
Markus Lehtonen
d7a91b818e test/e2e: fix source/custom nodename test
We dropped the legacy rule format so we need to convert the e2e test
rules to the new format, accordingly.
2023-10-20 12:12:45 +03:00
Carlos Eduardo Arango Gutierrez
251f0d8a7e
Fix pkg name for test/utils/deployment
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-10-16 16:11:20 +02:00
Markus Lehtonen
1d8a83b045 nfd-master: stop creating NFD version annotations
We now have metrics for getting detailed information about the NFD
instances running. There should be no need to pollute the node object
with NFD version annotations.

One problem with the annotations was also that they were incomplete in the
sense that they only covered nfd-master and nfd-worker but not
nfd-topology-updater or nfd-gc.

Also, there was a problem with stale annotations giving misleading
information. E.g. there was no way to remove old/stale master.version
annotations if nfd-master was scheduled on a different node than the one
where it was previously running.
2023-10-05 14:53:29 +03:00
Markus Lehtonen
b09ce75c8e nfd-master: fix filtering of extended resources
Fix a bug in checking the allowed ".feature.node.kubernetes.io" ns
suffix for extended resources. Also update e2e-tests to cover this case.
2023-09-27 10:55:11 +03:00
Carlos Eduardo Arango Gutierrez
30b8751515
nfd_gc_test.go: fix multiple import of same pkg
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-09-06 09:47:15 +02:00
Kubernetes Prow Robot
50dd128b23
Merge pull request #1329 from ArangoGutierrez/1187
Enable NodeFeature API by default
2023-09-05 11:56:51 -07:00
Carlos Eduardo Arango Gutierrez
04e954a7c3
Enable NodeFeature API by default
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
2023-09-05 20:21:31 +02:00
Markus Lehtonen
f8162a0106 e2e/test: make the nfd-gc test pass on one-node cluster
Also remove some leftover debug print.
2023-09-05 14:16:50 +03:00
Francesco Romani
000c919071 nfd-updater: events: enable timer-only flow
The nfd-topology-updater has the state-directories notification mechanism
enabled by default.
In theory, we can have only timer-based updates, but if the option
is given to disable the state-directories event source, then the whole
update mechanism is mistakenly disabled, including the
timer-based updates.

The two update mechanisms should be decoupled.
So this PR makes sure we can enable only
the timer-based updates.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-09-04 13:05:50 +02:00
Markus Lehtonen
f9fadd2102 test/e2e: add e2e test for nfd-gc 2023-08-22 21:24:26 +03:00
Markus Lehtonen
2e79a015f5 test/e2e: align with latest kubernetes code base 2023-08-16 12:43:52 +03:00
guoguangwu
29118f67bb fix: Drop the e2elog instead
Signed-off-by: guoguangwu <guoguangwu@magic-shield.com>
2023-06-25 09:44:08 +08:00
guoguangwu
92482e45d8 node_feature_discovery_test.go rm pkg imported twice
Signed-off-by: guoguangwu <guoguangwu@magic-shield.com>
2023-06-21 16:55:25 +08:00
AhmedGrati
08b9c3486e feat: support dynamic values for labels in the NodeFeatureRule
This PR adds support for dynamic label values in the NodeFeatureRule
CRD, offering more flexible labeling for users. To achieve this, we
check whether the label value starts with "@" and, if it does, we look
up the referenced feature and use its value as the value of the label.

Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-05-31 23:30:26 +01:00
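
The "@" lookup described in the commit above can be sketched as follows. This is a simplified illustration, not the NFD implementation: the flat `features` map, the function name `resolveLabelValue` and the feature key used in the example are all assumptions made for the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveLabelValue returns value unchanged unless it starts with "@". In
// that case the remainder is treated as a key into features (a hypothetical
// flat map of feature name to feature value). The boolean is false when the
// referenced feature does not exist.
func resolveLabelValue(value string, features map[string]string) (string, bool) {
	if !strings.HasPrefix(value, "@") {
		return value, true
	}
	v, found := features[strings.TrimPrefix(value, "@")]
	return v, found
}

func main() {
	// Hypothetical discovered features.
	features := map[string]string{"kernel.version.major": "6"}

	fmt.Println(resolveLabelValue("static", features))                // prints "static true"
	fmt.Println(resolveLabelValue("@kernel.version.major", features)) // prints "6 true"
	fmt.Println(resolveLabelValue("@no.such.feature", features))      // prints " false"
}
```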
Muyassarov, Feruzjon
cfb8530083 e2e: delete CRs only if found
Delete NodeFeatureRule and NodeFeature CRs only if found.
Signed-off-by: Muyassarov, Feruzjon <feruzjon.muyassarov@intel.com>
2023-05-08 13:46:29 +03:00
Markus Lehtonen
2d9db2ccec test/e2e: rework taints matching
Add new MatchTaints matcher replacing the old waitForNfdNodeTaints
helper function. Also, drop the now-unused simplePoll() helper function.
2023-05-03 08:44:03 +03:00
Markus Lehtonen
f93ab9d423 test/e2e: rework node capacity matching
Add new MatchCapacity matcher replacing the old waitForCapacity helper
function.
2023-05-03 08:44:03 +03:00
Markus Lehtonen
a85e396200 test/e2e: rework annotations matcher
Add new MatchAnnotations Gomega matcher and drop the old
waitForNfdNodeAnnotations helper function.
2023-05-03 08:44:03 +03:00
Markus Lehtonen
2330896620 test/e2e: refactor matching of node properties
Implement a new generic type nodeListPropertyMatcher, a generic Gomega
matcher for matching basically any property of a set of node objects. We
will be using it for verifying labels, annotations, extended resources
and taints for now. This moves the tests in a more Gomega'ish direction,
leveraging code re-use and providing way more informative error messages
in case of test failures.

The patch adds a new eventuallyNonControlPlaneNodes helper assertion for
asserting all (non-control-plane) nodes in the cluster, intended to
replace the ugly simplePoll() helper function.

This patch implements a matcher for node labels and converts tests to
use it instead of the old checkForNodeLabels helper function.
2023-05-03 08:44:03 +03:00
AhmedGrati
87c2d7e184 nfd-master: fix resync period config option
This PR fixes the resync-period configuration option of the nfd-master.
Previously, changes to the option were not reflected in the nfd-master at
runtime. e2e tests are also added to verify that the fix works as
expected.

Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-05-02 13:17:01 +02:00
Markus Lehtonen
87371e2df0 test/e2e: adapt tests to updates in k8s e2e-framework
Add context to functions that now require it. Also, replace the
deprecated wait.Poll* calls with wait.PollUntilContextTimeout.
2023-04-18 23:04:34 +03:00
Markus Lehtonen
ad8bd057b7 test/e2e: use proper context
Eliminate all context.TODO() from the e2e tests and use ginkgo context
instead. This ensures that calls involving context are properly
cancelled and return fast in case the tests get aborted.
2023-04-18 14:55:09 +03:00
Fabiano Fidêncio
250aea4741
Create extended resources with NodeFeatureRule
Add support for management of Extended Resources via the
NodeFeatureRule CRD API.

There are usage scenarios where users want to advertise features
as extended resources instead of labels (or annotations).

This patch enables the discovery of extended resources through the
NodeFeatureRule API, via annotation and patch of node.status.capacity
and node.status.allocatable.

Co-authored-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-04-07 16:14:56 +02:00
Markus Lehtonen
cc6c20ff5f nfd-master: disallow unprefixed and kubernetes taints
Disallow taints having a key with "kubernetes.io/" or "*.kubernetes.io/"
prefix. This is a precaution to protect the user from messing with
the "official" well-known taints from Kubernetes itself. The only
exception is that the "nfd.node.kubernetes.io/" prefix is allowed.

However, there is one allowed NFD-specific namespace (and its
sub-namespaces) i.e. "feature.node.kubernetes.io" under the
kubernetes.io domain that can be used for NFD-managed taints.

Also disallow unprefixed taint keys. We don't add a default prefix to
unprefixed taints (like we do for labels) from NodeFeatureRules. This is
to prevent unpleasant surprises to users that need to manage matching
tolerations for their workloads.
2023-04-06 16:12:37 +03:00
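
The taint key rules listed in the commit above can be sketched as a small validation function. This is one reading of the commit message, not the nfd-master code: the function name `taintKeyAllowed` is hypothetical, and treating both "nfd.node.kubernetes.io" and "feature.node.kubernetes.io" (with its sub-namespaces) as allowed is an assumption drawn from the two paragraphs of the message.

```go
package main

import (
	"fmt"
	"strings"
)

// taintKeyAllowed reports whether a taint key passes the rules described in
// the commit message: keys must be prefixed, and the kubernetes.io domain is
// reserved except for the NFD-specific namespaces.
func taintKeyAllowed(key string) bool {
	ns, _, prefixed := strings.Cut(key, "/")
	if !prefixed || ns == "" {
		return false // unprefixed taint keys are rejected
	}
	// NFD-specific namespaces under kubernetes.io (assumed from the message).
	if ns == "nfd.node.kubernetes.io" ||
		ns == "feature.node.kubernetes.io" ||
		strings.HasSuffix(ns, ".feature.node.kubernetes.io") {
		return true
	}
	// Everything else under kubernetes.io is reserved.
	if ns == "kubernetes.io" || strings.HasSuffix(ns, ".kubernetes.io") {
		return false
	}
	return true
}

func main() {
	for _, key := range []string{
		"example.com/special-hw",
		"special-hw",
		"kubernetes.io/arch",
		"node.kubernetes.io/unreachable",
		"feature.node.kubernetes.io/special-hw",
	} {
		fmt.Println(key, taintKeyAllowed(key))
	}
}
```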
Markus Lehtonen
2e85b8a914 test/e2e: refactor nfd pod configuration
Make the default master pod run with no special options. Move the
customizations of the master pod to the setup functions of the tests
that actually need it.

Also, clean up the configuration of nfd-worker of some tests.
2023-04-05 21:51:25 +03:00
Kubernetes Prow Robot
60f052f086
Merge pull request #1116 from marquiz/devel/e2e-crd-deletion
test/e2e: wait for CRD deletion to complete
2023-04-05 07:31:40 -07:00
Markus Lehtonen
5793207cf2 test/e2e: wait for CRD deletion to complete
Wait for the deletion of NFD CRDs to complete before trying to re-create
them. Prevents errors in case CRDs already exist on the cluster when
e2e-tests are launched.
2023-04-05 15:56:26 +03:00
Markus Lehtonen
68c3bf317b test/e2e: fix node cleanup function
The node cleanup function was not removing all NFD-labels. It omitted
NFD-originated labels that used a non-default label namespace. This
patch fixes the issue by getting all NFD-managed labels from the special
annotation (nfd.node.kubernetes.io/feature-labels).

The patch also adds the ability to clean up extended resources in a
similar way. This will be needed by future work.

Also changes the order of cleaning up CRs and the node. It is the right
order as cleaning up the CRs may still update the node.
2023-04-05 15:09:25 +03:00
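
The annotation-driven cleanup described in the commit above can be sketched on plain maps. The annotation name comes from the commit message; everything else is an assumption of this sketch: that the annotation value is a comma-separated list of label names, that names without a "/" belong to the default "feature.node.kubernetes.io/" namespace, and the helper name `removeManagedLabels`.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	labelsAnnotation = "nfd.node.kubernetes.io/feature-labels"
	// Assumed default namespace for label names listed without a prefix.
	defaultLabelNs = "feature.node.kubernetes.io/"
)

// removeManagedLabels deletes from labels every label named in the
// feature-labels annotation, regardless of its label namespace.
func removeManagedLabels(labels, annotations map[string]string) {
	for _, name := range strings.Split(annotations[labelsAnnotation], ",") {
		if name == "" {
			continue
		}
		if !strings.Contains(name, "/") {
			name = defaultLabelNs + name
		}
		delete(labels, name)
	}
}

func main() {
	// Hypothetical node metadata.
	labels := map[string]string{
		"feature.node.kubernetes.io/cpu-foo": "true",
		"vendor.example/bar":                 "1",
		"kubernetes.io/hostname":             "node-1",
	}
	annotations := map[string]string{
		labelsAnnotation: "cpu-foo,vendor.example/bar",
	}

	removeManagedLabels(labels, annotations)
	fmt.Println(labels) // prints "map[kubernetes.io/hostname:node-1]"
}
```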
Kubernetes Prow Robot
193c552b33
Merge pull request #1084 from AhmedGrati/feat-add-master-config-file
feat: add master config file
2023-04-04 10:41:40 -07:00
AhmedGrati
3fff409f6d Add master config file
Similar to the nfd-worker, in this PR we want to support the
dynamic run-time configurability through a config file for the nfd-master.

We'll use a json or yaml configuration file along with the fsnotify in
order to watch for changes in the config file. As a result, we're
allowing dynamic control of logging params, allowed namespaces,
extended resources, label whitelisting, and denied namespaces.

Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-04-03 09:52:09 +01:00
Talor Itzhak
6de13fe456 e2e: reactive updates test
Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2023-03-12 12:43:17 +02:00
AhmedGrati
16abfd7b0e test: implement e2e test of the deny-label-ns flag
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-03-10 11:11:36 +01:00
Kubernetes Prow Robot
2b865759fd
Merge pull request #1073 from marquiz/devel/e2e-worker-wait
test/e2e: reduce worker wait-for-ready period to 2s
2023-03-07 04:18:18 -08:00
Markus Lehtonen
66f6ea76dd test/e2e: cleanup NodeFeature objects before/after tests
Make sure that stale NodeFeature objects from the previous test case do
not interfere with the next one.
2023-03-07 13:24:01 +02:00
Markus Lehtonen
67bb6c2d5f test/e2e: reduce worker wait-for-ready period to 2s
Reduce the wait time of nfd-worker pods to be in ready-state (before
proceeding with tests) from five to two seconds. Make tests faster to
run. Two seconds should be enough for nfd-workers to do their job and
get nodes labeled.
2023-03-07 11:35:42 +02:00
Kubernetes Prow Robot
163a6dc502
Merge pull request #1049 from jlojosnegros/node-signature
topology-updater:compute pod set fingerprint
2023-02-22 02:05:58 -08:00
Jose Luis Ojosnegros Manchón
b65015027f topology-updater: e2e test for podFingerprint 2023-02-22 10:22:50 +01:00
Markus Lehtonen
adf79d5e38 test/e2e: rename ginkgo focus for tests
Make it easier to only run tests for nfd master/worker and skip
topology-updater tests.
2023-02-21 13:37:25 +02:00
Markus Lehtonen
cc57fa6a93 test/e2e: drop deprecated rand.Seed()
Just drop it. The bump to golang v1.20 will cause the generator to be
automatically seeded at program startup:

https://pkg.go.dev/math/rand@go1.20#Seed
2023-02-16 19:22:35 +02:00
pprokop
b51e34d84c Modify e2e tests to check if Topology Manager policy and scope are advertised as Attributes
Signed-off-by: pprokop <pprokop@nvidia.com>
2023-02-10 12:03:16 +01:00
Jose Luis Ojosnegros Manchón
2967f3307a nrt-api: move from v1alpha1 to v1alpha2 2023-02-09 12:29:54 +01:00
Talor Itzhak
97ca4deabc e2e: init docker image
The docker image that is used during the e2e test
is composed of the repo and tag flags that are
passed to the test itself.

The problem is that the docker image is initialized
before the flags are parsed. Hence, it will always contain
the default flag values.
Moving the variable into a separate function fixes the issue.

Also, move the global variables to `e2e_test.go` since
they are commonly used by all tests.

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2023-01-11 16:44:40 +02:00
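
The initialization-order problem described in the commit above is easy to reproduce in isolation. In this sketch the flag set, the flag names "repo" and "tag", the default values and the helper `image` are all hypothetical; only the pattern (a package-level variable computed from flags versus a function evaluated after parsing) mirrors the commit message.

```go
package main

import (
	"flag"
	"fmt"
)

var (
	// A private flag set keeps the sketch independent of the global flags.
	fs   = flag.NewFlagSet("e2e", flag.ContinueOnError)
	repo = fs.String("repo", "example.invalid/nfd", "image repository (hypothetical)")
	tag  = fs.String("tag", "devel", "image tag (hypothetical)")
)

// Buggy pattern: package-level initializers run before any flag parsing,
// so this variable always holds the default flag values.
var staleImage = *repo + ":" + *tag

// Fixed pattern: compute the image name on demand, after parsing.
func image() string {
	return *repo + ":" + *tag
}

func main() {
	if err := fs.Parse([]string{"-repo", "registry.example/nfd", "-tag", "v1"}); err != nil {
		panic(err)
	}
	fmt.Println(staleImage) // prints "example.invalid/nfd:devel"
	fmt.Println(image())    // prints "registry.example/nfd:v1"
}
```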
Talor Itzhak
d8981f892e e2e: append _test suffix to test files
This PR is a result of conversation started here:
https://github.com/kubernetes-sigs/node-feature-discovery/pull/1028#issuecomment-1378634404

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2023-01-11 14:15:45 +02:00
Markus Lehtonen
099f52ca36 test/e2e: more comprehensive test for NodeFeature objects
Test creation of multiple NodeFeature objects per node, mocking 3rd
party extensions.
2023-01-03 17:50:48 +02:00
Markus Lehtonen
59a2757115 Use single-dash format for nfd cmdline flags
Use the "single-dash" version of nfd command line flags in deployment
files and e2e-tests. No impact in functionality, just aligns with
documentation and other parts of the codebase.
2022-12-21 15:00:49 +02:00
Markus Lehtonen
f5ae3fe2c7 Simplify usage of ObjectMeta fields
No need to explicitly spell out ObjectMeta as it's embedded in the
object types.
2022-12-19 17:40:10 +02:00
Markus Lehtonen
b67d6d7282 test/e2e: add basic e2e-tests for NodeFeature API
Add an initial test set for the NodeFeature API. This is done simply by
running a second pass of the tests but with -enable-nodefeature-api
(i.e. NodeFeature API enabled and gRPC disabled). This should give basic
confidence that the API actually works and form a basis for further
improvements on testing the new CRD API.
2022-12-19 16:58:21 +02:00
Markus Lehtonen
958db56680 test/e2e: isolate tests into a separate function
Preparation for running the same tests with NodeFeature API enabled
(instead of gRPC).
2022-12-19 14:08:05 +02:00