mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2025-03-06 08:47:04 +00:00
Commit graph

2003 commits

Author SHA1 Message Date
Kubernetes Prow Robot
e8183499d3
Merge pull request #1155 from marquiz/devel/deps
deps: Update kubernetes to v1.27.1
2023-04-18 23:02:45 -07:00
Markus Lehtonen
87371e2df0 test/e2e: adapt tests to updates in k8s e2e-framework
Add context to functions that now require it. Also, replace the
deprecated wait.Poll* calls with wait.PollUntilContextTimeout.
2023-04-18 23:04:34 +03:00
Markus Lehtonen
e2d5ba1a2b pkg/podres: update mocked PodResourcesListerClient
Update mocked implementation of
k8s.io/kubelet/pkg/apis/podresources/v1.PodResourcesListerClient. The
mocked implementation is moved to a separate "mocks" subpackage as it's
for an external interface.

This patch also adds code for auto-generation for the mocked interface.
2023-04-18 20:51:51 +03:00
Markus Lehtonen
ba4b9b3432 go.mod: update kubernetes to v1.27.1 2023-04-18 20:51:51 +03:00
Kubernetes Prow Robot
a6bed7d0cf
Merge pull request #1154 from marquiz/devel/e2e-ctx
test/e2e: use proper context
2023-04-18 10:48:58 -07:00
Kubernetes Prow Robot
82a423b223
Merge pull request #1153 from marquiz/devel/readme
README: update for release v0.13.0
2023-04-18 08:26:59 -07:00
Markus Lehtonen
b53461c09b README: update for release v0.13.0 2023-04-18 14:57:23 +03:00
Markus Lehtonen
ad8bd057b7 test/e2e: use proper context
Eliminate all context.TODO() from the e2e tests and use ginkgo context
instead. This ensures that calls involving context are properly
cancelled and return fast in case the tests get aborted.
2023-04-18 14:55:09 +03:00
Kubernetes Prow Robot
8592f3ea8d
Merge pull request #1151 from marquiz/devel/hack
hack/prepare-release.sh: fix name of one e2e test file
2023-04-17 21:58:57 -07:00
Markus Lehtonen
e5d83d031b hack/prepare-release.sh: fix name of one e2e test file 2023-04-17 23:43:49 +03:00
Kubernetes Prow Robot
b0c52fe28f
Merge pull request #1149 from ArangoGutierrez/sev_capacity
cpu: expose the total number of AMD SEV ASID and ES
2023-04-17 13:22:58 -07:00
Carlos Eduardo Arango Gutierrez
05ef5d4e9d
cpu: expose the total number of AMD SEV ASID and ES
This patch adds SEV ASIDs and the related (but distinct) SEV Encrypted State
(SEV-ES) IDs as two quantities to be exposed via extended resources.
In a kernel built with CONFIG_CGROUP_MISC on a suitably equipped AMD CPU, the
root control group will have a misc.capacity file that shows the number of
available IDs in each category.

The added extended resources are:
- sev.asids
- sev.encrypted_state_ids

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-04-17 19:34:39 +02:00
Kubernetes Prow Robot
df584e03ed
Merge pull request #1145 from marquiz/devel/grpc-probe
Dockerfile: bump grpc-health-probe to v0.4.18
2023-04-17 05:28:43 -07:00
Markus Lehtonen
ecc242d78a Dockerfile: bump grpc-health-probe to v0.4.18
A new version that was just released.
2023-04-17 14:30:08 +03:00
Kubernetes Prow Robot
ca59fc0594
Merge pull request #1140 from marquiz/devel/owners
OWNERS: add PiotrProkop as a reviewer
2023-04-17 03:22:43 -07:00
Markus Lehtonen
57e21969d0 OWNERS: add PiotrProkop as a reviewer 2023-04-17 12:58:16 +03:00
Kubernetes Prow Robot
018cd33306
Merge pull request #1095 from fmuyassarov/codecov-uploader
e2e: add codecov uploader configuration
2023-04-14 14:30:41 -07:00
Kubernetes Prow Robot
fef5e56051
Merge pull request #1129 from mythi/sgx-epc
cpu: Expose SGX EPC resource
2023-04-14 10:42:41 -07:00
Mikko Ylinen
de1b69a8bf cpu: make SGX EPC resource available to NodeFeatureRules
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-04-14 15:31:54 +03:00
Kubernetes Prow Robot
cb604b877c
Merge pull request #1130 from marquiz/devel/tdx
source/cpu: don't create cpu-security.tdx.total_keys label
2023-04-14 04:18:41 -07:00
Markus Lehtonen
3320c74472 source/cpu: don't create cpu-security.tdx.total_keys label
Just have that as a feature for NodeFeatureRules to consume.
2023-04-14 13:33:13 +03:00
Kubernetes Prow Robot
84c348b69f
Merge pull request #1126 from marquiz/devel/er-deprecation
nfd-master: deprecate the -resource-labels flag
2023-04-13 10:52:39 -07:00
Kubernetes Prow Robot
8d71ed6755
Merge pull request #1086 from AhmedGrati/feat-support-builtin-kernel-mods
feat: support builtin kernel mods
2023-04-13 10:30:40 -07:00
Kubernetes Prow Robot
47acda75c3
Merge pull request #1128 from marquiz/devel/test-timeout
Makefile: set e2e test timeout to 1 hour
2023-04-13 09:24:38 -07:00
Markus Lehtonen
3a1a8d4c6f Makefile: set e2e test timeout to 1 hour
Previously we were using the default, which even if equal to 0, still
means a 10 minute timeout in practice (because of the way we run the
tests by invoking go test directly). With the addition of the latest e2e
tests we hit the limit and got bitten by it. Set the timeout to 1 hour,
which should be enough for anyone...
2023-04-13 18:57:19 +03:00
Kubernetes Prow Robot
f9cc798057
Merge pull request #1127 from marquiz/devel/nfd-master-retry
nfd-master: re-try on node update failures
2023-04-13 07:14:39 -07:00
Markus Lehtonen
6b2d10753f nfd-master: re-try on node update failures
Change the NFD API handler to re-try on node update failures. This works
around transient failures, making sure that failed nodes (i.e. nodes
that we failed to update) don't need to wait for the 1 hour resync
period before being tried again.
2023-04-13 16:30:31 +03:00
AhmedGrati
109caa1f28 feat: support builtin kernel mods
This PR adds the combination of dynamic and builtin kernel modules into
one feature called `kernel.enabledmodule`. It's a superset of the
`kernel.loadedmodule` feature.

Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-04-13 10:19:24 +01:00
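Describing `kernel.enabledmodule` as a superset of `kernel.loadedmodule` amounts to taking the union of the loaded and builtin module names. A minimal sketch of that union follows; the function name is hypothetical and the sketch says nothing about how NFD actually collects either list.

```go
package main

import "sort"

// enabledModules returns the sorted, de-duplicated union of dynamically
// loaded and builtin kernel module names.
func enabledModules(loaded, builtin []string) []string {
	seen := make(map[string]struct{}, len(loaded)+len(builtin))
	for _, list := range [][]string{loaded, builtin} {
		for _, name := range list {
			seen[name] = struct{}{}
		}
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}
```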
Markus Lehtonen
8511980bf4 nfd-master: deprecate the -resource-labels flag
Mark the -resource-labels flag (and the corresponding resourceLabels
config option) as deprecated. We now support managing extended resources
via NodeFeatureRule objects. This kludge deserves to go, eventually.
2023-04-13 11:30:58 +03:00
Kubernetes Prow Robot
e75be0b257
Merge pull request #1123 from marquiz/devel/master-resync
nfd-master: increase controller resync period to 1 hour
2023-04-12 08:00:32 -07:00
Markus Lehtonen
70ac19ea66 nfd-master: increase controller resync period to 1 hour
Increase the NFD API controller resync period from 5 minutes to 1 hour.
The resync causes nfd-master to replay all NodeFeature and
NodeFeatureRule objects, being effectively a "big hammer reset all"
button. This should only be needed as an "insurance" to fix labels et al
in case they have been manually tampered with (outside NFD) and against
certain bugs in nfd itself. NFD is not supposed to manage anything
fast-changing so 1 hour should be enough.

This change only affects behavior when the NodeFeature API has been
enabled (with -enable-nodefeature-api).
2023-04-12 16:38:47 +03:00
Muyassarov, Feruzjon
6ed43c2926 e2e: add codecov uploader configuration
Signed-off-by: Muyassarov, Feruzjon <feruzjon.muyassarov@intel.com>
2023-04-11 23:33:47 +03:00
Kubernetes Prow Robot
fe1db97132
Merge pull request #1122 from marquiz/devel/er-docs
docs: add missing mentions of extended resources and taints
2023-04-11 10:43:08 -07:00
Markus Lehtonen
dcbb3bc450 docs: add missing mentions of extended resources and taints
A small update to fix some missing mentions of extended resources and
taints as assets managed by NFD.
2023-04-11 20:38:21 +03:00
Kubernetes Prow Robot
2e51840e1d
Merge pull request #1121 from marquiz/devel/grpc-probe
Dockerfile: bump grpc-health-probe to v0.4.17
2023-04-11 01:21:18 -07:00
Markus Lehtonen
af37efec65 Dockerfile: bump grpc-health-probe to v0.4.17
Update to the latest release.
2023-04-11 10:12:18 +03:00
Kubernetes Prow Robot
ad07829d0a
Merge pull request #1099 from ArangoGutierrez/extended_resources_v2
Create extended resources with NodeFeatureRule
2023-04-07 08:09:15 -07:00
Fabiano Fidêncio
250aea4741
Create extended resources with NodeFeatureRule
Add support for management of Extended Resources via the
NodeFeatureRule CRD API.

There are usage scenarios where users want to advertise features
as extended resources instead of labels (or annotations).

This patch enables the discovery of extended resources, via annotation
and patch of node.status.capacity and node.status.allocatable, by using
the NodeFeatureRule API.

Co-authored-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-04-07 16:14:56 +02:00
Kubernetes Prow Robot
6740224a13
Merge pull request #1100 from PiotrProkop/expose-L3-num-closid
Advertise RDT L3 num_closid
2023-04-07 00:49:14 -07:00
Kubernetes Prow Robot
f2569b5694
Merge pull request #1119 from marquiz/devel/fix-nfd-master
nfd-master: fix node update
2023-04-06 12:23:35 -07:00
Markus Lehtonen
f64c23968a nfd-master: fix node update
Update node status before node metadata. This fixes a problem where we
lose track of NFD-managed extended resources in case patching node
status fails. Previously we removed all labels and annotations
(including the one listing our ERs) and only after that updated node
status. If node status update failed we had lost the annotation but
extended resources were still there, leaving them orphaned.
2023-04-06 22:04:35 +03:00
Kubernetes Prow Robot
ec014f118b
Merge pull request #1118 from marquiz/devel/taints
nfd-master: disallow unprefixed and kubernetes taints
2023-04-06 06:59:48 -07:00
Markus Lehtonen
cc6c20ff5f nfd-master: disallow unprefixed and kubernetes taints
Disallow taints having a key with "kubernetes.io/" or "*.kubernetes.io/"
prefix. This is a precaution to protect the user from messing with
the "official" well-known taints from Kubernetes itself. The only
exception is that the "nfd.node.kubernetes.io/" prefix is allowed.

However, there is one allowed NFD-specific namespace (and its
sub-namespaces) i.e. "feature.node.kubernetes.io" under the
kubernetes.io domain that can be used for NFD-managed taints.

Also disallow unprefixed taint keys. We don't add a default prefix to
unprefixed taints (like we do for labels) from NodeFeatureRules. This is
to prevent unpleasant surprises to users that need to manage matching
tolerations for their workloads.
2023-04-06 16:12:37 +03:00
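The taint key rules described in this commit message can be sketched as a small validation function. This is an illustration, not the nfd-master code: the function name is hypothetical, and treating both "nfd.node.kubernetes.io" and "feature.node.kubernetes.io" (with its sub-namespaces) as allowed is this sketch's reading of the message.

```go
package main

import (
	"fmt"
	"strings"
)

// validateTaintKey is a hypothetical check following the commit message:
// unprefixed keys are rejected, and so are keys under kubernetes.io or
// *.kubernetes.io, except for the NFD-specific namespaces.
func validateTaintKey(key string) error {
	ns, _, found := strings.Cut(key, "/")
	if !found || ns == "" {
		return fmt.Errorf("taint key %q has no prefix", key)
	}
	// Assumption: these namespaces are the allowed exceptions.
	if ns == "nfd.node.kubernetes.io" ||
		ns == "feature.node.kubernetes.io" ||
		strings.HasSuffix(ns, ".feature.node.kubernetes.io") {
		return nil
	}
	if ns == "kubernetes.io" || strings.HasSuffix(ns, ".kubernetes.io") {
		return fmt.Errorf("prefix %q is not allowed for taint key", ns)
	}
	return nil
}
```

Checking the allowed namespaces first keeps the exception explicit, so the general kubernetes.io rule below it needs no special cases.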
Kubernetes Prow Robot
621823f556
Merge pull request #1117 from marquiz/devel/e2e-refactor
test/e2e: refactor nfd pod configuration
2023-04-06 05:33:48 -07:00
PiotrProkop
0e78eba40e Advertise RDT L3 num_closid
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2023-04-06 11:22:55 +02:00
Markus Lehtonen
2e85b8a914 test/e2e: refactor nfd pod configuration
Make the default master pod run with no special options. Move the
customizations of the master pod to the setup functions of the tests
that actually need it.

Also, clean up the configuration of nfd-worker of some tests.
2023-04-05 21:51:25 +03:00
Kubernetes Prow Robot
60f052f086
Merge pull request #1116 from marquiz/devel/e2e-crd-deletion
test/e2e: wait for CRD deletion to complete
2023-04-05 07:31:40 -07:00
Kubernetes Prow Robot
3c0c43b9be
Merge pull request #1114 from marquiz/devel/rdt-deprecate
source/cpu: deprecate cpu-rdt.* labels
2023-04-05 06:21:40 -07:00
Kubernetes Prow Robot
f5121a5bdd
Merge pull request #1115 from marquiz/devel/e2e-fix
test/e2e: fix node cleanup function
2023-04-05 06:09:41 -07:00
Markus Lehtonen
5793207cf2 test/e2e: wait for CRD deletion to complete
Wait for the deletion of NFD CRDs to complete before trying to re-create
them. Prevents errors in case CRDs already exist on the cluster when
e2e-tests are launched.
2023-04-05 15:56:26 +03:00