mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00
Commit graph

2291 commits

Author SHA1 Message Date
Markus Lehtonen
28132fb274 test/e2e: stricter validation of node annotations
Now that the hard-to-predict version annotations are gone we can do
strict validation of nfd-generated node annotations.
2023-10-20 18:04:18 +03:00
Kubernetes Prow Robot
6144f3db61
Merge pull request #1422 from marquiz/devel/deps
go.mod: update deps
2023-10-20 17:02:46 +02:00
Kubernetes Prow Robot
5a5892a3c4
Merge pull request #1425 from marquiz/devel/node-update-fix
nfd-master: fix retry of node updates
2023-10-20 16:40:52 +02:00
Markus Lehtonen
a9849f20ff nfd-master: fix retry of node updates
This patch addresses issues with slow node status (extended resources)
updates. Previously we did just a few retries in quick succession which
could result in the node update failing, just because node status was
updated slower than our retry window. The patch mitigates the issue by
increasing the number of tries to 15. In addition, it creates a
ratelimiter with a longer per-item (per-node) base delay.

The patch also fixes the e2e-tests to expose the issue.
2023-10-20 17:24:01 +03:00
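
The retry behaviour described in the commit above can be sketched roughly as follows. This is an illustration only, not the nfd-master code: the names `ItemBackoff` and `update_with_retry`, the delay values, and the doubling schedule are all assumptions. Only the figure of 15 tries and the idea of a per-node base delay come from the commit message.

```python
import time
from collections import defaultdict


class ItemBackoff:
    """Hypothetical per-item exponential backoff; each key (node) is tracked separately."""

    def __init__(self, base_delay: float, max_delay: float) -> None:
        self.base_delay = base_delay
        self.max_delay = max_delay
        self._failures = defaultdict(int)  # item -> number of failures recorded so far

    def when(self, item: str) -> float:
        """Return the delay before the next try for item and record one more failure."""
        exponent = self._failures[item]
        self._failures[item] += 1
        return min(self.base_delay * (2 ** exponent), self.max_delay)

    def forget(self, item: str) -> None:
        """Reset the backoff state for item, typically after a success."""
        self._failures.pop(item, None)


def update_with_retry(node, update, limiter, max_tries=15, sleep=time.sleep):
    """Call update(node) until it returns True or max_tries is used up.

    Returns the number of the try that succeeded. max_tries=15 mirrors the
    commit message; everything else here is an assumption.
    """
    for attempt in range(1, max_tries + 1):
        if update(node):
            limiter.forget(node)
            return attempt
        if attempt < max_tries:
            sleep(limiter.when(node))
    raise RuntimeError(f"update of {node} failed after {max_tries} tries")
```

With a longer base delay the same number of tries covers a much wider time window, which appears to be the point of the fix: a slow node status update gets time to land before the retries run out.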
Kubernetes Prow Robot
b6231b60fc
Merge pull request #1418 from ArangoGutierrez/test-utils-deplo
Fix pkg name for test/utils/deployment
2023-10-20 13:44:32 +02:00
Markus Lehtonen
29b67d024a go.mod: bump kubernetes to v1.28.3 2023-10-20 13:29:21 +03:00
Markus Lehtonen
e83a64a644 go.mod: update deps
The gist of these updates is to update the opentelemetry packages to
versions with the latest security fixes.
2023-10-20 13:24:41 +03:00
Kubernetes Prow Robot
5b3f4cec0a
Merge pull request #1421 from marquiz/devel/fix-e2e
test/e2e: fix source/custom nodename test
2023-10-20 12:16:16 +02:00
Markus Lehtonen
d7a91b818e test/e2e: fix source/custom nodename test
We dropped the legacy rule format so we need to convert the e2e test
rules to the new format, accordingly.
2023-10-20 12:12:45 +03:00
Carlos Eduardo Arango Gutierrez
251f0d8a7e
Fix pkg name for test/utils/deployment
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-10-16 16:11:20 +02:00
Kubernetes Prow Robot
b6419c7964
Merge pull request #1397 from marquiz/devel/custom-legacy-rule-format
source/custom: drop support for the legacy rule format
2023-10-13 11:56:01 +02:00
Kubernetes Prow Robot
e1d17152de
Merge pull request #1413 from marquiz/devel/grafana
examples: add example grafana dashboard
2023-10-12 19:02:24 +02:00
Kubernetes Prow Robot
d4007d619e
Merge pull request #1416 from kubernetes-sigs/dependabot/go_modules/golang.org/x/net-0.17.0
build(deps): bump golang.org/x/net from 0.13.0 to 0.17.0
2023-10-12 09:03:31 +02:00
dependabot[bot]
0a58f0e6d9
build(deps): bump golang.org/x/net from 0.13.0 to 0.17.0
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.13.0 to 0.17.0.
- [Commits](https://github.com/golang/net/compare/v0.13.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-11 23:03:39 +00:00
Kubernetes Prow Robot
6424f19692
Merge pull request #1415 from AhmedGrati/feat-add-parameters-to-diable-master-and-worker
feat: add parameters in helm to disable/enable nfd-master and nfd-worker
2023-10-11 22:31:56 +02:00
AhmedGrati
d27eb0ac6d feat: add parameters in helm to disable/enable nfd-master and nfd-worker
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-10-11 10:50:32 +01:00
Markus Lehtonen
b75de2a283 examples: add example grafana dashboard
Example visualization for all metrics except
nfd_node_update_requests_total which counts the deprecated (and
disabled-by-default) gRPC requests.
2023-10-10 17:47:51 +00:00
Kubernetes Prow Robot
a379fafcaa
Merge pull request #1411 from ArangoGutierrez/postv0142
Update Readme to V0.14.2
2023-10-10 10:45:58 +02:00
Carlos Eduardo Arango Gutierrez
16b935a1a0
Update Readme to V0.14.2
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-10-10 10:38:08 +02:00
Kubernetes Prow Robot
c927bf5ecd
Merge pull request #1407 from marquiz/devel/gc-metrics
nfd-gc: add metrics
2023-10-10 09:06:13 +02:00
Markus Lehtonen
98c3b0750d nfd-gc: add metrics
Implements three metrics for nfd-gc:

- nfd_gc_build_info: version information of nfd-gc.
- nfd_gc_objects_deleted_total: total number of NodeFeature and
  NodeResourceTopology objects deleted by nfd-gc.
- nfd_gc_object_delete_failures_total: number of errors encountered when
  deleting NodeFeature and NodeResourceTopology objects.
2023-10-09 13:39:28 +00:00
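
As a rough illustration of how the two counters listed above relate to each other, the sketch below increments one of them per delete attempt. The metric names are taken from the commit message; the function `garbage_collect`, the use of a plain `Counter` in place of a metrics library, and the absence of labels are assumptions. The nfd_gc_build_info metric is not covered.

```python
from collections import Counter

# Metric names as given in the commit message.
DELETED_TOTAL = "nfd_gc_objects_deleted_total"
DELETE_FAILURES_TOTAL = "nfd_gc_object_delete_failures_total"


def garbage_collect(objects, delete, metrics):
    """Try to delete every object and count the outcome of each attempt.

    `delete` is a hypothetical callable that raises an exception on failure.
    """
    for obj in objects:
        try:
            delete(obj)
        except Exception:
            metrics[DELETE_FAILURES_TOTAL] += 1
        else:
            metrics[DELETED_TOTAL] += 1
    return metrics
```

Both values only ever grow, so a dashboard would normally plot their rate of change rather than the raw totals.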
Kubernetes Prow Robot
44b26e39e4
Merge pull request #1400 from marquiz/devel/metrics-docs
docs: document nfd_topology_updater_build_info metric
2023-10-09 15:35:21 +02:00
Markus Lehtonen
f0a3581ca3 docs: document nfd_topology_updater_build_info metric 2023-10-09 13:06:36 +00:00
Kubernetes Prow Robot
4d30205767
Merge pull request #1406 from marquiz/docs/metrics
docs: clarify nfd_node_update_requests_total metric
2023-10-09 15:02:43 +02:00
Markus Lehtonen
24574724e2 docs: clarify nfd_node_update_requests_total metric 2023-10-09 12:45:03 +00:00
Kubernetes Prow Robot
3469e91390
Merge pull request #1402 from marquiz/devel/deps
go.mod: bump kubernetes to v1.28.2
2023-10-09 14:06:01 +02:00
Markus Lehtonen
9130b5e7cf go.mod: bump kubernetes to v1.28.2 2023-10-09 11:06:49 +00:00
Kubernetes Prow Robot
300a5f5c74
Merge pull request #1399 from marquiz/devel/gc-metrics
nfd-gc: simplify initialization
2023-10-09 11:45:46 +02:00
Markus Lehtonen
f5c6ce2843 nfd-gc: simplify initialization 2023-10-09 11:48:49 +03:00
Kubernetes Prow Robot
899939b4ed
Merge pull request #1398 from marquiz/devel/metrics
Refactor metrics
2023-10-09 10:42:41 +02:00
Markus Lehtonen
5171ae0f90 Refactor metrics
Move common boilerplate code under pkg/utils.
2023-10-09 10:49:12 +03:00
Kubernetes Prow Robot
83c7096bbe
Merge pull request #1394 from marquiz/devel/annotations
nfd-master: stop creating NFD version annotations
2023-10-05 15:40:11 +02:00
Markus Lehtonen
7d1df87305 source/custom: drop support for the legacy rule format 2023-10-05 16:15:37 +03:00
Markus Lehtonen
1d8a83b045 nfd-master: stop creating NFD version annotations
We now have metrics for getting detailed information about the NFD
instances running. There should be no need to pollute the node object
with NFD version annotations.

Another problem with the annotations was that they were incomplete in the
sense that they only covered nfd-master and nfd-worker but not
nfd-topology-updater or nfd-gc.

Also, there was a problem with stale annotations, giving misleading
information. E.g. there was no way to remove old/stale master.version
annotations if nfd-master was rescheduled to a node other than the one
where it was previously running.
2023-10-05 14:53:29 +03:00
Kubernetes Prow Robot
160c9107f5
Merge pull request #1390 from ArangoGutierrez/devel/g121
Bump to Go 1.21
2023-10-05 13:14:46 +02:00
Kubernetes Prow Robot
7c0913ed7d
Merge pull request #1393 from marquiz/devel/annotations-fix
nfd-master: correctly clean up annotations
2023-10-05 10:45:40 +02:00
Markus Lehtonen
9ea0a1b420 nfd-master: correctly clean up annotations
Delete correct annotations if -instance is specified.
2023-10-05 11:10:06 +03:00
Kubernetes Prow Robot
7f6fd05357
Merge pull request #1392 from shivamerla/fix_gc_serviceaccount
Fix serviceaccount handling for nfd-gc to be consistent with others
2023-10-05 09:44:59 +02:00
Kubernetes Prow Robot
4b3429def5
Merge pull request #1386 from AhmedGrati/feat-support-raw-features
feat: support raw features
2023-10-05 09:00:44 +02:00
Shiva Krishna, Merla
6237b821f6 Fix serviceaccount handling for nfd-gc to be consistent with others
Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>
2023-10-04 15:17:32 -07:00
AhmedGrati
3130898d58 feat: support raw features
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-10-04 22:37:42 +01:00
Carlos Eduardo Arango Gutierrez
73d624def4
Update Makefile
Co-authored-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-10-04 18:34:55 +02:00
Carlos Eduardo Arango Gutierrez
dd8d7f6725
Helm - service to be only deployed when needed (#1389)
* Helm - service to be only deployed when needed

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>

* Update deployment/helm/node-feature-discovery/templates/service.yaml

Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>

---------

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
2023-10-04 16:00:43 +02:00
Carlos Eduardo Arango Gutierrez
3bb687d3d3
Bump to Go 1.21
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-10-04 11:39:32 +02:00
Kubernetes Prow Robot
4ad6491e0b
Merge pull request #1387 from ArangoGutierrez/devel/helm/grpcdepre2
Helm - Move remaining gRPC related flags to conditional
2023-10-03 08:03:24 +02:00
Carlos Eduardo Arango Gutierrez
3543aa22ce
Helm - Move remaining gRPC related flags to conditional
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-10-03 07:32:52 +02:00
Kubernetes Prow Robot
076ed3c057
Merge pull request #1382 from marquiz/devel/api-cleanup
apis/nfd: drop one stale comment line
2023-09-28 06:04:36 -07:00
Markus Lehtonen
dbf00dcda6 apis/nfd: drop one stale comment line
Drop a leftover "docstring" comment that wasn't removed with the type it
refers to.
2023-09-27 14:23:12 +03:00
Kubernetes Prow Robot
02dd550d3b
Merge pull request #1378 from marquiz/devel/fix-er-filtering
nfd-master: fix filtering of extended resources
2023-09-27 01:27:22 -07:00
Markus Lehtonen
b09ce75c8e nfd-master: fix filtering of extended resources
Fix a bug in checking the allowed ".feature.node.kubernetes.io" ns
suffix for extended resources. Also update e2e-tests to cover this case.
2023-09-27 10:55:11 +03:00
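
The commit above does not say what exactly was wrong with the suffix check. The sketch below only shows one way a namespace suffix test of this kind can be written so that the leading dot is honoured. The function name `is_feature_ns` is hypothetical, and the rest of the extended resource filtering in nfd-master is not shown.

```python
FEATURE_NS = "feature.node.kubernetes.io"


def is_feature_ns(resource_name: str) -> bool:
    """Report whether the namespace part of "ns/name" is FEATURE_NS or a sub-namespace of it."""
    ns = resource_name.rpartition("/")[0]  # empty string if there is no "/"
    # Match on "." + FEATURE_NS so that e.g. "evilfeature.node.kubernetes.io"
    # is not mistaken for a sub-namespace.
    return ns == FEATURE_NS or ns.endswith("." + FEATURE_NS)
```

A plain `ns.endswith(FEATURE_NS)` would also accept namespaces that merely end in the same characters, which is the kind of edge case a test for this check would want to cover.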