Markus Lehtonen
28132fb274
test/e2e: stricter validation of node annotations
...
Now that the hard-to-predict version annotations are gone we can do
strict validation of nfd-generated node annotations.
2023-10-20 18:04:18 +03:00
Kubernetes Prow Robot
6144f3db61
Merge pull request #1422 from marquiz/devel/deps
...
go.mod: update deps
2023-10-20 17:02:46 +02:00
Kubernetes Prow Robot
5a5892a3c4
Merge pull request #1425 from marquiz/devel/node-update-fix
...
nfd-master: fix retry of node updates
2023-10-20 16:40:52 +02:00
Markus Lehtonen
a9849f20ff
nfd-master: fix retry of node updates
...
This patch addresses issues with slow node status (extended resources)
updates. Previously we did just a few retries in quick succession which
could result in the node update failing, just because node status was
updated slower than our retry window. The patch mitigates the issue by
increasing the number of tries to 15. In addition, it creates a
ratelimiter with a longer per-item (per-node) base delay.
The patch also fixes the e2e-tests to expose the issue.
2023-10-20 17:24:01 +03:00
Kubernetes Prow Robot
b6231b60fc
Merge pull request #1418 from ArangoGutierrez/test-utils-deplo
...
Fix pkg name for test/utils/deployment
2023-10-20 13:44:32 +02:00
Markus Lehtonen
29b67d024a
go.mod: bump kubernetes to v1.28.3
2023-10-20 13:29:21 +03:00
Markus Lehtonen
e83a64a644
go.mod: update deps
...
The gist of these updates is to update the opentelemetry packages to
versions with the latest security fixes.
2023-10-20 13:24:41 +03:00
Kubernetes Prow Robot
5b3f4cec0a
Merge pull request #1421 from marquiz/devel/fix-e2e
...
test/e2e: fix source/custom nodename test
2023-10-20 12:16:16 +02:00
Markus Lehtonen
d7a91b818e
test/e2e: fix source/custom nodename test
...
We dropped the legacy rule format, so the e2e test rules need to be
converted to the new format accordingly.
2023-10-20 12:12:45 +03:00
Carlos Eduardo Arango Gutierrez
251f0d8a7e
Fix pkg name for test/utils/deployment
...
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-10-16 16:11:20 +02:00
Kubernetes Prow Robot
b6419c7964
Merge pull request #1397 from marquiz/devel/custom-legacy-rule-format
...
source/custom: drop support for the legacy rule format
2023-10-13 11:56:01 +02:00
Kubernetes Prow Robot
e1d17152de
Merge pull request #1413 from marquiz/devel/grafana
...
examples: add example grafana dashboard
2023-10-12 19:02:24 +02:00
Kubernetes Prow Robot
d4007d619e
Merge pull request #1416 from kubernetes-sigs/dependabot/go_modules/golang.org/x/net-0.17.0
...
build(deps): bump golang.org/x/net from 0.13.0 to 0.17.0
2023-10-12 09:03:31 +02:00
dependabot[bot]
0a58f0e6d9
build(deps): bump golang.org/x/net from 0.13.0 to 0.17.0
...
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.13.0 to 0.17.0.
- [Commits](https://github.com/golang/net/compare/v0.13.0...v0.17.0)
---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-10-11 23:03:39 +00:00
Kubernetes Prow Robot
6424f19692
Merge pull request #1415 from AhmedGrati/feat-add-parameters-to-diable-master-and-worker
...
feat: add parameters in helm to disable/enable nfd-master and nfd-worker
2023-10-11 22:31:56 +02:00
AhmedGrati
d27eb0ac6d
feat: add parameters in helm to disable/enable nfd-master and nfd-worker
...
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-10-11 10:50:32 +01:00
Markus Lehtonen
b75de2a283
examples: add example grafana dashboard
...
Example visualization for all metrics except
nfd_node_update_requests_total which counts the deprecated (and
disabled-by-default) gRPC requests.
2023-10-10 17:47:51 +00:00
Kubernetes Prow Robot
a379fafcaa
Merge pull request #1411 from ArangoGutierrez/postv0142
...
Update Readme to V0.14.2
2023-10-10 10:45:58 +02:00
Carlos Eduardo Arango Gutierrez
16b935a1a0
Update Readme to V0.14.2
...
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-10-10 10:38:08 +02:00
Kubernetes Prow Robot
c927bf5ecd
Merge pull request #1407 from marquiz/devel/gc-metrics
...
nfd-gc: add metrics
2023-10-10 09:06:13 +02:00
Markus Lehtonen
98c3b0750d
nfd-gc: add metrics
...
Implements three metrics for nfd-gc:
- nfd_gc_build_info: version information of nfd-gc.
- nfd_gc_objects_deleted_total: total number of NodeFeature and
  NodeResourceTopology objects deleted by nfd-gc.
- nfd_gc_object_delete_failures_total: number of errors encountered when
  deleting NodeFeature and NodeResourceTopology objects.
2023-10-09 13:39:28 +00:00
Kubernetes Prow Robot
44b26e39e4
Merge pull request #1400 from marquiz/devel/metrics-docs
...
docs: document nfd_topology_updater_build_info metric
2023-10-09 15:35:21 +02:00
Markus Lehtonen
f0a3581ca3
docs: document nfd_topology_updater_build_info metric
2023-10-09 13:06:36 +00:00
Kubernetes Prow Robot
4d30205767
Merge pull request #1406 from marquiz/docs/metrics
...
docs: clarify nfd_node_update_requests_total metric
2023-10-09 15:02:43 +02:00
Markus Lehtonen
24574724e2
docs: clarify nfd_node_update_requests_total metric
2023-10-09 12:45:03 +00:00
Kubernetes Prow Robot
3469e91390
Merge pull request #1402 from marquiz/devel/deps
...
go.mod: bump kubernetes to v1.28.2
2023-10-09 14:06:01 +02:00
Markus Lehtonen
9130b5e7cf
go.mod: bump kubernetes to v1.28.2
2023-10-09 11:06:49 +00:00
Kubernetes Prow Robot
300a5f5c74
Merge pull request #1399 from marquiz/devel/gc-metrics
...
nfd-gc: simplify initialization
2023-10-09 11:45:46 +02:00
Markus Lehtonen
f5c6ce2843
nfd-gc: simplify initialization
2023-10-09 11:48:49 +03:00
Kubernetes Prow Robot
899939b4ed
Merge pull request #1398 from marquiz/devel/metrics
...
Refactor metrics
2023-10-09 10:42:41 +02:00
Markus Lehtonen
5171ae0f90
Refactor metrics
...
Move common boilerplate code under pkg/utils.
2023-10-09 10:49:12 +03:00
Kubernetes Prow Robot
83c7096bbe
Merge pull request #1394 from marquiz/devel/annotations
...
nfd-master: stop creating NFD version annotations
2023-10-05 15:40:11 +02:00
Markus Lehtonen
7d1df87305
source/custom: drop support for the legacy rule format
2023-10-05 16:15:37 +03:00
Markus Lehtonen
1d8a83b045
nfd-master: stop creating NFD version annotations
...
We now have metrics for getting detailed information about the NFD
instances running. There should be no need to pollute the node object
with NFD version annotations.
One problem with the annotations was that they were incomplete in the
sense that they only covered nfd-master and nfd-worker but not
nfd-topology-updater or nfd-gc.
Also, there was a problem with stale annotations giving misleading
information. E.g. there was no way to remove old/stale master.version
annotations if nfd-master was scheduled on a different node than the one
where it was previously running.
2023-10-05 14:53:29 +03:00
Kubernetes Prow Robot
160c9107f5
Merge pull request #1390 from ArangoGutierrez/devel/g121
...
Bump to Go 1.21
2023-10-05 13:14:46 +02:00
Kubernetes Prow Robot
7c0913ed7d
Merge pull request #1393 from marquiz/devel/annotations-fix
...
nfd-master: correctly clean up annotations
2023-10-05 10:45:40 +02:00
Markus Lehtonen
9ea0a1b420
nfd-master: correctly clean up annotations
...
Delete correct annotations if -instance is specified.
2023-10-05 11:10:06 +03:00
Kubernetes Prow Robot
7f6fd05357
Merge pull request #1392 from shivamerla/fix_gc_serviceaccount
...
Fix serviceaccount handling for nfd-gc to be consistent with others
2023-10-05 09:44:59 +02:00
Kubernetes Prow Robot
4b3429def5
Merge pull request #1386 from AhmedGrati/feat-support-raw-features
...
feat: support raw features
2023-10-05 09:00:44 +02:00
Shiva Krishna, Merla
6237b821f6
Fix serviceaccount handling for nfd-gc to be consistent with others
...
Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>
2023-10-04 15:17:32 -07:00
AhmedGrati
3130898d58
feat: support raw features
...
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-10-04 22:37:42 +01:00
Carlos Eduardo Arango Gutierrez
73d624def4
Update Makefile
...
Co-authored-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-10-04 18:34:55 +02:00
Carlos Eduardo Arango Gutierrez
dd8d7f6725
Helm - service to be only deployed when needed (#1389)
...
* Helm - service to be only deployed when needed
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
* Update deployment/helm/node-feature-discovery/templates/service.yaml
Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
---------
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
2023-10-04 16:00:43 +02:00
Carlos Eduardo Arango Gutierrez
3bb687d3d3
Bump to Go 1.21
...
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-10-04 11:39:32 +02:00
Kubernetes Prow Robot
4ad6491e0b
Merge pull request #1387 from ArangoGutierrez/devel/helm/grpcdepre2
...
Helm - Move remaining gRPC related flags to conditional
2023-10-03 08:03:24 +02:00
Carlos Eduardo Arango Gutierrez
3543aa22ce
Helm - Move remaining gRPC related flags to conditional
...
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-10-03 07:32:52 +02:00
Kubernetes Prow Robot
076ed3c057
Merge pull request #1382 from marquiz/devel/api-cleanup
...
apis/nfd: drop one stale comment line
2023-09-28 06:04:36 -07:00
Markus Lehtonen
dbf00dcda6
apis/nfd: drop one stale comment line
...
Drop a leftover "docstring" comment that wasn't removed with the type it
refers to.
2023-09-27 14:23:12 +03:00
Kubernetes Prow Robot
02dd550d3b
Merge pull request #1378 from marquiz/devel/fix-er-filtering
...
nfd-master: fix filtering of extended resources
2023-09-27 01:27:22 -07:00
Markus Lehtonen
b09ce75c8e
nfd-master: fix filtering of extended resources
...
Fix a bug in checking the allowed ".feature.node.kubernetes.io" ns
suffix for extended resources. Also update e2e-tests to cover this case.
2023-09-27 10:55:11 +03:00
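The kind of namespace-suffix check that b09ce75c8e refers to can be illustrated as below. This is only a sketch under assumptions: the helper names splitNs and hasFeatureNs are hypothetical, and the code is not the nfd-master implementation nor a description of the actual bug. It shows one common pitfall with suffix checks, namely that the leading dot must be part of the suffix so that an unrelated namespace that merely ends with the same characters does not match.

```go
package main

import (
	"fmt"
	"strings"
)

// featureNs is the NFD feature namespace named in the commit message.
const featureNs = "feature.node.kubernetes.io"

// splitNs (hypothetical helper) splits "ns/name" at the last slash.
// A name without a slash has an empty namespace.
func splitNs(fullName string) (ns, name string) {
	if i := strings.LastIndex(fullName, "/"); i >= 0 {
		return fullName[:i], fullName[i+1:]
	}
	return "", fullName
}

// hasFeatureNs (hypothetical helper) reports whether the namespace of a
// resource name is the feature namespace itself or a sub-namespace of it,
// i.e. ends with ".feature.node.kubernetes.io" including the leading dot.
func hasFeatureNs(fullName string) bool {
	ns, _ := splitNs(fullName)
	return ns == featureNs || strings.HasSuffix(ns, "."+featureNs)
}

func main() {
	for _, n := range []string{
		"feature.node.kubernetes.io/foo",
		"vendor.feature.node.kubernetes.io/foo",
		"notfeature.node.kubernetes.io/foo",
		"example.com/foo",
	} {
		fmt.Printf("%-40s %v\n", n, hasFeatureNs(n))
	}
}
```

Checking for the suffix without the leading dot would accept the third name in the list, which is why the dot is included in the comparison.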