Markus Lehtonen
f5fc8d4782
README: update to v0.13.4
2023-09-01 15:10:20 +03:00
Kubernetes Prow Robot
48f37070ed
Merge pull request #1319 from marquiz/devel/docs-build-image
...
docs: use ruby docker image for building docs
2023-08-31 00:26:47 -07:00
Markus Lehtonen
ae1a95f395
docs: update docs build dependencies
...
Add webrick as that is needed. Also update other deps to their latest
versions.
2023-08-30 19:31:35 +03:00
Markus Lehtonen
8985e003b5
docs: use ruby docker image for building docs
...
Get rid of the jekyll:3.8 image, which is already four years old. Use
the ruby image instead. The jekyll image did not bring any value (more
problems, if anything) as we install/update jekyll and all other gems
with bundler nevertheless (image was jekyll:3.8 but we use jekyll
v3.9.3).
2023-08-30 19:31:35 +03:00
Kubernetes Prow Robot
8cf4a21d62
Merge pull request #1320 from marquiz/devel/github-golangci-lint
...
Makefile: increase golangci-lint timeout to 10min
2023-08-30 07:56:48 -07:00
Markus Lehtonen
2c8a6208f4
Makefile: increase golangci-lint timeout to 10min
2023-08-30 12:53:24 +03:00
Kubernetes Prow Robot
194e5cc056
Merge pull request #1315 from marquiz/devel/k8s-version
...
go.mod: update kubernetes to v1.28.1
2023-08-29 02:31:21 -07:00
Markus Lehtonen
4d9259d6cb
go.mod: update kubernetes to v1.28.1
2023-08-28 18:48:27 +03:00
Kubernetes Prow Robot
a658c54de3
Merge pull request #1297 from marquiz/devel/topology-updater-version
...
topology-updater: make -version always runnable
2023-08-28 04:05:43 -07:00
Kubernetes Prow Robot
e1f90a233b
Merge pull request #1305 from marquiz/devel/nf-gc
...
Garbage collection of NodeFeature objects
2023-08-28 02:59:42 -07:00
Kubernetes Prow Robot
6d95e59cd0
Merge pull request #1290 from marquiz/devel/metrics-new
...
metrics: additional metrics for nfd-master
2023-08-28 02:07:42 -07:00
Markus Lehtonen
a15b5690b6
docs: update to cover nfd-gc
2023-08-23 10:56:12 +03:00
Markus Lehtonen
ceb672bde0
deployment/helm: support nfd-gc
...
Rename files and parameters. Drop the container security context
parameters from the Helm chart. There should be no reason to run the
nfd-gc with other than the minimal privileges.
Also updates the documentation.
2023-08-23 10:56:12 +03:00
Markus Lehtonen
6cf29bd8ef
deployment/kustomize: support nfd-gc
...
Rename the old "topology-gc" to just "gc". Simplify the setup a bit by
including the RBAC rules in the "gc" base.
Note: we don't enable nfd-gc in the default overlay, yet, as the
NodeFeature API isn't enabled (gc is not needed).
2023-08-23 10:56:12 +03:00
Markus Lehtonen
f9fadd2102
test/e2e: add e2e test for nfd-gc
2023-08-22 21:24:26 +03:00
Markus Lehtonen
e3415ec484
nfd-gc: support garbage collection of NodeFeatures
...
Hook into the same logic already exercised for NodeResourceTopology
objects: GC watches for node delete events and immediately drops stale
objects (NRT and now also NF). In addition there is a periodic resync to
catch any missed node deletes, once every hour by default.
2023-08-22 21:24:26 +03:00
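The combination described in the commit above (react to node delete events immediately, plus a periodic resync to catch missed deletes) can be sketched roughly as follows. This is a minimal illustration using only the Go standard library, not the nfd-gc code: the `collector` type, its fields and the string-keyed maps are hypothetical stand-ins for the real informers and API objects.

```go
package main

import (
	"fmt"
	"time"
)

// collector is a hypothetical stand-in for a garbage collector.
// objects maps an object name to the name of the node that owns it.
type collector struct {
	nodes   map[string]bool
	objects map[string]string
}

// onNodeDelete immediately drops all objects owned by the deleted node.
func (c *collector) onNodeDelete(node string) {
	delete(c.nodes, node)
	for name, owner := range c.objects {
		if owner == node {
			delete(c.objects, name)
		}
	}
}

// resync drops every object whose owner node no longer exists. It catches
// node deletes for which no event was seen.
func (c *collector) resync() {
	for name, owner := range c.objects {
		if !c.nodes[owner] {
			delete(c.objects, name)
		}
	}
}

// run handles delete events and periodic resyncs until stop is closed.
func (c *collector) run(deletes <-chan string, interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case node := <-deletes:
			c.onNodeDelete(node)
		case <-ticker.C:
			c.resync()
		case <-stop:
			return
		}
	}
}

func main() {
	c := &collector{
		nodes:   map[string]bool{"node-a": true, "node-b": true, "node-c": true},
		objects: map[string]string{"nf-a": "node-a", "nrt-a": "node-a", "nf-b": "node-b", "nf-c": "node-c"},
	}
	deletes := make(chan string)
	stop := make(chan struct{})
	done := make(chan struct{})
	go func() {
		c.run(deletes, time.Hour, stop) // one hour, as in the default mentioned above
		close(done)
	}()

	deletes <- "node-a" // observed delete event: objects dropped right away
	close(stop)
	<-done

	delete(c.nodes, "node-b") // simulate a delete whose event was missed
	c.resync()                // the periodic resync would clean this up
	fmt.Println(len(c.objects))
}
```

In this sketch only `nf-c` is left at the end, since `node-c` is the only node still present.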
Markus Lehtonen
01c08d67b6
Rename nfd-topology-gc to nfd-gc
...
This is preparation for making it a generic garbage collector for all
nfd-managed api objects.
2023-08-21 21:46:11 +03:00
Kubernetes Prow Robot
e0c477090b
Merge pull request #1311 from marquiz/devel/refactor-gc-5
...
topology-gc: simplify listing of node objects
2023-08-21 11:40:05 -07:00
Kubernetes Prow Robot
277f54ae99
Merge pull request #1308 from marquiz/devel/refactor-gc-2
...
topology-gc: move initial GC out of startNodeInformer()
2023-08-21 00:57:23 -07:00
Markus Lehtonen
f05b0e26ea
topology-gc: move initial GC out of startNodeInformer()
...
Small refactor. Contextually this belongs under periodicGC().
2023-08-21 10:11:46 +03:00
Kubernetes Prow Robot
a60502a313
Merge pull request #1307 from marquiz/devel/refactor-gc
...
topology-gc: refactor unit tests
2023-08-21 00:09:23 -07:00
Kubernetes Prow Robot
536f9d17d0
Merge pull request #1295 from marquiz/devel/topology-updater-metrics
...
nfd-topology-updater: add metrics support
2023-08-20 23:25:24 -07:00
Markus Lehtonen
2e8da8849a
topology-gc: simplify listing of node objects
...
Hopefully makes the code slightly more readable.
2023-08-21 09:13:41 +03:00
Markus Lehtonen
0b5e51bd35
topology-gc: refactor unit tests
...
Remove a lot of boilerplate code by defining reusable functions.
Also, test the Run() method instead of the functions called by Run(), as
it is the top level functionality that was tested in practice (we don't
have separate unit tests for the callee functions).
2023-08-21 09:10:24 +03:00
Kubernetes Prow Robot
4674bce27d
Merge pull request #1310 from marquiz/devel/refactor-gc-4
...
topology-gc: rename runGC to garbageCollect()
2023-08-18 11:26:34 -07:00
Kubernetes Prow Robot
f4cf4877f2
Merge pull request #1309 from marquiz/devel/refactor-gc-3
...
topology-gc: rename run()
2023-08-18 11:26:28 -07:00
Kubernetes Prow Robot
b47667fc0c
Merge pull request #1306 from marquiz/devel/gc-fix-stop
...
topology-gc: fix Stop
2023-08-18 10:34:29 -07:00
Markus Lehtonen
ec51b29b3c
topology-gc: rename runGC to garbageCollect()
...
One less function named run.
2023-08-18 17:57:05 +03:00
Markus Lehtonen
98b0b36b87
topology-gc: rename run()
...
Too many run methods here.
2023-08-18 17:52:11 +03:00
Markus Lehtonen
108d603bdc
topology-gc: fix Stop
...
The stop channel has multiple readers so we need to close it so that all
of the readers get notified.
2023-08-18 17:46:54 +03:00
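The reasoning in the commit above relies on a general Go property: a value sent on a channel is received by exactly one reader, whereas closing the channel unblocks every reader. A minimal standalone sketch of that behavior follows; `notifyAll` is a hypothetical helper written for illustration and is not taken from the topology-gc code.

```go
package main

import (
	"fmt"
	"sync"
)

// notifyAll starts n reader goroutines that all wait on the same stop
// channel, closes the channel, and returns how many readers woke up.
func notifyAll(n int) int {
	stop := make(chan struct{})
	var wg sync.WaitGroup
	var mu sync.Mutex
	woken := 0

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-stop // a receive on a closed channel returns immediately
			mu.Lock()
			woken++
			mu.Unlock()
		}()
	}

	// A single send (stop <- struct{}{}) would wake only one reader.
	// Closing the channel unblocks every current and future receiver.
	close(stop)
	wg.Wait()
	return woken
}

func main() {
	fmt.Println(notifyAll(3)) // prints 3
}
```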
Kubernetes Prow Robot
fe0763eccb
Merge pull request #1303 from marquiz/devel/docs-deps
...
docs: update github-pages gem to v228
2023-08-16 09:40:27 -07:00
Markus Lehtonen
b64ba37377
docs: update github-pages gem to v228
...
Also update other dependencies.
2023-08-16 13:51:09 +03:00
Kubernetes Prow Robot
198eb2b5db
Merge pull request #1302 from marquiz/devel/deps
...
Update kubernetes to v1.28.0
2023-08-16 03:12:27 -07:00
Markus Lehtonen
2e79a015f5
test/e2e: align with latest kubernetes code base
2023-08-16 12:43:52 +03:00
Markus Lehtonen
5d5f133eff
go.mod: update kubernetes to v1.28.0
...
Also sync (update) other dependencies with what kubernetes v1.28 has.
2023-08-16 11:00:51 +03:00
Kubernetes Prow Robot
0bbf5f3f1e
Merge pull request #1300 from marquiz/devel/ci-lint
...
scripts/test-infra: bump golangci-lint to v1.54.0
2023-08-11 05:43:27 -07:00
Kubernetes Prow Robot
95069b410b
Merge pull request #1299 from marquiz/devel/logcheck
...
scripts/test-infra: update logcheck tool to v0.6.0
2023-08-11 05:15:29 -07:00
Markus Lehtonen
972374af0e
scripts/test-infra: bump golangci-lint to v1.54.0
...
Brings e.g. support for Go v1.21.
2023-08-11 11:42:23 +03:00
Markus Lehtonen
7e2a549db2
scripts/test-infra: update logcheck tool to v0.6.0
...
Update logcheck to the latest version. Fixes the flakiness we've been
experiencing.
2023-08-09 08:23:42 +03:00
Kubernetes Prow Robot
9d61b19454
Merge pull request #1287 from freelizhun/fix-empty-hugepages
...
fix "no such file or directory" errors caused by empty hugepages on some NUMA nodes
2023-08-08 02:50:16 -07:00
lizhun
a4ad3d4411
fix "no such file or directory" error caused by empty hugepages on some NUMA nodes
...
Signed-off-by: lizhun <lizhun@kylinos.cn>
2023-08-08 15:14:44 +08:00
Markus Lehtonen
5ba8d14b86
topology-updater: make -version always runnable
...
Make it possible to run -version in an environment without the
NODE_ADDRESS environment variable set.
2023-08-07 11:56:58 +03:00
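One common way to get the behavior described in the commit above is to handle the version flag before any environment validation. The sketch below illustrates that ordering with the standard `flag` package; the `run` function, the `version` constant and the returned strings are hypothetical and do not reflect how nfd-topology-updater is actually structured.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// version is a placeholder value used only for this example.
const version = "v0.0.0-example"

// run parses args and returns the text the program would print. The getenv
// parameter makes the environment lookup easy to substitute.
func run(args []string, getenv func(string) string) (string, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	printVersion := fs.Bool("version", false, "print version and exit")
	if err := fs.Parse(args); err != nil {
		return "", err
	}

	// Handle -version first so that it works even when the environment
	// is not set up for normal operation.
	if *printVersion {
		return version, nil
	}

	// Only normal operation requires NODE_ADDRESS.
	addr := getenv("NODE_ADDRESS")
	if addr == "" {
		return "", errors.New("NODE_ADDRESS not set")
	}
	return "running on " + addr, nil
}

func main() {
	out, err := run([]string{"-version"}, os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```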
Markus Lehtonen
5ad2294c14
metrics: add nfd_node_update_requests_total counter
...
Add a counter for total number of node update/sync requests. In
practice, this counts the number of gRPC requests received if the gRPC
API is in use. If the NodeFeature API is enabled, this counts the
requests initiated by the NFD API controller, i.e. updates triggered by
changes in NodeFeature or NodeFeatureRule objects plus updates initiated
by the controller resync period.
2023-08-07 09:37:29 +03:00
Markus Lehtonen
4b24cc1afa
metrics: counters for rejected labels, extended resources and taints
...
Add counters for labels, extended resources and taints rejected/filtered
out by nfd-master.
2023-08-07 09:37:29 +03:00
Markus Lehtonen
a8a29e6df2
metrics: add nfd_nodefeaturerule_processing_errors_total counter
...
Add a counter for errors encountered when processing NodeFeatureRules.
Another simple counter without any additional prometheus labels -
nfd-master logs can provide further details.
2023-08-07 09:37:29 +03:00
Markus Lehtonen
b90f2c318e
metrics: add nfd_node_update_failures_total counter
...
Add a new counter for tracking node update failures from nfd-master.
This tracks both normal feature updates and the --prune sub-command.
This is a simple counter without any additional labels - nfd-master logs
can be used for further diagnostics.
2023-08-07 09:37:27 +03:00
AhmedGrati
f0edc6532a
docs: document expiration date support in the input format of the feature files
...
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-08-05 20:39:09 +01:00
AhmedGrati
bd3ccf1e33
feat: add support for feature files expiration
...
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-08-05 20:38:44 +01:00
Kubernetes Prow Robot
9ed191808d
Merge pull request #1296 from marquiz/docs/metrics
...
docs: document -metrics flag in command line reference
2023-08-05 03:06:30 -07:00
Kubernetes Prow Robot
6caf554b4c
Merge pull request #1291 from marquiz/devel/master-renaming
...
nfd-master: use term node update instead of labeling
2023-08-04 09:22:24 -07:00