1
0
Fork 0
mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2025-03-06 16:57:10 +00:00
Commit graph

1969 commits

Author SHA1 Message Date
Markus Lehtonen
039378c725 nfd-master: use term node update instead of labeling
Rename symbols and reword log messages to correlate with the
functionality (we may do other updates than just modify labels
nowadays).
2023-08-01 16:42:34 +03:00
Markus Lehtonen
d8f167d8a9 nfd-master: remove one stale empty line 2023-08-01 16:38:32 +03:00
Kubernetes Prow Robot
45dc46ab81
Merge pull request #1289 from marquiz/devel/metrics
docs: align metrics documentation with latest changes on naming
2023-08-01 06:20:39 -07:00
Markus Lehtonen
a1406767a9 docs: align metrics documentation with latest changes on naming
Also change table formatting and fix one incorrect description.
2023-08-01 15:53:06 +03:00
Kubernetes Prow Robot
c1cb63243b
Merge pull request #1288 from marquiz/devel/metrics
Improve metrics
2023-07-31 10:38:39 -07:00
Markus Lehtonen
5091fef84b metrics: improve feature discovery duration metric
Rename the "NodeName" prometheus label to  "node", aligning  with
common prometheus/kubernetes conventions. Also reconfigure the
prometheus histogram buckets (now 10ms to 1s) to better match the
expected sample range.
2023-07-31 19:45:22 +03:00
Markus Lehtonen
47f621d970 metrics: improve the node updates gauge
Rename the metric, better describe what we're measuring and better
comply with prometheus naming conventions. Also change it to represent
actual updates of the node object on the Kubernetes apiserver.
2023-07-31 19:45:22 +03:00
Markus Lehtonen
945e7fcb3f metrics: improve nfr processing time metric
Change the metric from a simple gauge (that basically was a single value
for the whole cluster) into a HistogramVec, aligning with the feature
discovery duration metric in nfd-worker. This improved metric now has
prometheus labels for the NFR name and node name, i.e. it is tracking
per-NFR metric for each node being processed. Also, change the naming to
better comply with prometheus suggested conventions.
2023-07-31 19:45:22 +03:00
Kubernetes Prow Robot
01ca8cb91d
Merge pull request #1284 from marquiz/devel/generator-deps
generate: bump tools to their latest versions
2023-07-31 06:32:39 -07:00
Kubernetes Prow Robot
e0f10a81de
Merge pull request #1256 from PiotrProkop/fix-topo-updater-policy-and-scope-advertisment
Fix Topology Manager policy and scope not being updated after NRT creation
2023-07-28 00:25:54 -07:00
Markus Lehtonen
7e375ad1f0 generate: bump tools to their latest versions
Bump tools versions and re-auto-generate files.
2023-07-27 14:29:48 +03:00
Kubernetes Prow Robot
65b7216313
Merge pull request #1283 from marquiz/docs/deprecation-policy
docs: deprecation policy for Helm chart params
2023-07-25 10:46:06 -07:00
Kubernetes Prow Robot
463a737b82
Merge pull request #1277 from marquiz/docs/k8s-compat
docs: describe supported Kubernetes versions
2023-07-25 08:54:06 -07:00
Markus Lehtonen
b1328b3166 docs: describe supported Kubernetes versions 2023-07-25 17:40:06 +03:00
Markus Lehtonen
b72b537261 docs: deprecation policy for Helm chart params 2023-07-24 14:06:30 +03:00
Kubernetes Prow Robot
73bdaa2e89
Merge pull request #1282 from jcpunk/podmon-labels
Add optional labels to the podmonitor
2023-07-24 03:40:12 -07:00
Pat Riehecky
0523257d1a Add optional labels to the podmonitor
Signed-off-by: Pat Riehecky <riehecky@fnal.gov>
2023-07-21 10:03:50 -05:00
Kubernetes Prow Robot
c9f3550237
Merge pull request #1280 from marquiz/docs/tocs
docs: remove useless TOCs
2023-07-21 06:50:15 -07:00
Kubernetes Prow Robot
ebbea564a8
Merge pull request #1278 from marquiz/docs/fixes
docs: fix toc of topology-updater and topology-gc reference
2023-07-21 06:50:08 -07:00
Kubernetes Prow Robot
e195e8563f
Merge pull request #1279 from marquiz/docs/version-policy
docs: document version and deprecation policy
2023-07-21 06:44:08 -07:00
Markus Lehtonen
312ef308d1 docs: remove useless TOCs
Drop table of contents from short pages where it is only cluttering the
page.
2023-07-21 16:35:12 +03:00
Markus Lehtonen
f825812229 docs: document version and deprecation policy 2023-07-21 16:28:38 +03:00
Markus Lehtonen
d4d6963473 docs: fix toc of topology-updater and topology-gc reference
Exclude the main title from to (with the empty line the "no_toc"
directive took no effect).
2023-07-21 15:41:59 +03:00
Kubernetes Prow Robot
5223d1f77f
Merge pull request #1276 from marquiz/devel/readme
README: update to v0.13.3
2023-07-21 03:22:09 -07:00
Markus Lehtonen
ad27cdcc83 README: update to v0.13.3 2023-07-21 13:14:46 +03:00
Kubernetes Prow Robot
77d869c4f7
Merge pull request #1242 from ArangoGutierrez/metrics
Enable metrics via prometheus operator
2023-07-21 02:26:08 -07:00
Carlos Eduardo Arango Gutierrez
e3aedd33e2
Enable metrics via prometheus operator
Expose metrics via prometheus.monitoring.coreos.com/v1

The exposed metrics are

| Metric        | Type | Meaning |
| --------------- | ---------------- | ---------------- |
|  `nfd_master_build_info`           | Gauge | Version from which nfd-master was built. |
|  `nfd_worker_build_info`           | Gauge | Version from which nfd-worker was built. |
|  `nfd_updated_nodes`           |  Counter | Time taken to label a node |
|  `nfd_crd_processing_time`          |  Gauge | Time taken to process a NodeFeatureRule CRD |
| `nfd_feature_discovery_duration_seconds` |  HistogramVec | Time taken to discover features on a node |

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
2023-07-21 10:59:52 +02:00
Kubernetes Prow Robot
1868242169
Merge pull request #1274 from marquiz/devel/gh-templates
github: update assignees in new-release issue template
2023-07-21 00:04:07 -07:00
Markus Lehtonen
415c7981f3 github: update assignees in new-release issue template
Sync with OWNERS file.
2023-07-21 09:06:42 +03:00
pprokop
6d98b6150b Fix Topology Manager policy and scope not being updated properly
NFD is only detecting policy and scope of Topology Manager when NRT object doesn't exist.
This means that topologyManagerScope and topologyManagerPolicy attributes won't be updated
even if kubelet config was changed to use other TopologyManager policy and scope.

Signed-off-by: pprokop <pprokop@nvidia.com>
2023-07-20 16:31:12 +02:00
Kubernetes Prow Robot
195e7908f1
Merge pull request #1268 from marquiz/devel/deps
go.mod: update kubernetes to v1.27.4
2023-07-20 05:40:07 -07:00
Markus Lehtonen
045eb28dbe go.mod: update kubernetes to v1.27.4 2023-07-20 14:29:03 +03:00
Kubernetes Prow Robot
fd0ba3f9d9
Merge pull request #1265 from fidencio/topic/cpu-misc-cgroups-take-cgroupsv1-into-account
cpu: Take cgroupsv1 into account when reading misc.capacity
2023-07-19 06:12:05 -07:00
AhmedGrati
8e55d78d85 test: add node updater pool unit tests
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-07-19 12:03:35 +01:00
Fabiano Fidêncio
7532ac3192 cpu: Add retrieveCgroupMiscCapacityValue() for legibility
Let's refactor part of the getCgroupMiscCapacity() out to its own
retrieveCgroupMiscCapacityValue(), for the legibility sake.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-19 12:03:27 +02:00
Fabiano Fidêncio
8ed5a2343f cpu: Take cgroupsv1 into account when reading misc.capacity
We've been only considering cgroupsv2 when trying to read misc.capacity.
However, there are still a bunch of systems out there relying on
cgroupsv1.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-19 10:49:53 +02:00
Kubernetes Prow Robot
5f181cc6d0
Merge pull request #1258 from marquiz/fixes/nfd-master
nfd-master: check for nil references in nfdAPIUpdateAllNodes
2023-07-18 05:23:09 -07:00
Markus Lehtonen
dac45be28c nfd-master: check for nil references in nfdAPIUpdateAllNodes
Just a safeguard.
2023-07-17 17:49:44 +03:00
Kubernetes Prow Robot
9a108c0505
Merge pull request #1255 from hangscer8/clean_ticker
Stop ticker in time to avoid memory leak
2023-07-06 01:59:03 -07:00
hang.jiang
698031fc2d Stop ticker in time to avoid memory leak
Because it will cause memory leak if we do not stop ticker when the function has completed.

Signed-off-by: hang.jiang <hang.jiang@daocloud.io>
2023-07-05 18:35:01 +08:00
Kubernetes Prow Robot
f02d172d07
Merge pull request #1253 from adrianchiris/fix-typo-in-helm-template
fix typo in helm chart
2023-07-03 02:46:53 -07:00
adrianc
904f3739a3
fix typo in helm chart
else statement of crd-controller should
also refer to crd-controller flag.

Signed-off-by: adrianc <adrianc@nvidia.com>
2023-07-02 18:01:31 +03:00
Kubernetes Prow Robot
10bbc8f253
Merge pull request #1248 from testwill/pkg-import
Remove pkg's imported twice
2023-06-28 05:54:32 -07:00
guoguangwu
29118f67bb fix: Drop the e2elog instead
Signed-off-by: guoguangwu <guoguangwu@magic-shield.com>
2023-06-25 09:44:08 +08:00
Kubernetes Prow Robot
407a610e0c
Merge pull request #1182 from fmuyassarov/disable-hooks-by-default
hooks: disable hooks by default from v0.14
2023-06-22 04:43:40 -07:00
guoguangwu
92482e45d8 node_feature_discovery_test.go rm pkg imported twice
Signed-off-by: guoguangwu <guoguangwu@magic-shield.com>
2023-06-21 16:55:25 +08:00
guoguangwu
b946bcc0f5 nfd-master-internal_test.go rm pkg imported twice
Signed-off-by: guoguangwu <guoguangwu@magic-shield.com>
2023-06-21 16:53:55 +08:00
Kubernetes Prow Robot
aa55cd5999
Merge pull request #1247 from ArangoGutierrez/fix_docs_typo
Docs: Fix typo on customization-guide
2023-06-09 01:38:13 -07:00
Carlos Eduardo Arango Gutierrez
563cc862de
Docs: Fix typo on customization-guide
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-06-09 10:23:33 +02:00
Kubernetes Prow Robot
ad1bf43d25
Merge pull request #1246 from dipankardas011/fix-depricated-use-of-base-in-kustomize
Removal of the bases field as it is deprecated by kustomize
2023-06-09 00:30:13 -07:00