node-feature-discovery

mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2025-03-05 16:27:05 +00:00

Author	SHA1	Message	Date
Markus Lehtonen	5ad2294c14	metrics: add nfd_node_update_requests_total counter Add a counter for total number of node update/sync requests. In practice, this counts the number of gRPC requests received if the gRPC API is in use. If the NodeFeature API is enabled, this counts the requests initiated by the NFD API controller, i.e. updates triggered by changes in NodeFeature or NodeFeatureRule objects plus updates initiated by the controller resync period.	2023-08-07 09:37:29 +03:00
Markus Lehtonen	4b24cc1afa	metrics: counters for rejected labels, extended resources and taints Add counters for labels, extended resources and taints rejected/filtered out by nfd-master.	2023-08-07 09:37:29 +03:00
Markus Lehtonen	a8a29e6df2	metrics: add nfd_nodefeaturerule_processing_errors_total counter Add a counter for errors encountered when processing NodeFeatureRules. Another simple counter without any additional prometheus labels - nfd-master logs can provide further details.	2023-08-07 09:37:29 +03:00
Markus Lehtonen	b90f2c318e	metrics: add nfd_node_update_failures_total counter Add a new counter for tracking node update failures from nfd-master. This tracks both normal feature updates and the --prune sub-command. This is a simple counter without any additional labels - nfd-master logs can be used for further diagnostics.	2023-08-07 09:37:27 +03:00
Kubernetes Prow Robot	9ed191808d	Merge pull request #1296 from marquiz/docs/metrics docs: document -metrics flag in command line reference	2023-08-05 03:06:30 -07:00
Kubernetes Prow Robot	6caf554b4c	Merge pull request #1291 from marquiz/devel/master-renaming nfd-master: use term node update instead of labeling	2023-08-04 09:22:24 -07:00
Kubernetes Prow Robot	35fbbaae99	Merge pull request #1294 from marquiz/devel/feature-file-comments source/local: support comments in input	2023-08-04 07:58:21 -07:00
Markus Lehtonen	4b7ee47e5f	docs: document -metrics flag in command line reference Document the -metrics command line flag in the command line reference of nfd-master and nfd-worker.	2023-08-04 16:49:03 +03:00
Markus Lehtonen	4aa7a8f8f8	source/local: support comments in input Lines starting with '#' are treated as comments and ignored when parsing feature files and hook output.	2023-08-04 16:46:22 +03:00
Kubernetes Prow Robot	6d30ca9660	Merge pull request #1293 from marquiz/devel/feature-file-whitespace source/local: trim whitespace from input	2023-08-04 06:38:24 -07:00
Kubernetes Prow Robot	1fa4178798	Merge pull request #1292 from marquiz/docs/notes docs: unify formatting of NOTEs	2023-08-04 06:10:22 -07:00
Markus Lehtonen	181b4e0168	source/local: trim whitespace from input Trim leading and trailing whitespace from the input (from feature files and hooks). Makes it a bit more relaxed on the expected input format.	2023-08-04 15:24:46 +03:00
Markus Lehtonen	0a8b514d67	docs: unify formatting of NOTEs	2023-08-03 15:36:56 +03:00
Markus Lehtonen	039378c725	nfd-master: use term node update instead of labeling Rename symbols and reword log messages to correlate with the functionality (we may do other updates than just modify labels nowadays).	2023-08-01 16:42:34 +03:00
Markus Lehtonen	d8f167d8a9	nfd-master: remove one stale empty line	2023-08-01 16:38:32 +03:00
Kubernetes Prow Robot	45dc46ab81	Merge pull request #1289 from marquiz/devel/metrics docs: align metrics documentation with latest changes on naming	2023-08-01 06:20:39 -07:00
Markus Lehtonen	a1406767a9	docs: align metrics documentation with latest changes on naming Also change table formatting and fix one incorrect description.	2023-08-01 15:53:06 +03:00
Kubernetes Prow Robot	c1cb63243b	Merge pull request #1288 from marquiz/devel/metrics Improve metrics	2023-07-31 10:38:39 -07:00
Markus Lehtonen	5091fef84b	metrics: improve feature discovery duration metric Rename the "NodeName" prometheus label to "node", aligning with common prometheus/kubernetes conventions. Also reconfigure the prometheus histogram buckets (now 10ms to 1s) to better match the expected sample range.	2023-07-31 19:45:22 +03:00
Markus Lehtonen	47f621d970	metrics: improve the node updates gauge Rename the metric, better describe what we're measuring and better comply with prometheus naming conventions. Also change it to represent actual updates of the node object on the Kubernetes apiserver.	2023-07-31 19:45:22 +03:00
Markus Lehtonen	945e7fcb3f	metrics: improve nfr processing time metric Change the metric from a simple gauge (that basically was a single value for the whole cluster) into a HistogramVec, aligning with the feature discovery duration metric in nfd-worker. This improved metric now has prometheus labels for the NFR name and node name, i.e. it is tracking per-NFR metric for each node being processed. Also, change the naming to better comply with prometheus suggested conventions.	2023-07-31 19:45:22 +03:00
Kubernetes Prow Robot	01ca8cb91d	Merge pull request #1284 from marquiz/devel/generator-deps generate: bump tools to their latest versions	2023-07-31 06:32:39 -07:00
Kubernetes Prow Robot	e0f10a81de	Merge pull request #1256 from PiotrProkop/fix-topo-updater-policy-and-scope-advertisment Fix Topology Manager policy and scope not being updated after NRT creation	2023-07-28 00:25:54 -07:00
Markus Lehtonen	7e375ad1f0	generate: bump tools to their latest versions Bump tools versions and re-auto-generate files.	2023-07-27 14:29:48 +03:00
Kubernetes Prow Robot	65b7216313	Merge pull request #1283 from marquiz/docs/deprecation-policy docs: deprecation policy for Helm chart params	2023-07-25 10:46:06 -07:00
Kubernetes Prow Robot	463a737b82	Merge pull request #1277 from marquiz/docs/k8s-compat docs: describe supported Kubernetes versions	2023-07-25 08:54:06 -07:00
Markus Lehtonen	b1328b3166	docs: describe supported Kubernetes versions	2023-07-25 17:40:06 +03:00
Markus Lehtonen	b72b537261	docs: deprecation policy for Helm chart params	2023-07-24 14:06:30 +03:00
Kubernetes Prow Robot	73bdaa2e89	Merge pull request #1282 from jcpunk/podmon-labels Add optional labels to the podmonitor	2023-07-24 03:40:12 -07:00
Pat Riehecky	0523257d1a	Add optional labels to the podmonitor Signed-off-by: Pat Riehecky <riehecky@fnal.gov>	2023-07-21 10:03:50 -05:00
Kubernetes Prow Robot	c9f3550237	Merge pull request #1280 from marquiz/docs/tocs docs: remove useless TOCs	2023-07-21 06:50:15 -07:00
Kubernetes Prow Robot	ebbea564a8	Merge pull request #1278 from marquiz/docs/fixes docs: fix toc of topology-updater and topology-gc reference	2023-07-21 06:50:08 -07:00
Kubernetes Prow Robot	e195e8563f	Merge pull request #1279 from marquiz/docs/version-policy docs: document version and deprecation policy	2023-07-21 06:44:08 -07:00
Markus Lehtonen	312ef308d1	docs: remove useless TOCs Drop table of contents from short pages where it is only cluttering the page.	2023-07-21 16:35:12 +03:00
Markus Lehtonen	f825812229	docs: document version and deprecation policy	2023-07-21 16:28:38 +03:00
Markus Lehtonen	d4d6963473	docs: fix toc of topology-updater and topology-gc reference Exclude the main title from the TOC (with the empty line the "no_toc" directive had no effect).	2023-07-21 15:41:59 +03:00
Kubernetes Prow Robot	5223d1f77f	Merge pull request #1276 from marquiz/devel/readme README: update to v0.13.3	2023-07-21 03:22:09 -07:00
Markus Lehtonen	ad27cdcc83	README: update to v0.13.3	2023-07-21 13:14:46 +03:00
Kubernetes Prow Robot	77d869c4f7	Merge pull request #1242 from ArangoGutierrez/metrics Enable metrics via prometheus operator	2023-07-21 02:26:08 -07:00
Carlos Eduardo Arango Gutierrez	e3aedd33e2	Enable metrics via prometheus operator Expose metrics via prometheus.monitoring.coreos.com/v1. The exposed metrics (metric, type, meaning) are: `nfd_master_build_info`, Gauge, Version from which nfd-master was built; `nfd_worker_build_info`, Gauge, Version from which nfd-worker was built; `nfd_updated_nodes`, Counter, Time taken to label a node; `nfd_crd_processing_time`, Gauge, Time taken to process a NodeFeatureRule CRD; `nfd_feature_discovery_duration_seconds`, HistogramVec, Time taken to discover features on a node. Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>	2023-07-21 10:59:52 +02:00
Kubernetes Prow Robot	1868242169	Merge pull request #1274 from marquiz/devel/gh-templates github: update assignees in new-release issue template	2023-07-21 00:04:07 -07:00
Markus Lehtonen	415c7981f3	github: update assignees in new-release issue template Sync with OWNERS file.	2023-07-21 09:06:42 +03:00
pprokop	6d98b6150b	Fix Topology Manager policy and scope not being updated properly NFD is only detecting policy and scope of Topology Manager when NRT object doesn't exist. This means that topologyManagerScope and topologyManagerPolicy attributes won't be updated even if kubelet config was changed to use other TopologyManager policy and scope. Signed-off-by: pprokop <pprokop@nvidia.com>	2023-07-20 16:31:12 +02:00
Kubernetes Prow Robot	195e7908f1	Merge pull request #1268 from marquiz/devel/deps go.mod: update kubernetes to v1.27.4	2023-07-20 05:40:07 -07:00
Markus Lehtonen	045eb28dbe	go.mod: update kubernetes to v1.27.4	2023-07-20 14:29:03 +03:00
Kubernetes Prow Robot	fd0ba3f9d9	Merge pull request #1265 from fidencio/topic/cpu-misc-cgroups-take-cgroupsv1-into-account cpu: Take cgroupsv1 into account when reading misc.capacity	2023-07-19 06:12:05 -07:00
Fabiano Fidêncio	7532ac3192	cpu: Add retrieveCgroupMiscCapacityValue() for legibility Let's refactor part of getCgroupMiscCapacity() out into its own retrieveCgroupMiscCapacityValue(), for legibility's sake. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-19 12:03:27 +02:00
Fabiano Fidêncio	8ed5a2343f	cpu: Take cgroupsv1 into account when reading misc.capacity We've been only considering cgroupsv2 when trying to read misc.capacity. However, there are still a bunch of systems out there relying on cgroupsv1. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-07-19 10:49:53 +02:00
Kubernetes Prow Robot	5f181cc6d0	Merge pull request #1258 from marquiz/fixes/nfd-master nfd-master: check for nil references in nfdAPIUpdateAllNodes	2023-07-18 05:23:09 -07:00
Markus Lehtonen	dac45be28c	nfd-master: check for nil references in nfdAPIUpdateAllNodes Just a safeguard.	2023-07-17 17:49:44 +03:00

1831 commits