node-feature-discovery

mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2025-03-16 21:38:23 +00:00

Author	SHA1	Message	Date
Francesco Romani	727875f240	docs: nfd-updater: clarify accounting Clarify that we account, and we can account, only resources exclusively allocated to Guaranteed QoS pods. Signed-off-by: Francesco Romani <fromani@redhat.com>	2023-09-04 08:51:14 +02:00
AhmedGrati	47aec15ea1	test: add unit tests for the expiration date function Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-09-01 20:04:24 +01:00
Markus Lehtonen	ae1a95f395	docs: update docs build dependencies Add webrick as that is needed. Also update other deps to their latest versions.	2023-08-30 19:31:35 +03:00
Kubernetes Prow Robot	e1f90a233b	Merge pull request #1305 from marquiz/devel/nf-gc Garbage collection of NodeFeature objects	2023-08-28 02:59:42 -07:00
Kubernetes Prow Robot	6d95e59cd0	Merge pull request #1290 from marquiz/devel/metrics-new metrics: additional metrics for nfd-master	2023-08-28 02:07:42 -07:00
Markus Lehtonen	a15b5690b6	docs: update to cover nfd-gc	2023-08-23 10:56:12 +03:00
Markus Lehtonen	ceb672bde0	deployment/helm: support nfd-gc Rename files and parameters. Drop the container security context parameters from the Helm chart. There should be no reason to run the nfd-gc with other than the minimal privileges. Also updates the documentation.	2023-08-23 10:56:12 +03:00
Kubernetes Prow Robot	536f9d17d0	Merge pull request #1295 from marquiz/devel/topology-updater-metrics nfd-topology-updater: add metrics support	2023-08-20 23:25:24 -07:00
Markus Lehtonen	b64ba37377	docs: update github-pages gem to v228 Also update other dependencies.	2023-08-16 13:51:09 +03:00
Markus Lehtonen	5ad2294c14	metrics: add nfd_node_update_requests_total counter Add a counter for total number of node update/sync requests. In practice, this counts the number of gRPC requests received if the gRPC API is in use. If the NodeFeature API is enabled, this counts the requests initiated by the NFD API controller, i.e. updates triggered by changes in NodeFeature or NodeFeatureRule objects plus updates initiated by the controller resync period.	2023-08-07 09:37:29 +03:00
Markus Lehtonen	4b24cc1afa	metrics: counters for rejected labels, extended resources and taints Add counters for labels, extended resources and taints rejected/filtered out by nfd-master.	2023-08-07 09:37:29 +03:00
Markus Lehtonen	a8a29e6df2	metrics: add nfd_nodefeaturerule_processing_errors_total counter Add a counter for errors encountered when processing NodeFeatureRules. Another simple counter without any additional prometheus labels - nfd-master logs can provide further details.	2023-08-07 09:37:29 +03:00
Markus Lehtonen	b90f2c318e	metrics: add nfd_node_update_failures_total counter Add a new counter for tracking node update failures from nfd-master. This tracks both normal feature updates and the --prune sub-command. This is a simple counter without any additional labels - nfd-master logs can be used for further diagnostics.	2023-08-07 09:37:27 +03:00
AhmedGrati	f0edc6532a	docs: add support for the expiration date in the input format of the feature files Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-08-05 20:39:09 +01:00
Kubernetes Prow Robot	9ed191808d	Merge pull request #1296 from marquiz/docs/metrics docs: document -metrics flag in command line reference	2023-08-05 03:06:30 -07:00
Markus Lehtonen	4b7ee47e5f	docs: document -metrics flag in command line reference Document the -metrics command line flag in the command line reference of nfd-master and nfd-worker.	2023-08-04 16:49:03 +03:00
Markus Lehtonen	06b333db1e	nfd-topology-updater: add metrics support For now, add only one metric, a counter for the errors occurring while scanning pod resources on the node.	2023-08-04 16:48:37 +03:00
Markus Lehtonen	4aa7a8f8f8	source/local: support comments in input Lines starting with '#' are treated as comments and ignored when parsing feature files and hook output.	2023-08-04 16:46:22 +03:00
Markus Lehtonen	0a8b514d67	docs: unify formatting of NOTEs	2023-08-03 15:36:56 +03:00
Markus Lehtonen	a1406767a9	docs: align metrics documentation with latest changes on naming Also change table formatting and fix one incorrect description.	2023-08-01 15:53:06 +03:00
Kubernetes Prow Robot	65b7216313	Merge pull request #1283 from marquiz/docs/deprecation-policy docs: deprecation policy for Helm chart params	2023-07-25 10:46:06 -07:00
Kubernetes Prow Robot	463a737b82	Merge pull request #1277 from marquiz/docs/k8s-compat docs: describe supported Kubernetes versions	2023-07-25 08:54:06 -07:00
Markus Lehtonen	b1328b3166	docs: describe supported Kubernetes versions	2023-07-25 17:40:06 +03:00
Markus Lehtonen	b72b537261	docs: deprecation policy for Helm chart params	2023-07-24 14:06:30 +03:00
Pat Riehecky	0523257d1a	Add optional labels to the podmonitor Signed-off-by: Pat Riehecky <riehecky@fnal.gov>	2023-07-21 10:03:50 -05:00
Kubernetes Prow Robot	c9f3550237	Merge pull request #1280 from marquiz/docs/tocs docs: remove useless TOCs	2023-07-21 06:50:15 -07:00
Kubernetes Prow Robot	ebbea564a8	Merge pull request #1278 from marquiz/docs/fixes docs: fix toc of topology-updater and topology-gc reference	2023-07-21 06:50:08 -07:00
Markus Lehtonen	312ef308d1	docs: remove useless TOCs Drop table of contents from short pages where it is only cluttering the page.	2023-07-21 16:35:12 +03:00
Markus Lehtonen	f825812229	docs: document version and deprecation policy	2023-07-21 16:28:38 +03:00
Markus Lehtonen	d4d6963473	docs: fix toc of topology-updater and topology-gc reference Exclude the main title from the TOC (with the empty line, the "no_toc" directive had no effect).	2023-07-21 15:41:59 +03:00
Carlos Eduardo Arango Gutierrez	e3aedd33e2	Enable metrics via prometheus operator Expose metrics via prometheus.monitoring.coreos.com/v1. The exposed metrics are: `nfd_master_build_info` (Gauge): Version from which nfd-master was built; `nfd_worker_build_info` (Gauge): Version from which nfd-worker was built; `nfd_updated_nodes` (Counter): Time taken to label a node; `nfd_crd_processing_time` (Gauge): Time taken to process a NodeFeatureRule CRD; `nfd_feature_discovery_duration_seconds` (HistogramVec): Time taken to discover features on a node. Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>	2023-07-21 10:59:52 +02:00
Kubernetes Prow Robot	407a610e0c	Merge pull request #1182 from fmuyassarov/disable-hooks-by-default hooks: disable hooks by default from v0.14	2023-06-22 04:43:40 -07:00
Carlos Eduardo Arango Gutierrez	563cc862de	Docs: Fix typo on customization-guide Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>	2023-06-09 10:23:33 +02:00
Muyassarov, Feruzjon	19527be924	hooks: disable hooks by default We deprecated hooks in v0.12.0 but kept them enabled by default. Starting from v0.14 we disable them by default and plan to fully remove them in the near future. Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2023-06-07 13:04:23 +03:00
Simon Jürgensmeyer	307a865465	Fix missing apostrophe for jq	2023-06-07 09:53:02 +02:00
Hairong Chen	e8a00ba7da	cpu: Discover TDX guests based on cpuid information NFD already has the capability to discover whether baremetal / host machines support Intel TDX. Now, the next step is to add support for discovering whether a node is TDX protected (as in, a virtual machine started using Intel TDX). In order to do so, we've decided to go for a new `cpu-security.tdx` property, called `protected` (`cpu-security.tdx.protected`). Signed-off-by: Hairong Chen <hairong.chen@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-06-05 11:06:28 +02:00
Kubernetes Prow Robot	306969a945	Merge pull request #1133 from AhmedGrati/feat-parallelize-nodes-update feat: parallelize nodes update	2023-06-02 05:28:57 -07:00
AhmedGrati	b3cfe17392	feat: parallelize nodes update This PR aims to optimize the process of updating nodes with their corresponding features. Previously, we updated nodes sequentially even though they are independent of each other. Therefore, we integrated new components: LabelersNodePool, which is responsible for spinning up a goroutine whenever there's a request for updating nodes, and a Workqueue, which is responsible for holding the names of nodes that should be updated. Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-06-02 11:41:50 +01:00
AhmedGrati	08b9c3486e	feat: support dynamic values for labels in the NodeFeatureRule This PR adds support for dynamic label values in the NodeFeatureRule CRD, offering users more flexible labeling. To achieve this, we check whether the label value starts with "@"; if it does, we get the value of the feature and set the label to that value. Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-05-31 23:30:26 +01:00
Kubernetes Prow Robot	d28a02c5cd	Merge pull request #1222 from vaibhav2107/kustomize-type Fixed typo in Header under deployment/kustomize.md	2023-05-22 00:42:21 -07:00
Kubernetes Prow Robot	70d5ef477f	Merge pull request #1219 from PiotrProkop/leader-elect Add leader election for nfd-master	2023-05-22 00:36:21 -07:00
vaibhav2107	9f7854479f	Fixed typo in Header under deployment/kustomize.md	2023-05-18 14:59:54 +05:30
PiotrProkop	272fd4784f	Add new flag enable-leader-election for nfd-master. It allows NFD-master to be run in an active-passive way when running multiple instances of NFD-master, to prevent multiple components from updating the same custom resources. Signed-off-by: PiotrProkop <pprokop@nvidia.com>	2023-05-15 13:30:07 +02:00
Markus Lehtonen	1200fd05c5	topology-updater: use node IP in the default configz URI Use a separate NODE_ADDRESS environment variable in the default value of -kubelet-config-uri (instead of NODE_NAME that was previously used). Also change the kustomize and Helm deployments to set this variable to the node IP address. This should make the default deployment more robust, making it work in scenarios where the node name does not resolve to the node IP, e.g. nodename != hostname.	2023-05-05 13:29:51 +03:00
Kubernetes Prow Robot	cd45baef8d	Merge pull request #1211 from marquiz/devel/helm deployment/helm: improve handling of topologyUpdater.kubeletStateFiles	2023-05-05 00:17:13 -07:00
Markus Lehtonen	526aab87cf	deployment/helm: use dedicated serviceaccount for topology-updater Change the configuration so that, by default, we use a dedicated serviceaccount for topology-updater (similar to topology-gc, nfd-master and nfd-worker). Fix the templates so that the serviceaccount and clusterrolebinding are only created when topology-updater is enabled (clusterrole was already handled this way). This patch also correctly documents the default value of the rbac.create parameter of topology-updater and topology-gc.	2023-05-05 08:30:21 +03:00
Markus Lehtonen	9c2f268fd2	deployment/helm: improve handling of topologyUpdater.kubeletStateFiles Make it possible to disable kubelet state tracking with --set topologyUpdater.kubeletStateFiles="" as the documentation suggests. Also, fix the documentation regarding the default value of topologyUpdater.kubeletStateFiles parameter.	2023-05-04 15:01:19 +03:00
Markus Lehtonen	9685d292a2	docs: add missing .md suffix to internal references Commit `bfbc47f55e` added a lot of those and this patch tries to cover all that we missed there. Having .md suffixes in references to internal files makes it convenient to browse the document locally, just as text files as the references work correctly.	2023-04-25 15:28:07 +03:00
Kubernetes Prow Robot	2356223ffc	Merge pull request #1139 from AhmedGrati/feat-configure-master-resync feat: add master resync period configurability	2023-04-24 03:49:02 -07:00
AhmedGrati	7917434d38	feat: add master resync period configurability This PR adds a config option for setting the NFD API controller resync period. The resync period is only activated when the NodeFeature API has been enabled (with -enable-nodefeature-api). Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-04-24 11:52:38 +02:00


345 commits