node-feature-discovery

mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00

Author	SHA1	Message	Date
Markus Lehtonen	7d1df87305	source/custom: drop support for the legacy rule format	2023-10-05 16:15:37 +03:00
Markus Lehtonen	1d8a83b045	nfd-master: stop creating NFD version annotations We now have metrics for getting detailed information about the NFD instances running. There should be no need to pollute the node object with NFD version annotations. One problem with the annotations was also that they were incomplete in the sense that they only covered nfd-master and nfd-worker but not nfd-topology-updater or nfd-gc. Also, there was a problem with stale annotations, giving misleading information. E.g. there was no way to remove old/stale master.version annotations if nfd-master was scheduled on another node where it was previously running.	2023-10-05 14:53:29 +03:00
AhmedGrati	3130898d58	feat: support raw features Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-10-04 22:37:42 +01:00
Markus Lehtonen	6149000637	Build statically linked binaries Switch to fully statically linked binaries and use scratch as a base image. Switching to the virtually empty scratch base image means that the default/minimal NFD image only supports running hooks that are truly statically linked (e.g. normal go binaries that are "almost" statically linked stop working). The documentation has been already stating this (i.e. that only statically-linked binaries are supported) - i.e. we have had no promise of supporting other than that. Also, hooks are now deprecated and even disabled by default so the possibility of real user impact should be small.	2023-09-19 21:59:18 +03:00
AhmedGrati	6c895b496a	feat: ignore hidden feature files Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-09-11 15:11:31 +01:00
Markus Lehtonen	c126764d7a	cpu: drop the deprecated sgx and se labels Drop the deprecated cpu-sgx.enabled and cpu-se.enabled labels and the corresponding "raw" features. These have been replaced by cpu-security.sgx.enabled and cpu-security.se.enabled.	2023-09-08 14:28:04 +03:00
Kubernetes Prow Robot	2e6a202218	Merge pull request #1331 from andrewjamesbrown/ajb/chart_annotations Helm: conditionally add annotations if defined	2023-09-07 01:20:59 -07:00
Kubernetes Prow Robot	c0c1b89a92	Merge pull request #1334 from ArangoGutierrez/grpc_gone_v2 Deprecate gRPC API	2023-09-07 00:38:59 -07:00
Kubernetes Prow Robot	e097c3f8f6	Merge pull request #1338 from AhmedGrati/feat-add-logging-params-config-file nfd-master: add config file options for logging	2023-09-06 23:58:57 -07:00
Carlos Eduardo Arango Gutierrez	9966d2ae12	Deprecate gRPC API Now that the NodeFeature API has been enabled by default, the gRPC mode will be deprecated and with it all flags and features around it. For nfd-master, flags -port, -key-file, -ca-file, -cert-file, -verify-node-name, -enable-nodefeature-api are now marked as deprecated. For nfd-worker, flags -enable-nodefeature-api, -ca-file, -cert-file, -key-file, -server, -server-name-override are now marked as deprecated. Deprecated flags, as well as gRPC related code, will be removed in future releases. Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>	2023-09-07 06:48:15 +02:00
AhmedGrati	a6b4a7d6a9	docs: add docs of logging configuration in nfd master Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-09-06 15:36:15 +01:00
Andrew Brown	a3d26a0404	Add new helm values to documentation	2023-09-06 09:55:25 -04:00
AhmedGrati	124dfbf6df	docs: add notes in the customization guide about the feature file size limit Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-09-06 11:15:45 +01:00
Carlos Eduardo Arango Gutierrez	ade5833ee3	tls.md: Add note (#1332) * tls.md: Add note Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> * Update docs/deployment/tls.md Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com> --------- Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>	2023-09-06 01:06:52 -07:00
Kubernetes Prow Robot	50dd128b23	Merge pull request #1329 from ArangoGutierrez/1187 Enable NodeFeature API by default	2023-09-05 11:56:51 -07:00
Carlos Eduardo Arango Gutierrez	04e954a7c3	Enable NodeFeature API by default Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>	2023-09-05 20:21:31 +02:00
Kubernetes Prow Robot	0b218a1eca	Merge pull request #1285 from AhmedGrati/feat-add-expiry-date-feature-files Feat: add expiry date for feature files	2023-09-05 02:19:50 -07:00
Markus Lehtonen	cbd2c2f3df	docs: demote hooks in the customization guide Hooks are deprecated so describe feature files first.	2023-09-04 16:06:51 +03:00
Kubernetes Prow Robot	9848ef9d43	Merge pull request #1321 from ffromani/nfd-topo-updater-fix-docs docs: nfd-updater: clarify accounting	2023-09-04 00:33:49 -07:00
Francesco Romani	727875f240	docs: nfd-updater: clarify accounting Clarify that we account, and we can account, only resources exclusively allocated to Guaranteed QoS pods. Signed-off-by: Francesco Romani <fromani@redhat.com>	2023-09-04 08:51:14 +02:00
AhmedGrati	47aec15ea1	test: add unit tests for the expiration date function Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-09-01 20:04:24 +01:00
Markus Lehtonen	ae1a95f395	docs: update docs build dependencies Add webrick as that is needed. Also update other deps to their latest versions.	2023-08-30 19:31:35 +03:00
Kubernetes Prow Robot	e1f90a233b	Merge pull request #1305 from marquiz/devel/nf-gc Garbage collection of NodeFeature objects	2023-08-28 02:59:42 -07:00
Kubernetes Prow Robot	6d95e59cd0	Merge pull request #1290 from marquiz/devel/metrics-new metrics: additional metrics for nfd-master	2023-08-28 02:07:42 -07:00
Markus Lehtonen	a15b5690b6	docs: update to cover nfd-gc	2023-08-23 10:56:12 +03:00
Markus Lehtonen	ceb672bde0	deployment/helm: support nfd-gc Rename files and parameters. Drop the container security context parameters from the Helm chart. There should be no reason to run the nfd-gc with other than the minimal privileges. Also updates the documentation.	2023-08-23 10:56:12 +03:00
Kubernetes Prow Robot	536f9d17d0	Merge pull request #1295 from marquiz/devel/topology-updater-metrics nfd-topology-updater: add metrics support	2023-08-20 23:25:24 -07:00
Markus Lehtonen	b64ba37377	docs: update github-pages gem to v228 Also update other dependencies.	2023-08-16 13:51:09 +03:00
Markus Lehtonen	5ad2294c14	metrics: add nfd_node_update_requests_total counter Add a counter for total number of node update/sync requests. In practice, this counts the number of gRPC requests received if the gRPC API is in use. If the NodeFeature API is enabled, this counts the requests initiated by the NFD API controller, i.e. updates triggered by changes in NodeFeature or NodeFeatureRule objects plus updates initiated by the controller resync period.	2023-08-07 09:37:29 +03:00
Markus Lehtonen	4b24cc1afa	metrics: counters for rejected labels, extended resources and taints Add counters for labels, extended resources and taints rejected/filtered out by nfd-master.	2023-08-07 09:37:29 +03:00
Markus Lehtonen	a8a29e6df2	metrics: add nfd_nodefeaturerule_processing_errors_total counter Add a counter for errors encountered when processing NodeFeatureRules. Another simple counter without any additional prometheus labels - nfd-master logs can provide further details.	2023-08-07 09:37:29 +03:00
Markus Lehtonen	b90f2c318e	metrics: add nfd_node_update_failures_total counter Add a new counter for tracking node update failures from nfd-master. This tracks both normal feature updates and the --prune sub-command. This is a simple counter without any additional labels - nfd-master logs can be used for further diagnostics.	2023-08-07 09:37:27 +03:00
AhmedGrati	f0edc6532a	docs: add the support of the expiration date in the input format of the feature files Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-08-05 20:39:09 +01:00
Kubernetes Prow Robot	9ed191808d	Merge pull request #1296 from marquiz/docs/metrics docs: document -metrics flag in command line reference	2023-08-05 03:06:30 -07:00
Markus Lehtonen	4b7ee47e5f	docs: document -metrics flag in command line reference Document the -metrics command line flag in the command line reference of nfd-master and nfd-worker.	2023-08-04 16:49:03 +03:00
Markus Lehtonen	06b333db1e	nfd-topology-updater: add metrics support For now, add only one metric, a counter for the errors occurring while scanning pod resources on the node.	2023-08-04 16:48:37 +03:00
Markus Lehtonen	4aa7a8f8f8	source/local: support comments in input Lines starting with '#' are treated as comments and ignored when parsing feature files and hook output.	2023-08-04 16:46:22 +03:00
Markus Lehtonen	0a8b514d67	docs: unify formatting of NOTEs	2023-08-03 15:36:56 +03:00
Markus Lehtonen	a1406767a9	docs: align metrics documentation with latest changes on naming Also change table formatting and fix one incorrect description.	2023-08-01 15:53:06 +03:00
Kubernetes Prow Robot	65b7216313	Merge pull request #1283 from marquiz/docs/deprecation-policy docs: deprecation policy for Helm chart params	2023-07-25 10:46:06 -07:00
Kubernetes Prow Robot	463a737b82	Merge pull request #1277 from marquiz/docs/k8s-compat docs: describe supported Kubernetes versions	2023-07-25 08:54:06 -07:00
Markus Lehtonen	b1328b3166	docs: describe supported Kubernetes versions	2023-07-25 17:40:06 +03:00
Markus Lehtonen	b72b537261	docs: deprecation policy for Helm chart params	2023-07-24 14:06:30 +03:00
Pat Riehecky	0523257d1a	Add optional labels to the podmonitor Signed-off-by: Pat Riehecky <riehecky@fnal.gov>	2023-07-21 10:03:50 -05:00
Kubernetes Prow Robot	c9f3550237	Merge pull request #1280 from marquiz/docs/tocs docs: remove useless TOCs	2023-07-21 06:50:15 -07:00
Kubernetes Prow Robot	ebbea564a8	Merge pull request #1278 from marquiz/docs/fixes docs: fix toc of topology-updater and topology-gc reference	2023-07-21 06:50:08 -07:00
Markus Lehtonen	312ef308d1	docs: remove useless TOCs Drop table of contents from short pages where it is only cluttering the page.	2023-07-21 16:35:12 +03:00
Markus Lehtonen	f825812229	docs: document version and deprecation policy	2023-07-21 16:28:38 +03:00
Markus Lehtonen	d4d6963473	docs: fix toc of topology-updater and topology-gc reference Exclude the main title from the toc (with the empty line the "no_toc" directive took no effect).	2023-07-21 15:41:59 +03:00
Carlos Eduardo Arango Gutierrez	e3aedd33e2	Enable metrics via prometheus operator Expose metrics via prometheus.monitoring.coreos.com/v1. The exposed metrics are (metric, type, meaning): `nfd_master_build_info`, Gauge, Version from which nfd-master was built; `nfd_worker_build_info`, Gauge, Version from which nfd-worker was built; `nfd_updated_nodes`, Counter, Time taken to label a node; `nfd_crd_processing_time`, Gauge, Time taken to process a NodeFeatureRule CRD; `nfd_feature_discovery_duration_seconds`, HistogramVec, Time taken to discover features on a node. Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>	2023-07-21 10:59:52 +02:00
Kubernetes Prow Robot	407a610e0c	Merge pull request #1182 from fmuyassarov/disable-hooks-by-default hooks: disable hooks by default from v0.14	2023-06-22 04:43:40 -07:00
Carlos Eduardo Arango Gutierrez	563cc862de	Docs: Fix typo on customization-guide Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>	2023-06-09 10:23:33 +02:00
Muyassarov, Feruzjon	19527be924	hooks: disable hooks by default We have deprecated hooks in v0.12.0 but kept it enabled by default. Starting from v0.14 we are starting to disable it by default and plan to fully remove it in the near future. Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2023-06-07 13:04:23 +03:00
Simon Jürgensmeyer	307a865465	Fix missing apostrophe for jq	2023-06-07 09:53:02 +02:00
Hairong Chen	e8a00ba7da	cpu: Discover TDX guests based on cpuid information NFD already has the capability to discover whether baremetal / host machines support Intel TDX. Now, the next step is to add support for discovering whether a node is TDX protected (as in, a virtual machine started using Intel TDX). In order to do so, we've decided to go for a new `cpu-security.tdx` property, called `protected` (`cpu-security.tdx.protected`). Signed-off-by: Hairong Chen <hairong.chen@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-06-05 11:06:28 +02:00
Kubernetes Prow Robot	306969a945	Merge pull request #1133 from AhmedGrati/feat-parallelize-nodes-update feat: parallelize nodes update	2023-06-02 05:28:57 -07:00
AhmedGrati	b3cfe17392	feat: parallelize nodes update This PR aims to optimize the process of updating nodes with corresponding features. Previously, we were updating nodes sequentially even though they are independent from each other. Therefore, we integrated new components: LabelersNodePool, which is responsible for spinning up a goroutine whenever there's a request for updating nodes, and a Workqueue, which is responsible for holding the names of nodes that should be updated. Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-06-02 11:41:50 +01:00
AhmedGrati	08b9c3486e	feat: support dynamic values for labels in the NodeFeatureRule This PR aims to support dynamic values for labels in the NodeFeatureRule CRD, offering more flexible labeling for users. To achieve this, we check whether the label value starts with "@", and if so, we look up the value of the referenced feature and use it as the value of the label. Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-05-31 23:30:26 +01:00
Kubernetes Prow Robot	d28a02c5cd	Merge pull request #1222 from vaibhav2107/kustomize-type Fixed typo in Header under deployment/kustomize.md	2023-05-22 00:42:21 -07:00
Kubernetes Prow Robot	70d5ef477f	Merge pull request #1219 from PiotrProkop/leader-elect Add leader election for nfd-master	2023-05-22 00:36:21 -07:00
vaibhav2107	9f7854479f	Fixed typo in Header under deployment/kustomize.md	2023-05-18 14:59:54 +05:30
PiotrProkop	272fd4784f	Add new flag enable-leader-election for nfd-master. It allows NFD-master to be run in active-passive way when running multiple instances of NFD-master to prevent multiple components from updating same custom resources. Signed-off-by: PiotrProkop <pprokop@nvidia.com>	2023-05-15 13:30:07 +02:00
Markus Lehtonen	1200fd05c5	topology-updater: use node IP in the default configz URI Use a separate NODE_ADDRESS environment variable in the default value of -kubelet-config-uri (instead of NODE_NAME that was previously used). Also change the kustomize and Helm deployments to set this variable to node IP address. This should make the default deployment more robust, making it work in scenarios where node name does not resolve to the node ip, e.g. nodename != hostname.	2023-05-05 13:29:51 +03:00
Kubernetes Prow Robot	cd45baef8d	Merge pull request #1211 from marquiz/devel/helm deployment/helm: improve handling of topologyUpdater.kubeletStateFiles	2023-05-05 00:17:13 -07:00
Markus Lehtonen	526aab87cf	deployment/helm: use dedicated serviceaccount for topology-updater Change the configuration so that, by default, we use a dedicated serviceaccount for topology-updater (similar to topology-gc, nfd-master and nfd-worker). Fix the templates so that the serviceaccount and clusterrolebinding are only created when topology-updater is enabled (clusterrole was already handled this way). This patch also correctly documents the default value of rbac.create parameter of topology-updater and topology-gc.	2023-05-05 08:30:21 +03:00
Markus Lehtonen	9c2f268fd2	deployment/helm: improve handling of topologyUpdater.kubeletStateFiles Make it possible to disable kubelet state tracking with --set topologyUpdater.kubeletStateFiles="" as the documentation suggests. Also, fix the documentation regarding the default value of topologyUpdater.kubeletStateFiles parameter.	2023-05-04 15:01:19 +03:00
Markus Lehtonen	9685d292a2	docs: add missing .md suffix to internal references Commit `bfbc47f55e` added a lot of those and this patch tries to cover all that we missed there. Having .md suffixes in references to internal files makes it convenient to browse the document locally, just as text files as the references work correctly.	2023-04-25 15:28:07 +03:00
Kubernetes Prow Robot	2356223ffc	Merge pull request #1139 from AhmedGrati/feat-configure-master-resync feat: add master resync period configurability	2023-04-24 03:49:02 -07:00
AhmedGrati	7917434d38	feat: add master resync period configurability This PR adds a config option for setting the NFD API controller resync period. The resync period is only activated when the NodeFeature API has been enabled (with -enable-nodefeature-api). Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-04-24 11:52:38 +02:00
Carlos Eduardo Arango Gutierrez	05ef5d4e9d	cpu: expose the total number of AMD SEV ASID and ES This patch adds SEV ASIDs and the related (but distinct) SEV Encrypted State (SEV-ES) IDs as two quantities to be exposed via extended resources. In a kernel built with CONFIG_CGROUP_MISC on a suitably equipped AMD CPU, the root control group will have a misc.capacity file that shows the number of available IDs in each category. The added extended resources are: - sev.asids - sev.encrypted_state_ids Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>	2023-04-17 19:34:39 +02:00
Mikko Ylinen	de1b69a8bf	cpu: make SGX EPC resource available to NodeFeatureRules Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2023-04-14 15:31:54 +03:00
Markus Lehtonen	3320c74472	source/cpu: don't create cpu-security.tdx.total_keys label Just have that as a feature for NodeFeatureRules to consume.	2023-04-14 13:33:13 +03:00
Kubernetes Prow Robot	84c348b69f	Merge pull request #1126 from marquiz/devel/er-deprecation nfd-master: deprecate the -resource-labels flag	2023-04-13 10:52:39 -07:00
Kubernetes Prow Robot	8d71ed6755	Merge pull request #1086 from AhmedGrati/feat-support-builtin-kernel-mods feat: support builtin kernel mods	2023-04-13 10:30:40 -07:00
AhmedGrati	109caa1f28	feat: support builtin kernel mods This PR adds the combination of dynamic and builtin kernel modules into one feature called `kernel.enabledmodule`. It's a superset of the `kernel.loadedmodule` feature. Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-04-13 10:19:24 +01:00
Markus Lehtonen	8511980bf4	nfd-master: deprecate the -resource-labels flag Mark the -resource-labels flag (and the corresponding resourceLabels config option) as deprecated. We now support managing extended resources via NodeFeatureRule objects. This kludge deserves to go, eventually.	2023-04-13 11:30:58 +03:00
Markus Lehtonen	dcbb3bc450	docs: add missing mentions of extended resources and taints A small update to fix some missing mentions of extended resources and taints as assets managed by NFD.	2023-04-11 20:38:21 +03:00
Kubernetes Prow Robot	ad07829d0a	Merge pull request #1099 from ArangoGutierrez/extended_resources_v2 Create extended resources with NodeFeatureRule	2023-04-07 08:09:15 -07:00
Fabiano Fidêncio	250aea4741	Create extended resources with NodeFeatureRule Add support for management of Extended Resources via the NodeFeatureRule CRD API. There are usage scenarios where users want to advertise features as extended resources instead of labels (or annotations). This patch enables the discovery of extended resources, via annotation and patch of node.status.capacity and node.status.allocatable, by using the NodeFeatureRule API. Co-authored-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com> Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>	2023-04-07 16:14:56 +02:00
Kubernetes Prow Robot	6740224a13	Merge pull request #1100 from PiotrProkop/expose-L3-num-closid Advertise RDT L3 num_closid	2023-04-07 00:49:14 -07:00
Markus Lehtonen	cc6c20ff5f	nfd-master: disallow unprefixed and kubernetes taints Disallow taints having a key with "kubernetes.io/" or "*.kubernetes.io/" prefix. This is a precaution to protect the user from interfering with the "official" well-known taints from Kubernetes itself. The exceptions are the "nfd.node.kubernetes.io/" prefix and the NFD-specific namespace "feature.node.kubernetes.io" (and its sub-namespaces) under the kubernetes.io domain, which can be used for NFD-managed taints. Also disallow unprefixed taint keys. We don't add a default prefix to unprefixed taints (like we do for labels) from NodeFeatureRules. This is to prevent unpleasant surprises for users that need to manage matching tolerations for their workloads.	2023-04-06 16:12:37 +03:00
PiotrProkop	0e78eba40e	Advertise RDT L3 num_closid Signed-off-by: PiotrProkop <pprokop@nvidia.com>	2023-04-06 11:22:55 +02:00
Kubernetes Prow Robot	3c0c43b9be	Merge pull request #1114 from marquiz/devel/rdt-deprecate source/cpu: deprecate cpu-rdt.* labels	2023-04-05 06:21:40 -07:00
Kubernetes Prow Robot	193c552b33	Merge pull request #1084 from AhmedGrati/feat-add-master-config-file feat: add master config file	2023-04-04 10:41:40 -07:00
Markus Lehtonen	6cb5e99afa	source/cpu: deprecate cpu-rdt.* labels Document built-in RDT labels to be deprecated and removed in a future release. The plan is that the default built-in RDT labels would not be created anymore, but the RDT features would still be available for NodeFeatureRules to consume. The RDT labels are not very useful (e.g. they don't indicate if the features are really enabled in the kernel or if the resctrlfs is mounted).	2023-04-04 11:54:57 +03:00
AhmedGrati	3fff409f6d	Add master config file Similar to the nfd-worker, in this PR we want to support the dynamic run-time configurability through a config file for the nfd-master. We'll use a json or yaml configuration file along with the fsnotify in order to watch for changes in the config file. As a result, we're allowing dynamic control of logging params, allowed namespaces, extended resources, label whitelisting, and denied namespaces. Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-04-03 09:52:09 +01:00
Fabiano Fidêncio	10672e1bba	cpu: Expose the total number of keys for TDX The total amount of keys that can be used on a specific TDX system is exposed via the cgroups misc.capacity. See: ``` $ cat /sys/fs/cgroup/misc.capacity tdx 31 ``` The first step to properly manage the amount of keys present in a node is exposing it via the NFD, and that's exactly what this commit does. An example of how it ends up being exposed via the NFD: ``` $ kubectl get node 984fee00befb.jf.intel.com -o jsonpath='{.metadata.labels}' | jq | grep tdx.total_keys "feature.node.kubernetes.io/cpu-security.tdx.total_keys": "31", ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-03-31 09:12:26 +02:00
Carlos Eduardo Arango Gutierrez	7171cfd4eb	cpu: expose AMD SEV support Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Co-authored-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>	2023-03-30 15:19:43 +02:00
AhmedGrati	02b3b7c7e0	feat: add enableTaints to helm chart Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-03-21 10:49:24 +01:00
Talor Itzhak	5c6be580f4	reactive updates: add an option to disable the feature Access to the kubelet state directory may raise concerns in some setups, added an option to disable it. The feature is enabled by default. Signed-off-by: Talor Itzhak <titzhak@redhat.com>	2023-03-16 11:53:16 +02:00
Talor Itzhak	727de56191	documentation: document the reactive updates feature Signed-off-by: Talor Itzhak <titzhak@redhat.com>	2023-03-16 11:53:12 +02:00
Talor Itzhak	8924213d14	topology-updater: make it possible to disable sleep-interval Especially convenient for testing purposes and completely harmless Signed-off-by: Talor Itzhak <titzhak@redhat.com>	2023-03-12 12:43:17 +02:00
Sajiyah Salat	7082c31d6c	Update worker-configuration-reference.md	2023-03-08 21:33:44 +05:30
Sajiyah Salat	fb2d70a313	Update worker-configuration-reference.md	2023-03-08 21:28:45 +05:30
AhmedGrati	ff2dddd27d	docs: fix usage customization guide typos Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-02-27 10:26:25 +01:00
Jose Luis Ojosnegros Manchón	b340d112a8	topology-updater:compute pod set fingerprint Add an option to compute the fingerprint of the current pod set on each node. Report this new fingerprint using an attribute in NRT object.	2023-02-22 10:22:50 +01:00
Kubernetes Prow Robot	69440d7820	Merge pull request #1062 from yanggangtony/fix-doc docs: describe nfd-topology-gc in introduction.md	2023-02-21 02:17:48 -08:00
Muyassarov, Feruzjon	0e2f2c4587	go.mod: bump cpuid to v2.2.4 Bump cpuid version to v2.2.4 in the go.mod so that WRMSRNS (Non-Serializing Write to Model Specific Register) and MSRLIST (Read/Write List of Model Specific Registers) instructions are detectable. Signed-off-by: Muyassarov, Feruzjon <feruzjon.muyassarov@intel.com>	2023-02-20 22:58:59 +02:00
yanggang	150d4f4db2	docs: describe nfd-topology-gc in introduction.md Signed-off-by: yanggang <gang.yang@daocloud.io>	2023-02-18 06:12:35 +08:00
Guangwen Feng	8ad6c5b425	Fix some typos Signed-off-by: Guangwen Feng <fenggw-fnst@fujitsu.com>	2023-02-16 22:08:00 +08:00
Kubernetes Prow Robot	a92614c292	Merge pull request #1051 from AhmedGrati/feat-add-deny-label-ns-with-wildcard feat: add deny-label-ns flag which supports wildcard	2023-02-15 03:42:25 -08:00
AhmedGrati	b499799364	feat: add deny-label-ns flag which supports wildcard Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-02-15 09:47:00 +01:00
Kubernetes Prow Robot	e3b9184354	Merge pull request #1027 from marquiz/devel/image-full images: base the default image on distroless/base	2023-02-10 08:07:30 -08:00
AhmedGrati	07d5ffe4b8	helm: make master port configurable Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-02-01 10:03:06 +01:00
Markus Lehtonen	cd62f6566f	images: base the default image on distroless/base Make distroless/base the base image for the default image, effectively making the minimal image the default. Add a new "full" image variant that corresponds to the previous default image. The "*-minimal" container image tag is provided for backwards compatibility. The practical user impact of this change is that hook support is limited to statically linked ELF binaries. Bash or Perl scripts are not supported by the default image anymore, but the new "full" image variant can be used for backwards compatibility.	2023-01-31 11:30:38 +02:00
Chandan Abhyankar	d66096a491	cpu: support for detecting nx-gzip coprocessor feature Nest accelerator gzip support for IBM Power systems. Signed-off-by: Chandan Abhyankar <Chandan.Abhyankar@ibm.com>	2023-01-17 23:18:16 -08:00
Hiren Panchasara	bfbc47f55e	docs: fix internal cross-page references by injecting .md	2023-01-16 20:53:36 -08:00
PiotrProkop	3143faf0ab	Add documentation for topology garbage collector Signed-off-by: PiotrProkop <pprokop@nvidia.com>	2023-01-11 10:15:38 +01:00
Kubernetes Prow Robot	0159ab04e7	Merge pull request #1021 from fmuyassarov/docs-taint Docs: mention tainting in the intro section	2023-01-02 02:19:30 -08:00
Kubernetes Prow Robot	79cd4fc094	Merge pull request #1023 from fmuyassarov/sfr-support Bump cpuid to v2.2.3	2023-01-02 01:27:31 -08:00
Muyassarov, Feruzjon	d9dc4b09d5	Bump cpuid to v2.2.3 Bump cpuid to v2.2.3 which adds support for detecting Intel Sierra Forest instructions like AVXIFMA, AVXNECONVERT, AVXVNNIINT8 and CMPCCXADD. Signed-off-by: Muyassarov, Feruzjon <feruzjon.muyassarov@intel.com>	2022-12-30 11:42:05 +02:00
Muyassarov, Feruzjon	842153a907	Docs: mention tainting in the intro section Signed-off-by: Muyassarov, Feruzjon <feruzjon.muyassarov@intel.com>	2022-12-28 14:00:04 +02:00
Markus Lehtonen	8c0e38d0c5	docs: fix typo in CRD name	2022-12-21 13:42:10 +02:00
Markus Lehtonen	b91922746a	docs: mention NodeFeature as an extension point In the CRD intro, mention that NodeFeature can be used as an integration point for 3rd party extensions.	2022-12-21 13:26:31 +02:00
Markus Lehtonen	27c47bd088	docs: better document differences between deployment methods	2022-12-20 16:29:48 +02:00
Markus Lehtonen	3209c14bea	docs: document NodeFeature API Document the usage of the NodeFeature CRD API. Also re-organize the documentation a bit, moving the description of NodeFeatureRule controller from customization guide to nfd-master usage page.	2022-12-14 22:33:12 +02:00
Markus Lehtonen	9f0806593d	nfd-master: rename -featurerules-controller flag to -crd-controller Deprecate the '-featurerules-controller' command line flag as the name does not describe the functionality anymore: in practice it controls the CRD controller handling both NodeFeature and NodeFeatureRule objects. The patch introduces a duplicate, more generally named, flag '-crd-controller'. A warning is printed in the log if '-featurerules-controller' flag is encountered.	2022-12-14 10:23:45 +02:00
Markus Lehtonen	5a717c418b	docs: small reordering of master cmdline reference Move documentation of -enable-taints near '-enable-nodefeature-api' and '-no-publish' as they are related in that they control the enablement of APIs.	2022-12-14 07:31:28 +02:00
Markus Lehtonen	6ddd87e465	nfd-master: support NodeFeature objects Add initial support for handling NodeFeature objects. With this patch nfd-master watches NodeFeature objects in all namespaces and reacts to changes in any of these. The node which a certain NodeFeature object affects is determined by the "nfd.node.kubernetes.io/node-name" annotation of the object. When a NodeFeature object targeting certain node is changed, nfd-master needs to process all other objects targeting the same node, too, because there may be dependencies between them. Add a new command line flag for selecting between gRPC and NodeFeature CRD API as the source of feature requests. Enabling NodeFeature API disables the gRPC interface. -enable-nodefeature-api enable NodeFeature CRD API for incoming feature requests, will disable the gRPC interface (defaults to false) It is not possible to serve gRPC and watch NodeFeature objects at the same time. This is deliberate to avoid labeling races e.g. by nfd-worker sending gRPC requests but NodeFeature objects in the cluster "overriding" those changes (labels from the gRPC requests will get overridden when NodeFeature objects are processed).	2022-12-14 07:31:28 +02:00
Markus Lehtonen	237494463b	nfd-worker: support creating NodeFeatures object Support the new NodeFeatures object of the NFD CRD api. Add two new command line options to nfd-worker: -kubeconfig specifies the kubeconfig to use for connecting k8s api (defaults to empty which implies in-cluster config) -enable-nodefeature-api enable the NodeFeature CRD API for communicating node features to nfd-master, will also automatically disable gRPC (defaults to false) No config file option for selecting the API is available as there should be no need for dynamically selecting between gRPC and CRD. The nfd-master configuration must be changed in tandem and it is safer (and avoids awkward configuration races) to configure the whole NFD deployment at once. Default behavior of nfd-worker is not changed i.e. NodeFeatures object creation is not enabled by default (but must be enabled with the command line flag). The patch also updates the kustomize and Helm deployment, adding RBAC rules for nfd-worker and updating the example worker configuration.	2022-12-14 07:31:28 +02:00
Kubernetes Prow Robot	776a8c335c	Merge pull request #980 from marquiz/devel/topology-updater nfd-topology-updater: update NodeResourceTopology objects directly	2022-12-08 01:44:22 -08:00
Markus Lehtonen	f13ed2d91c	nfd-topology-updater: update NodeResourceTopology objects directly Drop the gRPC communication to nfd-master and connect to the Kubernetes API server directly when updating NodeResourceTopology objects. Topology-updater already has a connection to the API server for listing Pods so this is not that dramatic a change. It also simplifies the code a lot as there is no need for the NFD gRPC client and no need for managing TLS certs/keys. This change aligns nfd-topology-updater with the future direction of nfd-worker where the gRPC API is being dropped and replaced by a CRD-based API. This patch also updates deployment files and documentation to reflect this change.	2022-12-08 11:03:22 +02:00
Markus Lehtonen	881ee13654	docs: remove non-existent nodeFeatureRule.createCRD parameter This value was recently dropped.	2022-12-07 16:25:43 +02:00
Markus Lehtonen	0834ec5cbf	go.mod: update to klauspost/cpuid to v2.2.2 Support detection of Intel TME (Total Memory Encryption) plus AMXFP16 and PREFETCHI.	2022-12-07 13:58:19 +02:00
Feruzjon Muyassarov	984a3de198	Document tainting feature Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2022-12-02 17:29:10 +02:00
Kubernetes Prow Robot	f740f084e0	Merge pull request #976 from marquiz/docs/customization-guide docs: small update to customization guide	2022-12-01 12:51:55 -08:00
Markus Lehtonen	72e523f277	scripts/mdlint: update mdlint to v0.12.0	2022-12-01 20:57:21 +02:00
Markus Lehtonen	32b252147c	docs: small update to customization guide Add a reference to the label rule format in the NodeFeatureRule section. Also make it explicit in the beginning of Hooks section that hooks are deprecated.	2022-12-01 18:33:48 +02:00
Markus Lehtonen	8a45384037	docs: simplify quick-start page Move topology-updater deployment notes to the topology-updater usage page. Also, rework the plaintext and headings a bit.	2022-12-01 12:22:23 +02:00
Markus Lehtonen	cdc7558f6f	docs: better document custom resources Add a separate page for describing the custom resources used by NFD. Simplify the Introduction page by moving the details of NodeResourceTopology from there. Similarly, drop long NodeResourceTopology example from the quick-start page, making the page shorter and simpler.	2022-12-01 11:12:59 +02:00
Kubernetes Prow Robot	efc833d1c7	Merge pull request #970 from marquiz/docs/worker-helm-sa-params docs: document helm chart params related to worker serviceaccount	2022-11-28 08:36:08 -08:00
Markus Lehtonen	d0a4cf7564	docs: document helm chart params related to worker serviceaccount	2022-11-28 18:07:17 +02:00
Markus Lehtonen	c1fa8b2f28	docs: revise topology-updater helm chart rbac parameters	2022-11-28 17:49:19 +02:00
Markus Lehtonen	eb8e29c80a	nfd-worker: drop deprecated command line flags Drop the following flags that were deprecated already in v0.8.0: -sleep-interval (replaced by core.sleepInterval config file option) -label-whitelist (replaced by core.labelWhiteList config file option) -sources (replaced by -label-sources flag)	2022-11-23 22:33:51 +02:00
Talor Itzhak	d495376f06	docs: topology-updater: update docs for exclude-list feature Update the docs with explanations and examples about the exclude-list feature. Signed-off-by: Talor Itzhak <titzhak@redhat.com>	2022-11-21 21:31:51 +02:00
Markus Lehtonen	6f49421c0e	docs: update github-pages gem to v227	2022-11-16 21:08:13 +02:00
Garrybest	3ec1b94020	get kubelet config from configz Signed-off-by: Garrybest <garrybest@foxmail.com>	2022-11-08 23:52:35 +08:00
Markus Lehtonen	6171c745a4	docs: restructure docs Introduce two main sections "Deployment" and "Usage" and move "Developer guide" to the top level, too. In particular, split the huge deployment-and-usage file into multiple parts under the new main sections. Move customization guide from "Advanced" to "Usage". This patch also renames "Advanced" to "Reference" as the only thing left there is reference documentation.	2022-11-03 10:26:56 +02:00
Markus Lehtonen	3a279ce751	docs: update the name of the base image	2022-11-02 15:10:46 +02:00
Kubernetes Prow Robot	e5c8180558	Merge pull request #937 from pacoxu/master Stop using the beta.kubernetes.io/os and arch labels	2022-10-27 05:36:32 -07:00
Paco Xu	4e12ed8aac	Stop using the beta.kubernetes.io/os and arch labels	2022-10-27 11:03:14 +08:00
Fabiano Fidêncio	d5db1cf907	cpu: Discover Intel TDX Set `cpu-security.tdx.enable` to `true` when TDX is available and has been enabled; otherwise it'll be set to `false`. The presence and content of `/sys/module/kvm_intel/parameters/tdx` are used to detect whether a CPU is Intel TDX capable. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-10-03 09:56:24 +02:00
Kubernetes Prow Robot	8662d17530	Merge pull request #871 from fmuyassarov/disable-hook Config option to disable hooks	2022-09-26 10:40:08 -07:00
Markus Lehtonen	db7dd93a64	docs: fix incorrect shell snippet for removing labels	2022-09-15 16:18:09 +03:00
Markus Lehtonen	f21315d85f	Update kubernetes registry to registry.k8s.io Update registry location for non-nfd images.	2022-09-12 11:23:04 +03:00
Markus Lehtonen	4f34451db8	Update NFD registry to registry.k8s.io Kubernetes has moved to a new container image registry: https://groups.google.com/a/kubernetes.io/g/dev/c/DYZYNQ_A6_c/m/FpHqeVR2BAAJ	2022-09-12 11:21:12 +03:00
Kubernetes Prow Robot	77af16fe9d	Merge pull request #880 from fmuyassarov/add-tiltfile/feruz Add Tilt option for developing NFD	2022-09-06 12:06:23 -07:00
Kubernetes Prow Robot	81da164b7f	Merge pull request #833 from marquiz/devel/security-refactor cpu: re-organize security features	2022-09-01 05:29:06 -07:00
Feruzjon Muyassarov	e7af8d068f	Update documentation about hooks deprecation Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2022-09-01 10:58:35 +03:00
Feruzjon Muyassarov	a675fd93fd	Don't advertise BASE_IMAGE_FULL and BASE_IMAGE_MINIMAL Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2022-08-30 17:37:01 +03:00


414 commits