node-feature-discovery

mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00

Author	SHA1	Message	Date
Markus Lehtonen	649036977e	nfd-worker: improved log when creating NodeFeature object Don't log an empty NodeFeature object.	2024-05-23 14:37:26 +03:00
Markus Lehtonen	560bd11d85	Re-add -enable-nodefeature-api cmdline flag Bring back the -enable-nodefeature-api command line flag and the corresponding enableNodeFeatureApi helm config value that were removed without deprecation when the NodeFeatureAPI feature gate was introduced. The thinking behind this change is to not break existing users (without warning) unless totally unavoidable. Now the -enable-nodefeature-api flag is marked as deprecated and slated for removal in NFD v0.17. The NodeFeatureAPI feature gate and the -enable-nodefeature-api flag work together so that the NodeFeature API is disabled (gRPC is enabled, instead) if either of them is set to false. This patch selectively reverts parts of `06c4733bc5`.	2024-05-16 10:53:49 +03:00
Kubernetes Prow Robot	6b80f654d4	Merge pull request #1600 from ArangoGutierrez/e2e-not-k8s Move NFD api to a separate go mod	2024-04-09 02:06:06 -07:00
Carlos Eduardo Arango Gutierrez	3434557d7c	Move NFD api to a separate go mod Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>	2024-04-05 16:35:47 +02:00
Markus Lehtonen	26a80cf142	Tidy up usage of channels for signaling This started as a small effort to simplify the usage of "ready" channel in nfd-master. It extended into a wider simplification/unification of the channel usage.	2024-04-05 14:39:58 +03:00
Oleg Zhurakivskyy	8b63d17af7	nfd-worker: Add liveness probe Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>	2024-03-19 15:34:53 +02:00
Carlos Eduardo Arango Gutierrez	06c4733bc5	Add FeatureGate framework to handle new features Code inspired by https://github.com/kubernetes/component-base/tree/master/featuregate Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>	2024-03-15 19:11:32 +01:00
Carlos Eduardo Arango Gutierrez	69dbfdfbc0	Use close to signal stop channel in worker and topology-updater Fix stop channel management on Worker and T-updater in case of multiple callers Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>	2024-03-14 15:28:39 +01:00
Markus Lehtonen	acf815fb10	pkg/utils: move GetKubeconfig from pkg/apihelper here This change is part of an effort to remove the pkg/apihelper package. GetKubeconfig is useful helper functionality shared across the codebase so move it into a "safe" location.	2024-01-24 16:10:02 +02:00
Gyuho Lee	ed0418b81c	chore(nfd-worker): fix minor typo in wrong label value format error Signed-off-by: Gyuho Lee <gyuho@lepton.ai>	2023-12-19 02:29:37 +08:00
Markus Lehtonen	cb0a46ec0e	Use generics for maps and slices	2023-12-13 12:09:53 +02:00
Markus Lehtonen	34574f4211	nfd-worker: set owner reference in NodeFeature objects This patch creates an owner-dependent relationship between the nfd-worker pod and the NodeFeature object that it creates. With this change the orphaned NodeFeature object(s) get automatically garbage-collected when the nfd-worker pod goes away, without the need for manual clean-up actions.	2023-12-08 14:57:31 +02:00
Markus Lehtonen	f266533a7d	nfd-worker: fix typo in log message	2023-11-24 17:17:42 +02:00
Markus Lehtonen	1d012a28cd	Option to stop implicitly adding default prefix to names Add new autoDefaultNs (default is "true") config option to nfd-master. Setting the config option to false stops NFD from automatically adding the "feature.node.kubernetes.io/" prefix to labels, annotations and extended resources. Taints are not affected as for them no prefix is automatically added. The user-visible effect of changing the option is that NodeFeatureRules, local feature files, hooks and configuration of the "custom" source may need to be altered (if the auto-prefixing is relied on). For now, the config option defaults to "true", meaning no change in default behavior. However, the intent is to change the default to "false" in a future release, deprecating the option and eventually removing it (forcing it to "false"). The goal of stopping "auto-prefixing" is to simplify the operation (of nfd and users). Make the naming more straightforward and easier to understand and debug (kind of WYSIWYG), eliminating peculiar corner cases: 1. Make validation simpler and unambiguous 2. Remove "overloading" of names, i.e. the mapping of two values to the same actual name. E.g. previously something like labels: feature.node.kubernetes.io/foo: bar foo: baz Could actually result in node label: feature.node.kubernetes.io/foo: baz 3. Make the processing/usage of the "rule.matched" and "local.labels" feature in NodeFeatureRules unambiguous and more understandable. E.g. previously you could have node label "feature.node.kubernetes.io/local-foo: bar" but in the NodeFeatureRule you'd need to use the unprefixed name "local-foo" or the fully prefixed name, depending on what was specified in the feature file (or hook) on the node(s). NOTE: setting autoDefaultNs to false is a breaking change for users who rely on automatic prefixing with the default feature.node.kubernetes.io/ namespace. NodeFeatureRules, feature files, hooks and custom rules (configuration of the "custom" source of nfd-worker) will need to be altered. Unprefixed labels, annotations and extended resources will be denied by nfd-master.	2023-11-24 12:48:20 +02:00
Markus Lehtonen	5171ae0f90	Refactor metrics Move common boilerplate code under pkg/utils.	2023-10-09 10:49:12 +03:00
AhmedGrati	7ab6314bdc	chore: introduce a common klog handling for cmd/nfd-* Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-09-07 22:38:15 +01:00
Carlos Eduardo Arango Gutierrez	e3aedd33e2	Enable metrics via prometheus operator Expose metrics via prometheus.monitoring.coreos.com/v1 The exposed metrics are \| Metric \| Type \| Meaning \| \| --------------- \| ---------------- \| ---------------- \| \| `nfd_master_build_info` \| Gauge \| Version from which nfd-master was built. \| \| `nfd_worker_build_info` \| Gauge \| Version from which nfd-worker was built. \| \| `nfd_updated_nodes` \| Counter \| Time taken to label a node \| \| `nfd_crd_processing_time` \| Gauge \| Time taken to process a NodeFeatureRule CRD \| \| `nfd_feature_discovery_duration_seconds` \| HistogramVec \| Time taken to discover features on a node \| Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com> Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>	2023-07-21 10:59:52 +02:00
Markus Lehtonen	bf670de68d	pkg/utils: migrate KlogDump to structured logging Drop the KlogDump helper in favor of klog.InfoS. However, this patch introduces a new DelayedDumper() helper to avoid processing (marshalling) of the object unless really evaluated by the logging function.	2023-05-31 14:43:08 +03:00
Markus Lehtonen	7be08f9e7f	nfd-worker: migrate to structured logging	2023-05-31 14:43:08 +03:00
AhmedGrati	7917434d38	feat: add master resync period configurability This PR adds a config option for setting the NFD API controller resync period. The resync period is only activated when the NodeFeature API has been enabled (with -enable-nodefeature-api). Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-04-24 11:52:38 +02:00
AhmedGrati	d0a6289c0f	chore: add debug dump of nfd worker configuration Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-03-18 00:49:07 +01:00
Ville Pihlava	b1c6b229fe	Add discovery duration logging.	2023-02-13 12:55:57 +02:00
Ville Pihlava	2101cb20e4	Change nfd-worker to use Ticker instead of After.	2023-02-09 17:14:39 +02:00
Markus Lehtonen	1026d91d12	worker: move code Simplify code by dropping the unnecessary base client package.	2022-12-23 11:38:21 +02:00
Markus Lehtonen	112744bc50	nfd-worker: split out gRPC connection handling Refactor the worker code and split out gRPC client connection handling into a separate base type. The intent is to promote re-usability of code for other NFD clients, too.	2021-08-20 15:29:27 +03:00
Kubernetes Prow Robot	c0e1000a7d	Merge pull request #474 from marquiz/devel/worker-log-verbosity nfd-worker: don't log labels returned by sources by default	2021-03-15 12:52:34 -07:00
Markus Lehtonen	6c6249a599	nfd-worker: don't log labels returned by sources by default Reduce default log verbosity. Only print out labels if log verbosity is 1 or higher ('core.klog.v: 1' config file option or '-v 1' on command line). Also, dump the labels in a reproducible (sorted) format.	2021-03-15 21:42:33 +02:00
Markus Lehtonen	2d20a2ff7c	nfd-worker: support certificate rotation Watch for changes in TLS files and re-connect to nfd-master in the event of changes.	2021-03-09 14:40:51 +02:00
Markus Lehtonen	dfc2596a22	pkg/utils: generalize file watcher Add the capability to watch multiple files. Move it to a separate package in order to make it reusable.	2021-03-09 14:20:34 +02:00
Markus Lehtonen	dd7691c486	nfd-worker: improve log messages of config handling	2021-03-02 18:49:58 +02:00
Carlos Eduardo Arango Gutierrez	389a8f87cf	logging: start log messages with lower case Standardize logs to be lower case. Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>	2021-03-01 10:07:21 -05:00
Markus Lehtonen	5e6f0779e9	nfd-worker: stop masking crashes in feature discovery The code should be stable enough. If there are fatal bugs causing the discovery to panic/segfault that should be made visible instead of semi-silently hiding it. Also, this caused one (negative) test case to fail undetected which is now fixed.	2021-03-01 09:14:19 +02:00
Markus Lehtonen	3f18e880b4	nfd-worker: dynamic configuration of klog Make it possible to dynamically (at run-time) alter most of the logging configuration from the config file.	2021-02-25 16:10:43 +02:00
Markus Lehtonen	7da7fde8f6	nfd-worker: switch to klog Greatly expands logging capabilities and flexibility with verbosity options, among other things.	2021-02-25 16:10:43 +02:00
Markus Lehtonen	3fd61eacdb	nfd-worker: switch to flag in command line parsing	2021-02-24 12:06:16 +02:00
Markus Lehtonen	2b24ed2c18	nfd-worker: implement Stop() method	2021-02-17 21:50:58 +02:00
Markus Lehtonen	c2c9dff724	nfd-worker: bail out on invalid config file Changes the behaviour so that if the specified configuration file exists it must be valid. Error out at startup if the config is invalid. Similarly, exit with an error at runtime if the config file becomes invalid. Bailing out, instead of just printing an error, was a deliberate choice in order to make configuration mistakes evident. Having no configuration file is tolerated, however. If the specified configuration file does not exist nfd-worker resorts to default settings.	2021-02-17 21:42:50 +02:00
Markus Lehtonen	7e88f00e05	nfd-worker: add core.sources config option Add a config file option for controlling the enabled feature sources, aimed at replacing the --sources command line flag which is now marked as deprecated. The command line flag takes precedence over the config file option.	2021-02-17 21:36:20 +02:00
Markus Lehtonen	ed177350fc	nfd-worker: add core.labelWhiteList config option Add a config file option for label whitelisting. Deprecate the --label-whitelist command line flag. Note that the command line flag has higher priority than the config file option.	2021-02-17 21:35:44 +02:00
Markus Lehtonen	d1d8de944e	nfd-worker: add core.sleepInterval config option Add a new config file option for (dynamically) controlling the sleep interval. At the same time, deprecate the --sleep-interval command line flag. The command line flag takes precedence over the config file option.	2021-02-17 21:35:13 +02:00
Markus Lehtonen	e6bdc17d8c	nfd-worker: add core config Allows dynamic (re-)configuration of most nfd-worker options. The goal is to have most configuration parameters specified in the configuration file and deprecate most of the command line flags. The priority is intended to be such that command line flags override whatever is specified in the configuration file. Thus, specifying something on the command line effectively disables dynamic configurability of that parameter. This patch adds core.noPublish config file option to demonstrate how the new mechanism is supposed to work. The --no-publish command line flag takes precedence over this config file option.	2021-02-17 21:35:12 +02:00
Markus Lehtonen	29910464a0	nfd-worker: always re-label after a re-config event Always do re-discovery and re-labeling after a configuration file change. This way the new config comes into effect immediately, even if the sleep interval is long (or infinite).	2021-02-10 22:09:27 +02:00
Markus Lehtonen	b6ff514853	nfd-worker: use fsnotify for watching for config file changes Add support for detecting configuration file changes via file system notifications (fsnotify). Watches are added for the whole directory chain (up to root directory) so that all changes (even directory renames) affecting the given configuration file path are captured. Previously dynamic (re-)configuration of nfd-worker was implemented by (re-)reading the configuration file on every labeling pass. This was simple and effective, even if a bit wasteful. However, it didn't provide asynchronous configuration updates that will be required for e.g. controlling the "sleep-interval" parameter dynamically which will be implemented by later patches.	2021-02-10 22:09:27 +02:00
Markus Lehtonen	6958a6677f	nfd-worker: use timer channel for sleep interval	2021-02-10 22:09:27 +02:00
Markus Lehtonen	29cbb2429c	nfd-worker: add special handling for --sources=all A new special value 'all' is a shortcut for enabling all feature sources. It should be the only name specified -- if any other names are specified 'all' does not take effect and only the listed feature sources are enabled. E.g. --sources=all enables all sources, but --sources=all,cpu only enables the cpu source. Also, print a warning if unknown sources are specified.	2020-11-20 16:23:53 +02:00
Markus Lehtonen	9e813a559c	nfd-worker: reload config on each re-discovery pass Dumb re-read/re-parse of the configuration file on every round of discovery. Probably not the most elegant solution to watch for config file changes, but, it works and doesn't cost much overhead.	2020-05-21 00:59:39 +03:00
Markus Lehtonen	a2b9df5cd3	nfd-worker: rework configuration handling Extend the FeatureSource interface with new methods for configuration handling. This enables easier on-the-fly reconfiguration of the feature sources. Further, it simplifies adding config support to feature sources in the future. Stub methods are added to sources that do not currently have any configurability. The patch fixes some (corner) cases with the overrides (--options) handling, too: - Overrides were not applied if config file was missing or its parsing failed - Overrides for a certain source did not have effect if an empty config for the source was specified in the config file. This was caused by the first pass of parsing (config file) setting a nil pointer to the source-specific config, effectively detaching it from the main config. The second pass would then create a new instance of the source specific config, but, this was not visible in the feature source, of course.	2020-05-21 00:59:37 +03:00
Markus Lehtonen	c95ad3198c	nfd-worker: refactor handling of enabled sources and labels Make the list of enabled sources and the label whitelist regexp members of the nfdWorker instance. Get rid of the not-that-well-defined configureParameters() function.	2020-05-21 00:48:21 +03:00
Markus Lehtonen	818fc4cc70	nfd-worker: fix --label-whitelist Unify handling of --label-whitelist in nfd-worker and nfd-master. That is, in nfd-worker, apply the regexp filter on non-namespaced part of the label name. Brief history: 1. Originally the whitelist regexp was applied on the full namespaced label name (that would be e.g. 'feature.node.kubernetes.io/cpu-cpuid.AVX' in the current nfd version) 2. Commit `81752b2d` changed the behavior so that the regexp was applied on the non-namespaced part (that would be `cpu-cpuid.AVX`) 3. Commit `40918827` added support for custom label namespaces. With this change, the label whitelist handling diverged between nfd-worker and nfd-master. In nfd-master the whitelist regexp is always applied on the non-namespaced label name. However, in nfd-worker the whitelist handling is two-fold (and inconsistent): for labels in the standard nfd namespace the regexp is applied on the non-namespaced part (e.g. `cpu-cpuid.AVX`), but for labels in custom namespaces the regexp is applied on the full name (e.g. `example.com/my-feature`). This patch changes nfd-worker to behave similarly to nfd-master. The namespace part is now always omitted, which should be easier for the users to comprehend. Also, fixes a bug in the label name prefixing so that the name of the feature source is not prefixed into labels with custom label namespace (effectively mangling the intended namespace). For example, previously a 'example.com/feature' label from the 'custom' feature source would be prefixed with the source name, mangling it to 'custom-example.com/feature'.	2020-05-20 23:07:13 +03:00
Markus Lehtonen	a65d05bd9c	source/panic_fake: rename module to make lint happy	2020-05-20 21:48:06 +03:00

64 commits
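
The --sources=all rule from commit 29cbb2429c ('all' only takes effect when it is the sole name given, and unknown names are reported) can be sketched in Go as follows. This is an illustrative sketch, not the nfd-worker code: resolveSources and the knownSources list are hypothetical, and silently skipping 'all' when it appears alongside other names is an assumption about how "does not take effect" is handled.

```go
package main

import (
	"fmt"
	"strings"
)

// knownSources is a hypothetical list of feature source names used only
// for this illustration.
var knownSources = []string{"cpu", "kernel", "memory", "pci"}

// resolveSources is a hypothetical helper that interprets the value of a
// --sources style flag. It returns the enabled sources and any names that
// were not recognized (so the caller can print a warning).
func resolveSources(flagValue string) (enabled, unknown []string) {
	names := strings.Split(flagValue, ",")

	// 'all' is a shortcut only when it is the single name specified.
	if len(names) == 1 && strings.TrimSpace(names[0]) == "all" {
		return append([]string(nil), knownSources...), nil
	}

	known := make(map[string]bool, len(knownSources))
	for _, s := range knownSources {
		known[s] = true
	}
	for _, n := range names {
		n = strings.TrimSpace(n)
		switch {
		case n == "" || n == "all":
			// Assumption: 'all' mixed with other names is ignored.
		case known[n]:
			enabled = append(enabled, n)
		default:
			unknown = append(unknown, n)
		}
	}
	return enabled, unknown
}

func main() {
	for _, v := range []string{"all", "all,cpu", "cpu,bogus"} {
		enabled, unknown := resolveSources(v)
		fmt.Printf("--sources=%s enabled=%v unknown=%v\n", v, enabled, unknown)
	}
}
```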