node-feature-discovery

mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2025-03-30 19:54:46 +00:00

Author	SHA1	Message	Date
AhmedGrati	a5624cc8ca	chore: update config file in helm deployment Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-09-06 16:05:02 +01:00
Kubernetes Prow Robot	407a610e0c	Merge pull request #1182 from fmuyassarov/disable-hooks-by-default hooks: disable hooks by default from v0.14	2023-06-22 04:43:40 -07:00
Muyassarov, Feruzjon	19527be924	hooks: disable hooks by default Hooks were deprecated in v0.12.0 but remained enabled by default. Starting from v0.14 they are disabled by default, and the plan is to remove them fully in the near future. Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2023-06-07 13:04:23 +03:00
Hairong Chen	e8a00ba7da	cpu: Discover TDX guests based on cpuid information NFD already has the capability to discover whether baremetal / host machines support Intel TDX. Now, the next step is to add support for discovering whether a node is TDX protected (as in, a virtual machine started using Intel TDX). In order to do so, we've decided to go for a new `cpu-security.tdx` property, called `protected` (`cpu-security.tdx.protected`). Signed-off-by: Hairong Chen <hairong.chen@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-06-05 11:06:28 +02:00
AhmedGrati	b3cfe17392	feat: parallelize nodes update This PR aims to optimize the process of updating nodes with corresponding features. Previously, we were updating nodes sequentially even though they are independent of each other. Therefore, we integrated new components: LabelersNodePool, which is responsible for spinning up a goroutine whenever there's a request for updating nodes, and a Workqueue, which is responsible for holding the names of nodes that should be updated. Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-06-02 11:41:50 +01:00
Kubernetes Prow Robot	70d5ef477f	Merge pull request #1219 from PiotrProkop/leader-elect Add leader election for nfd-master	2023-05-22 00:36:21 -07:00
PiotrProkop	272fd4784f	Add new flag enable-leader-election for nfd-master. It allows NFD-master to be run in an active-passive way when running multiple instances of NFD-master, to prevent multiple components from updating the same custom resources. Signed-off-by: PiotrProkop <pprokop@nvidia.com>	2023-05-15 13:30:07 +02:00
Markus Lehtonen	1200fd05c5	topology-updater: use node IP in the default configz URI Use a separate NODE_ADDRESS environment variable in the default value of -kubelet-config-uri (instead of NODE_NAME that was previously used). Also change the kustomize and Helm deployments to set this variable to the node IP address. This should make the default deployment more robust, making it work in scenarios where the node name does not resolve to the node IP, e.g. nodename != hostname.	2023-05-05 13:29:51 +03:00
Markus Lehtonen	c8a722b7c3	deployment/kustomize: drop pod-resources mount for topology-updater This mount is redundant as it's already included in the kubelet state files (/var/lib/kubelet) mount.	2023-05-04 11:06:55 +03:00
AhmedGrati	7917434d38	feat: add master resync period configurability This PR adds a config option for setting the NFD API controller resync period. The resync period is only activated when the NodeFeature API has been enabled (with -enable-nodefeature-api). Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-04-24 11:52:38 +02:00
Kubernetes Prow Robot	8d71ed6755	Merge pull request #1086 from AhmedGrati/feat-support-builtin-kernel-mods feat: support builtin kernel mods	2023-04-13 10:30:40 -07:00
AhmedGrati	109caa1f28	feat: support builtin kernel mods This PR adds the combination of dynamic and builtin kernel modules into one feature called `kernel.enabledmodule`. It's a superset of the `kernel.loadedmodule` feature. Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-04-13 10:19:24 +01:00
Kubernetes Prow Robot	193c552b33	Merge pull request #1084 from AhmedGrati/feat-add-master-config-file feat: add master config file	2023-04-04 10:41:40 -07:00
AhmedGrati	3fff409f6d	Add master config file Similar to nfd-worker, this PR adds support for dynamic run-time configuration of nfd-master through a config file. We use a JSON or YAML configuration file along with fsnotify in order to watch for changes in the config file. As a result, we allow dynamic control of logging params, allowed namespaces, extended resources, label whitelisting, and denied namespaces. Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>	2023-04-03 09:52:09 +01:00
Talor Itzhak	8afd819132	deployment/topology-updater: add mount for kubelet state dir This mount is needed for watching the state files Signed-off-by: Talor Itzhak <titzhak@redhat.com>	2023-03-12 12:43:13 +02:00
Markus Lehtonen	33a1e3d114	kustomize: drop mount for kubelet config in topology-updater We use the configz endpoint nowadays.	2023-03-09 17:48:56 +02:00
Talor Itzhak	8b5918a2e9	kustomize: topology-updater: enable the configuration via kustomization Add a kustomization file with a config example for the exclude-list. Signed-off-by: Talor Itzhak <titzhak@redhat.com>	2022-11-21 21:31:14 +02:00
Garrybest	3ec1b94020	get kubelet config from configz Signed-off-by: Garrybest <garrybest@foxmail.com>	2022-11-08 23:52:35 +08:00
Feruzjon Muyassarov	56d5da2ce0	Add a config option to disable hooks of local feature Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2022-09-01 10:58:31 +03:00
Markus Lehtonen	7e8f96e7e1	deployment: drop legacy custom rules from the worker conf sample They are still supported but no need to advertise them.	2021-12-22 09:21:26 +02:00
Markus Lehtonen	df6909ed5e	nfd-worker: add core.featureSources config option Add a configuration option for controlling the enabled "raw" feature sources. This is useful e.g. in testing and development, plus it also allows fully shutting down discovery of features that are not needed in a deployment. Supplements core.labelSources which controls the enablement of label sources.	2021-12-03 09:42:35 +02:00
Markus Lehtonen	ad9c7dfa1e	nfd-worker: rename config option 'sources' to 'labelSources' The goal is to make the name more descriptive. Also keeping in mind a possible future addition of a 'featureSources' option (or similar) for controlling the feature discovery.	2021-12-01 17:11:49 +02:00
Markus Lehtonen	f75303ce43	pkg/apis/nfd: add variables to rule spec and support backreferences Support backreferencing of output values from previous rules. Enables complex rule setups where custom features are further combined together to form even more sophisticated higher level labels. The labels created by preceding rules are available as a special 'rule.matched' feature (for matchFeatures to use). If referencing rules across multiple configs/CRDs care must be taken with the ordering. Processing order of rules in nfd-worker: 1. Static rules 2. Files from /etc/kubernetes/node-feature-discovery/custom.d/ in alphabetical order. Subdirectories are processed by reading their files in alphabetical order. 3. Custom rules from main nfd-worker.conf In nfd-master, NodeFeatureRule objects are processed in alphabetical order (based on their metadata.name). This patch also adds a new 'vars' field to the rule spec. Like 'labels', it is a map of key-value pairs but no labels are generated from these. The values specified in 'vars' are only added for backreferencing into the 'rule.matched' feature. This may be desired in schemes where the output of certain rules is only used as intermediate variables for other rules and no labels out of these are wanted. An example setup: - name: "kernel feature" labels: kernel-feature: matchFeatures: - feature: kernel.version matchExpressions: major: {op: Gt, value: ["4"]} - name: "intermediate var feature" vars: nolabel-feature: "true" matchFeatures: - feature: cpu.cpuid matchExpressions: AVX512F: {op: Exists} - feature: pci.device matchExpressions: vendor: {op: In, value: ["8086"]} device: {op: In, value: ["1234", "1235"]} - name: top-level-feature matchFeatures: - feature: rule.matched matchExpressions: kernel-feature: "true" nolabel-feature: "true"	2021-11-25 12:50:47 +02:00
Kubernetes Prow Robot	da484b7bd3	Merge pull request #550 from marquiz/devel/custom-templating Templating of custom label names	2021-11-23 12:02:51 -08:00
Markus Lehtonen	c8d73666d6	pkg/apis/nfd: support label name templating Support templating of label names in feature rules. It is available both in NodeFeatureRule CRs and in custom rule configuration of nfd-worker. This patch adds a new 'labelsTemplate' field to the rule spec, making it possible to dynamically generate multiple labels per rule based on the matched features. The feature relies on the golang "text/template" package. When expanded, the template must contain labels in a raw <key>[=<value>] format (where 'value' defaults to "true"), separated by newlines i.e.: - name: <rule-name> labelsTemplate: | <label-1>[=<value-1>] <label-2>[=<value-2>] ... All the matched features of 'matchFeatures' directives are available to the templating engine in a nested data structure that can be described in yaml as: . <domain-1>: <key-feature-1>: - Name: <matched-key> - ... <value-feature-1>: - Name: <matched-key> Value: <matched-value> - ... <instance-feature-1>: - <attribute-1-name>: <attribute-1-value> <attribute-2-name>: <attribute-2-value> ... - ... <domain-2>: ... That is, the per-feature data available for matching depends on the type of feature that was matched: - "key features": only 'Name' is available - "value features": 'Name' and 'Value' can be used - "instance features": all attributes of the matched instance are available NOTE: In case matchAny is specified, the template is executed separately against each individual matchFeatures matcher and the eventual set of labels is a superset of all these expansions. Consider the following: - name: <name> labelsTemplate: <template> matchAny: - matchFeatures: <matcher#1> - matchFeatures: <matcher#2> matchFeatures: <matcher#3> In the example above (assuming the overall result is a match) the template would be executed on matcher#1 and/or matcher#2 (depending on whether both or only one of them match), and finally on matcher#3, and all the labels from these separate expansions would be created (i.e. the end result would be a union of all the individual expansions). NOTE 2: The 'labels' field has priority over 'labelsTemplate', i.e. labels specified in the 'labels' field will override any labels originating from the 'labelsTemplate' field. A special case of an empty match expression set matches everything (i.e. matches/returns all existing keys/values). This makes it simpler to write templates that run over all values. It also makes it possible to later implement support for templates that run over all _keys_ of a feature. Some example configurations: - name: "my-pci-template-features" labelsTemplate: | {{ range .pci.device }}intel-{{ .class }}-{{ .device }}=present {{ end }} matchFeatures: - feature: pci.device matchExpressions: class: {op: InRegexp, value: ["^06"]} vendor: ["8086"] - name: "my-system-template-features" labelsTemplate: | {{ range .system.osrelease }}system-{{ .Name }}={{ .Value }} {{ end }} matchFeatures: - feature: system.osrelease matchExpressions: ID: {op: Exists} VERSION_ID.major: {op: Exists} Imaginative template pipelines are possible, of course, but care must be taken in order to produce understandable and maintainable rule sets.	2021-11-23 21:03:22 +02:00
Markus Lehtonen	c3da439d21	source/memory: implement FeatureSource Separate feature discovery and creation of feature labels. Generalize the discovery of nvdimm devices so that they can be matched in custom label rules in a similar fashion as pci and usb devices. Available attributes for matching nvdimm devices are limited to: - devtype - mode For numa we now detect the number of numa nodes which can be matched against in custom label rules. Labels created by the memory feature source are unchanged. The new features being detected are available in custom rules only. Example custom rule: - name: "my memory rule" labels: my-memory-feature: "true" matchFeatures: - feature: memory.numa matchExpressions: "node_count": {op: Gt, value: ["3"]} - feature: memory.nv matchExpressions: "devtype": {op: In, value: ["nd_dax"]} Also, add minimalist unit test.	2021-11-23 15:08:15 +02:00
Markus Lehtonen	9a02b544a2	source/network: implement FeatureSource Separate feature discovery and creation of feature labels. Generalize the feature discovery so that network devices can be matched in custom label rules in a similar fashion as pci and usb devices. Available attributes for matching are: - operstate - speed - sriov_numvfs - sriov_totalvfs Labels created by the network feature source are unchanged. The new features being detected are available in custom rules only. Example custom rule: - name: "my network rule" labels: my-network-feature: "true" matchFeatures: - feature: network.device matchExpressions: "operstate": { op: In, value: ["up"] } "sriov_numvfs": { op: Gt, value: ["9"] } Also, add minimalist unit test.	2021-11-23 10:05:38 +02:00
Markus Lehtonen	0a96359f29	deployment: fix mistake in example worker config	2021-11-23 10:01:41 +02:00
Kubernetes Prow Robot	99d3251c42	Merge pull request #649 from marquiz/devel/storage-feature-source source/storage: implement FeatureSource	2021-11-22 11:31:32 -08:00
Kubernetes Prow Robot	882320f523	Merge pull request #608 from marquiz/devel/deployment-base deployment: clean up base/topologyupdater-daemonset	2021-11-18 09:13:02 -08:00
Markus Lehtonen	999628418b	source/storage: implement FeatureSource Separate feature discovery and creation of feature labels. Generalize the feature discovery so that block devices can be matched in custom label rules in a similar fashion as pci and usb devices. This extends the discovery to other block queue attributes than 'rotational': now we also detect 'dax', 'nr_zones' and 'zoned'. Labels created by the storage feature source are unchanged. The new features being detected are available in custom rules only. Example custom rules: - name: "my block rule 1" labels: my-block-feature-1: "true" matchFeatures: - feature: storage.block matchExpressions: "rotational": {op: In, value: ["0"]} - name: "my block rule 2" labels: my-block-feature-2: "true" matchFeatures: - feature: storage.block matchExpressions: "zoned": {op: In, value: ["host-aware", "host-managed"]} Also, add minimalist unit test.	2021-11-18 14:58:33 +02:00
Markus Lehtonen	6cbed379df	source/custom: implement matchAny directive Implement a new 'matchAny' directive in the new rule format, building on top of the previously implemented 'matchFeatures' matcher. MatchAny applies a logical OR over multiple matchFeatures directives. That is, it allows specifying multiple alternative matchers (at least one of which must match) in a single label rule. The configuration format for the new matchers is: matchAny: - matchFeatures: - feature: <domain>.<feature> matchExpressions: <attribute>: op: <operator> value: - <list-of-values> - matchFeatures: ... A configuration example. In order to require a cpu feature, kernel module and one of two specific PCI devices (making use of the shortform notation): - name: multi-device-test labels: multi-device-feature: "true" matchFeatures: - feature: kernel.loadedmodule matchExpressions: [driver-module] - feature: cpu.cpuid matchExpressions: [AVX512F] matchAny: - matchFeatures: - feature: pci.device matchExpressions: vendor: "8086" device: "1234" - matchFeatures: - feature: pci.device matchExpressions: vendor: "8086" device: "abcd"	2021-11-12 16:51:30 +02:00
Markus Lehtonen	e206f0b86b	source/custom: implement generic feature matching Implement generic feature matchers that cover all feature sources (that implement the FeatureSource interface). The implementation relies on the unified data model provided by the FeatureSource interface as well as the generic expression-based rule processing framework that was added to the source/custom/expression package. With this patch any new features added will be automatically available for custom rules, without any additional work. Rule hierarchy follows the source/feature hierarchy by design. This patch introduces a new format for custom rule specifications, dropping the 'value' field and introducing a new 'labels' field which makes it possible to specify multiple labels per rule. Also, in the new format the 'name' field is just for reference and no matching label is created. The new generic rules are available in this new rule format under a 'matchFeatures' field. MatchFeatures implements a logical AND over an array of per-feature matchers - i.e. a match for all of the matchers is required. The goal of the new rule format is to make it better follow K8s API design guidelines and make it extensible for future enhancements (e.g. addition of templating, taints, annotations, extended resources etc). The old rule format (with cpuID, kConfig, loadedKMod, nodename, pciID, usbID rules) is still supported. The rule format (new vs. old) is determined at config parsing time based on the existence of the 'matchOn' field. The new rule format and the configuration format for the new matchFeatures field is: - name: <rule-name> labels: <key>: <value> ... matchFeatures: - feature: <domain>.<feature> matchExpressions: <attribute>: op: <operator> value: - <list-of-values> - feature: <domain>.<feature> ... Currently, "cpu", "kernel", "pci", "system", "usb" and "local" sources are covered by the matchers/feature selectors. Thus, the following features are available for matching with this patch: - cpu.cpuid: <cpuid-flag>: <exists/does-not-exist> - cpu.cstate: enabled: <bool> - cpu.pstate: status: <string> turbo: <bool> scaling_governor: <string> - cpu.rdt: <rdt-feature>: <exists/does-not-exist> - cpu.sst: bf.enabled: <bool> - cpu.topology: hardware_multithreading: <bool> - kernel.config: <flag-name>: <string> - kernel.loadedmodule: <module-name>: <exists/does-not-exist> - kernel.selinux: enabled: <bool> - kernel.version: major: <int> minor: <int> revision: <int> full: <string> - system.osrelease: <key-name>: <string> VERSION_ID.major: <int> VERSION_ID.minor: <int> - system.name: nodename: <string> - pci.device: <device-instance>: class: <string> vendor: <string> device: <string> subsystem_vendor: <string> subsystem_device: <string> sriov_totalvfs: <int> - usb.device: <device-instance>: class: <string> vendor: <string> device: <string> serial: <string> - local.label: <label-name>: <string> The configuration also supports some "shortforms" for convenience: matchExpressions: [<attr-1>, <attr-2>=<val-2>] --- matchExpressions: <attr-3>: <attr-4>: <val-4> is equal to: matchExpressions: <attr-1>: {op: Exists} <attr-2>: {op: In, value: [<val-2>]} --- matchExpressions: <attr-3>: {op: Exists} <attr-4>: {op: In, value: [<val-4>]} In other words: - feature: kernel.config matchExpressions: ["X86", "INIT_ENV_ARG_LIMIT=32"] - feature: pci.device matchExpressions: vendor: "8086" is the same as: - feature: kernel.config matchExpressions: X86: {op: Exists} INIT_ENV_ARG_LIMIT: {op: In, value: ["32"]} - feature: pci.device matchExpressions: vendor: {op: In, value: ["8086"]} Some configuration examples below. In order to match a CPUID feature the following snippet can be used: - name: cpu-test-1 labels: cpu-custom-feature: "true" matchFeatures: - feature: cpu.cpuid matchExpressions: AESNI: {op: Exists} AVX: {op: Exists} In order to match against a loaded kernel module and OS version: - name: kernel-test-1 labels: kernel-custom-feature: "true" matchFeatures: - feature: kernel.loadedmodule matchExpressions: e1000: {op: Exists} - feature: system.osrelease matchExpressions: NAME: {op: InRegexp, value: ["^openSUSE"]} VERSION_ID.major: {op: Gt, value: ["14"]} In order to require a kernel module and both of two specific PCI devices: - name: multi-device-test labels: multi-device-feature: "true" matchFeatures: - feature: kernel.loadedmodule matchExpressions: driver-module: {op: Exists} - feature: pci.device matchExpressions: vendor: "8086" device: "1234" - feature: pci.device matchExpressions: vendor: "8086" device: "abcd"	2021-11-12 16:51:13 +02:00
Markus Lehtonen	e342076a5e	deployment: clean up base/topologyupdater-daemonset The base should really have the very bare minimum. Remove all redundant (at default-value) args and move the others to the specific topologyupdater kustomize component. This also makes these settings re-usable in user-specific overlays (that are not based on topologyupdater-daemonset).	2021-10-06 21:42:31 +03:00
Swati Sehgal	a2c066dc0d	topologyupdater: manifests: topologyupdater deployment files - create an overlay for deployment of all components - create an overlay for just topologyupdater deployment (to be deployed in conjunction with the default overlay) - create a separate overlay for deployment of master and topologyupdater-job Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2021-09-21 10:48:10 +01:00
Markus Lehtonen	3706de9308	deployment: fix formatting of the worker conf sample	2021-09-17 14:25:48 +03:00
Jorik Jonker	501ff37592	deployment: optional mount of /usr/src This commit makes the mount of /usr/src optional in the Helm chart, and removes it from the kustomization. The reason is that some systems (such as Talos) do not have a /usr/src and have a read-only filesystem. Since /usr/src is optional per FHS 3.0, NFD should not assume its presence. Signed-off-by: Jorik Jonker <jorik@kippendief.biz>	2021-08-26 10:52:26 +02:00
Markus Lehtonen	1f8a6d7819	kustomize: add standard-combined overlay Replicates nfd-daemonset-combined.yaml.template. In addition to the overlay we need to add a separate set of patches under components/common in order to handle the double-container pod.	2021-08-18 15:10:25 +03:00
Markus Lehtonen	8117c099a3	deployment: add kustomize base Implement functionality virtually replicating deployment templates for nfd-master and nfd-worker daemonset (nfd-master.yaml.template and nfd-worker-daemonset.yaml.template) by adding a kustomize overlay named "default". We split the resources into multiple bases (rbac, master and worker-daemonset) so that relevant parts are re-usable in other deployment scenarios added later (e.g. "one-shot job", and "combined daemonset"). This patch adds one component (components/common) doing the required kustomization for the example deployment.	2021-08-18 14:05:57 +03:00

39 commits
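
The "shortform" notation for matchExpressions described in commit e206f0b86b can be illustrated with a small sketch. This is not NFD's implementation (NFD is written in Go and does this while parsing its configuration); the function name expand_match_expressions and the dictionary layout are hypothetical, chosen only to mirror the equivalences spelled out in that commit message: a bare attribute means {op: Exists}, and <attr>=<val> or <attr>: <val> means {op: In, value: [<val>]}.

```python
from typing import Dict, List, Mapping, Union

# Hypothetical aliases for this sketch; they are not NFD's actual types.
Expression = Dict[str, object]
Shortform = Union[List[str], Mapping[str, object]]


def expand_match_expressions(short: Shortform) -> Dict[str, Expression]:
    """Expand a shortform matchExpressions value into the explicit form."""
    expanded: Dict[str, Expression] = {}
    if isinstance(short, list):
        # List shortform: "<attr>" -> Exists, "<attr>=<val>" -> In [<val>].
        for item in short:
            name, sep, value = item.partition("=")
            if sep:
                expanded[name] = {"op": "In", "value": [value]}
            else:
                expanded[name] = {"op": "Exists"}
    else:
        # Map shortform: empty value -> Exists, scalar value -> In [<val>].
        for name, value in short.items():
            if value is None:
                expanded[name] = {"op": "Exists"}
            elif isinstance(value, dict):
                # Already explicit ({op: ..., value: ...}); copy unchanged.
                expanded[name] = dict(value)
            else:
                expanded[name] = {"op": "In", "value": [str(value)]}
    return expanded


if __name__ == "__main__":
    # The two examples from the commit message, written as Python values.
    print(expand_match_expressions(["X86", "INIT_ENV_ARG_LIMIT=32"]))
    print(expand_match_expressions({"vendor": "8086"}))
```

Values are kept as strings in the expanded form, matching the quoted values used throughout the rule examples in the commit messages above.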