node-feature-discovery

mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00

Author	SHA1	Message	Date
Markus Lehtonen	4c955ad72c	nfd-master: update node if no NodeFeature objects are present Correctly handle the case where no NodeFeature objects exist for certain node (and NodeFeature API has been enabled with -enable-nodefeature-api). In this case all the labels should be removed.	2022-12-19 10:22:04 +02:00
Markus Lehtonen	b9c09e6674	nfd-master: update all nodes at startup when NodeFeature API enabled We want to always update all nodes at startup. Without this patch we don't get any update event from the controller if no NodeFeature or NodeFeatureRule objects exist in the cluster. Thus all nodes would stay untouched whereas we really want to remove all labels from all nodes in this case.	2022-12-14 21:49:50 +02:00
Kubernetes Prow Robot	d1b314842c	Merge pull request #989 from marquiz/devel/nodefeature-multi-object nfd-master: handle multiple NodeFeature objects	2022-12-14 07:51:34 -08:00
Markus Lehtonen	740e3af681	nfd-master: implement ratelimiter for nfd api updates Implement a naive ratelimiter for node update events originating from the nfd API. We might get a ton of events in short interval. The simplest example is startup when we get a separate Add event for every NodeFeature and NodeFeatureRule object. Without rate limiting we run "update all nodes" separately for each NodeFeatureRule object, plus, we would run "update node X" separately for each NodeFeature object targeting node X. This is a huge amount of wasted work because in principle just running "update all nodes" once should be enough.	2022-12-14 15:45:43 +02:00
Markus Lehtonen	79ed747be8	nfd-master: handle multiple NodeFeature objects Implement handling of multiple NodeFeature objects by merging all objects (targeting a certain node) into one before processing the data. This patch implements MergeInto() methods for all required data types. With support for multiple NodeFeature objects per node, The "nfd api workflow" can be easily demonstrated and tested from the command line. Creating the folloiwing object (assuming node-n exists in the cluster): apiVersion: nfd.k8s-sigs.io/v1alpha1 kind: NodeFeature metadata: labels: nfd.node.kubernetes.io/node-name: node-n name: my-features-for-node-n spec: # Features for NodeFeatureRule matching features: flags: vendor.domain-a: elements: feature-x: {} attributes: vendor.domain-b: elements: feature-y: "foo" feature-z: "123" instances: vendor.domain-c: elements: - attributes: name: "elem-1" vendor: "acme" - attributes: name: "elem-2" vendor: "acme" # Labels to be created labels: vendor-feature.enabled: "true" vendor-setting.value: "100" will create two feature labes: feature.node.kubernetes.io/vendor-feature.enabled: "true" feature.node.kubernetes.io/vendor-setting.value: "100" In addition it will advertise hidden/raw features that can be used for custom rules in NodeFeatureRule objects. Now, creating a NodeFeatureRule object: apiVersion: nfd.k8s-sigs.io/v1alpha1 kind: NodeFeatureRule metadata: name: my-rule spec: rules: - name: "my feature rule" labels: "my-feature": "true" matchFeatures: - feature: vendor.domain-a matchExpressions: feature-x: {op: Exists} - feature: vendor.domain-c matchExpressions: vendor: {op: In, value: ["acme"]} will match the features in the NodeFeature object above and cause one more label to be created: feature.node.kubernetes.io/my-feature: "true"	2022-12-14 15:44:52 +02:00
Markus Lehtonen	9f0806593d	nfd-master: rename -featurerules-controller flag to -crd-controller Deprecate the '-featurerules-controller' command line flag as the name does not describe the functionality anymore: in practice it controls the CRD controller handling both NodeFeature and NodeFeatureRule objects. The patch introduces a duplicate, more generally named, flag '-crd-controller'. A warning is printed in the log if '-featurerules-controller' flag is encountered.	2022-12-14 10:23:45 +02:00
Markus Lehtonen	6ddd87e465	nfd-master: support NodeFeature objects Add initial support for handling NodeFeature objects. With this patch nfd-master watches NodeFeature objects in all namespaces and reacts to changes in any of these. The node which a certain NodeFeature object affects is determined by the "nfd.node.kubernetes.io/node-name" annotation of the object. When a NodeFeature object targeting certain node is changed, nfd-master needs to process all other objects targeting the same node, too, because there may be dependencies between them. Add a new command line flag for selecting between gRPC and NodeFeature CRD API as the source of feature requests. Enabling NodeFeature API disables the gRPC interface. -enable-nodefeature-api enable NodeFeature CRD API for incoming feature requests, will disable the gRPC interface (defaults to false) It is not possible to serve gRPC and watch NodeFeature objects at the same time. This is deliberate to avoid labeling races e.g. by nfd-worker sending gRPC requests but NodeFeature objects in the cluster "overriding" those changes (labels from the gRPC requests will get overridden when NodeFeature objects are processed).	2022-12-14 07:31:28 +02:00
Markus Lehtonen	237494463b	nfd-worker: support creating NodeFeatures object Support the new NodeFeatures object of the NFD CRD api. Add two new command line options to nfd-worker: -kubeconfig specifies the kubeconfig to use for connecting k8s api (defaults to empty which implies in-cluster config) -enable-nodefeature-api enable the NodeFeature CRD API for communicating node features to nfd-master, will also automatically disable gRPC (defgault to false) No config file option for selecting the API is available as there should be no need for dynamically selecting between gRPC and CRD. The nfd-master configuration must be changed in tandem and it is safer (and avoid awkward configuration races) to configure the whole NFD deployment at once. Default behavior of nfd-worker is not changed i.e. NodeFeatures object creation is not enabled by default (but must be enabled with the command line flag). The patch also updates the kustomize and Helm deployment, adding RBAC rules for nfd-worker and updating the example worker configuration.	2022-12-14 07:31:28 +02:00
Markus Lehtonen	d1c91e129a	apis/nfd: update auto-generated code	2022-12-14 07:31:28 +02:00
Markus Lehtonen	59ebff46c9	apis/nfd: add CRD for communicating node features Add a new NodeFeature CRD to the nfd Kubernetes API to communicate node features over K8s api objects instead of gRPC. The new resource is namespaced which will help the management of multiple NodeFeature objects per node. This aims at enabling 3rd party detectors for custom features. In addition to communicating raw features the NodeFeature object also has a field for directly requesting labels that should be applied on the node object. Rename the crd deployment file to nfd-api-crds.yaml so that it matches the new content of the file. Also, rename the Helm subdir for CRDs to match the expected chart directory structure.	2022-12-14 07:31:28 +02:00
Markus Lehtonen	079655b42c	nfd-master: add error checking for CRD controller creation	2022-12-14 00:27:27 +02:00
Feruzjon Muyassarov	b296bdf0b3	update test functions according to upstream deprecated/removed methods Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2022-12-13 12:12:50 +02:00
Kubernetes Prow Robot	733fb5deaa	Merge pull request #984 from marquiz/devel/worker-namespace nfd-worker: detect the namespace it is running in	2022-12-09 07:10:11 -08:00
Markus Lehtonen	f13ed2d91c	nfd-topology-updater: update NodeResourceTopology objects directly Drop the gRPC communication to nfd-master and connect to the Kubernetes API server directly when updating NodeResourceTopology objects. Topology-updater already has connection to the API server for listing Pods so this is not that dramatic change. It also simplifies the code a lot as there is no need for the NFD gRPC client and no need for managing TLS certs/keys. This change aligns nfd-topology-updater with the future direction of nfd-worker where the gRPC API is being dropped and replaced by a CRD-based API. This patch also update deployment files and documentation to reflect this change.	2022-12-08 11:03:22 +02:00
Markus Lehtonen	87b92f88ca	nfd-worker: detect the namespace it is running in Implement detection of kubernetes namespace by reading file /var/run/secrets/kubernetes.io/serviceaccount/namespace Aa a fallback (if the file is not accessible) we take namespace from KUBERNETES_NAMESPACE environment variable. This is useful for e.g. testing and development where you might run nfd-worker directly from the command line on a host system.	2022-12-08 10:34:52 +02:00
Feruzjon Muyassarov	2bdf427b89	nfd-master logic update for setting node taints This commits extends NFD master code to support adding node taints from NodeFeatureRule CR. We also introduce a new annotation for taints which helps to identify if the taint set on node is owned by NFD or not. When user deletes the taint entry from NodeFeatureRule CR, NFD will remove the taint from the node. But to avoid accidental deletion of taints not owned by the NFD, it needs to know the owner. Keeping track of NFD set taints in the annotation can be used during the filtering of the owner. Also enable-taints flag is added to allow users opt in/out for node tainting feature. The flag takes precedence over taints defined in NodeFeatureRule CR. In other words, if enbale-taints is set to false(disabled) and user still defines taints on the CR, NFD will ignore those taints and skip them from setting on the node. Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2022-12-02 17:25:00 +02:00
Feruzjon Muyassarov	532e1193ce	Add taints field to NodeFeatureRule CR spec Extend NodeFeatureRule Spec with taints field to allow users to specify the list of the taints they want to be set on the node if rule matches. Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2022-12-02 17:25:00 +02:00
Markus Lehtonen	eb8e29c80a	nfd-worker: drop deprecated command line flags Drop the following flags that were deprecated already in v0.8.0: -sleep-interval (replaced by core.sleepInterval config file option) -label-whitelist (replaced by core.labelWhiteList config file option) -sources (replaced by -label-sources flag)	2022-11-23 22:33:51 +02:00
Talor Itzhak	5b0788ced4	topology-updater: introduce exclude-list The exclude-list allows to filter specific resource accounting from NRT's objects per node basis. The CRs created by the topology-updater are used by the scheduler-plugin as a source of truth for making scheduling decisions. As such, this feature allows to hide specific information from the scheduler, which in turn will affect the scheduling decision. A common use case is when user would like to perform scheduling decisions which are based on a specific resource. In that case, we can exclude all the other resources which we don't want the scheduler to exemine. The exclude-list is provided to the topology-updater via a ConfigMap. Resource type's names specified in the list should match the names as shown here: https://pkg.go.dev/k8s.io/api/core/v1#ResourceName This is a resurrection of an old work started here: https://github.com/kubernetes-sigs/node-feature-discovery/pull/545 Signed-off-by: Talor Itzhak <titzhak@redhat.com>	2022-11-21 14:08:25 +02:00
Garrybest	3ec1b94020	get kubelet config from configz Signed-off-by: Garrybest <garrybest@foxmail.com>	2022-11-08 23:52:35 +08:00
Feruzjon Muyassarov	7ea0e0b0a7	Add argument to updateNodeFeatures method to pass client from caller This commit adds an argument to updateNodeFeatures method for receiving client argument, which currently gets initialized within the method itself. This is a minor improvement for https://github.com/kubernetes-sigs/node-feature-discovery/pull/910. Ref:https://github.com/kubernetes-sigs/node-feature-discovery/pull/910#discussion_r1012703631 Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2022-11-06 22:37:11 +02:00
Markus Lehtonen	7c24b50f74	apis/nfd: fix NodeFeatureRule templating Fix handling of templates that got broken in `b907d07d7e` when "flattening" the internal data structure of features. That happened because the golang text/template format uses dots to reference fields of a struct / elements of a map (i.e. 'foo.bar' means that 'bar' must be a sub-element of foo). Thus, using dots in our feature names (e.g. 'cpu.cpuid') means that that hierarchy must be reflected in the data structure that is fed to the templating engine. Thus, for templates we're now stuck stuck with two level hierarchy. It doesn't really matter for now as all our features follow that naming patter. We might be able to overcome this limitation e.g. by using reflect but that's left as a future exercise.	2022-10-25 23:37:27 +03:00
Kubernetes Prow Robot	a65ee959b9	Merge pull request #925 from marquiz/devel/feature-api-flatten apis/nfd: flatten the structure of features data type	2022-10-24 01:14:26 -07:00
Francesco Romani	700d9e215c	topology-updater: continue looping on scan error Scanning podresources can temporarily fail; the previous code was mistakenly not rearming the loop condition when this occurred, effectively stopping the monitoring. Rather, we should always pool and bail out on unrecoverable error or when asked to stop. Signed-off-by: Francesco Romani <fromani@redhat.com>	2022-10-20 10:08:13 +02:00
Markus Lehtonen	9ea787bc99	apis/nfd: update auto-generated code Re-generate after the latest API change. Involves renaming the crd spec files.	2022-10-18 18:41:53 +03:00
Markus Lehtonen	b907d07d7e	apis/nfd: flatten the structure of features data type Flatten the data structure that stores features, dropping the "domain" level from the data model. That extra level of hierarchy brought little benefit but just caused some extra complexity, instead. The new structure nicely matches what we have in the NodeFeatureRule object (the matchFeatures field of uses the same flat structure with the "feature" field having a value <domain>.<feature>, e.g. "kernel.version"). This is pre-work for introducing a new "node feature" CRD that contains the raw feature data. It makes the life of both users and developers easier when both CRDs, plus our internal code, handle feature data in a similar flat structure.	2022-10-18 18:37:28 +03:00
Markus Lehtonen	c3caf687c8	apis/nfd: update autogenerated code Update and migrate auto-generated code after removing pkg/api/feature.	2022-10-15 07:42:20 +03:00
Markus Lehtonen	0e1d4a9046	apis/nfd: migrate pkg/api/feature Move the previously-protobuf-only internal "feature api" over to the public "nfd api" package. This is in preparation for introducing a new CRD API for communicating features. This patch carries no functional changes. Just moving code around.	2022-10-15 07:42:20 +03:00
Feruzjon Muyassarov	71434a1392	Standardize "k8s.io/api/core/v1" package short name Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2022-10-15 02:22:41 +03:00
Feruzjon Muyassarov	e79f09deb2	Error strings should not be capitalized Error strings should not be capitalized (ST1005) & remove the redundancy from array, slice or map composite literals. Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2022-10-14 15:43:18 +03:00
Kubernetes Prow Robot	b06f2f7c8b	Merge pull request #911 from marquiz/devel/master-grpc-refactor nfd-master: refactor gRPC into a separate method	2022-10-12 03:23:00 -07:00
Markus Lehtonen	06bd6c0609	nfd-worker: refactor gRPC connection logic Make the NoPublish config flag a more direct control point for whether to publishing features. This patch is pre-work for adding support for other clients (upcoming new CRD API) in nfd-worker.	2022-10-11 17:02:33 +03:00
Markus Lehtonen	edcaf9a3bb	nfd-master: refactor gRPC into a separate method Refactor the code so that the initialization and running of the gRPC server is done in a separate function. The goal is to make the code more maintainable in terms of disabling (and eventually removing) the gRPC functionality in the future.	2022-10-07 14:45:44 +03:00
Markus Lehtonen	a00cdc2b61	pkg/utils: move hostpath helpers from source to utils Refactor the code, moving the hostpath helper functionality to new "pkg/utils/hostpath" package. This breaks odd-ish dependency "pkg/utils" -> "source".	2022-10-06 14:28:24 +03:00
Kubernetes Prow Robot	4097198848	Merge pull request #908 from marquiz/devel/type-rename pkg/api/feature: rename types	2022-10-06 01:59:51 -07:00
Markus Lehtonen	7f806e8c45	pkg/api/feature: update auto-generated code Complete the previous renaming.	2022-10-06 11:25:01 +03:00
Markus Lehtonen	abdbd420d1	pkg/api/feature: rename types Sync type names with NFD documentation. Aims at making the codebase easier to follow.	2022-10-06 11:25:01 +03:00
Markus Lehtonen	c1e6b41e56	apis/nfd: move annotation and label consts from nfd-master Move consts related to NFD annotations and labels from nfd-master to the api. Makes them more logically accessible for clients.	2022-10-06 11:23:56 +03:00
Kubernetes Prow Robot	906aad6717	Merge pull request #906 from marquiz/devel/master-controller-rename nfd-master: rename crd controller	2022-10-06 01:19:52 -07:00
Markus Lehtonen	658ffaa6a5	nfd-master: rename crd controller Prepare for adding support for other nfd api objects. Just rename file and some symbols, no functional changes.	2022-10-04 20:23:24 +03:00
Markus Lehtonen	11fd19fb7a	nfd-worker: rename some symbols Some renames in preparation for adding support for NFD CRD API client. I.e. a second client in addition to the existing gRPC client.	2022-10-04 17:18:25 +03:00
Kubernetes Prow Robot	dcc02b9787	Merge pull request #901 from fmuyassarov/add-shortname Set shortName for NodeFeatureRule CRD	2022-09-29 03:50:54 -07:00
Feruzjon Muyassarov	60f270d40d	Set shortName for NodeFeatureRule CRD This patch adds a kubebuilder marker to add a short name nfr for NodeFeatureRule CRD. Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>	2022-09-28 12:18:49 +03:00
Markus Lehtonen	8b652ab8ec	nfd-master: log if node was modified (or not) Be a bit more verbose what is happning.	2022-09-21 14:23:37 +03:00
Markus Lehtonen	389a3d4e2e	nfd-master: drop cleanup of ancient incubator labels Remove the cleanup code that removes ancient NFD labels with the node.alpha.kubernetes-incubator.io/ prefix. This label namespace was deprecated/dropped already in v0.4.0 so it should be safe to drop this code.	2022-09-20 19:56:58 +03:00
Markus Lehtonen	ffa35427cd	nfd-client: don't use deprecated grpc.WithInsecure() Replace deprecated grpc.WithInsecure() with grpc.WithTransportCredentials and insecure.NewCredentials(). Makes golangci-lint pass muster. enter the commit message for your changes. Lines starting	2022-09-09 11:07:22 +03:00
Markus Lehtonen	12e859d50c	Drop deprecated io/ioutil package Makes golanci-lint happy.	2022-09-08 14:26:02 +03:00
Markus Lehtonen	98228d2069	Update auto-generated artefacts Latest gofmt changes and update to go v1.19 induce some changes in the generated files.	2022-09-08 12:45:20 +03:00
Markus Lehtonen	2bbfe3edc8	Run gofmt Golang v1.19 was not happy with our code comments.	2022-09-08 12:43:15 +03:00
Kubernetes Prow Robot	4e6a718dfe	Merge pull request #865 from stek29/fix-864 Fix templates for NodeFeatureRule with MatchAny	2022-08-23 09:55:24 -07:00
Viktor Oreshkin	6fd12a2da7	apis/nfd: fix templates with MatchAny only Signed-off-by: Viktor Oreshkin <imselfish@stek29.rocks>	2022-08-23 18:00:44 +03:00
Markus Lehtonen	2c92e1dcff	logging: do not use %w with klog.Errorf It is not recognized (and does not work like with fmt.Errorf) so use %v instead.	2022-08-22 14:39:52 +03:00
Viktor Oreshkin	4375e08e39	apis/nfd: add more tests for templates test that NodeFeatureRule templates work with empty MatchFeatures, but with MatchAny. this test would fail, higligting an issue which is fixed in next commit. see #864. Signed-off-by: Viktor Oreshkin <imselfish@stek29.rocks>	2022-08-22 02:27:55 +03:00
Markus Lehtonen	889e4c1351	nfd-master: more fixes to log messages Use correct name for the CR (NodeFeatureRule) object. Also, the resource is cluster-scoped so don't print the namespace.	2022-08-17 10:07:26 +03:00
Markus Lehtonen	f5ee836bbf	nfd-master: fix incorrect log messages in crd controller	2022-08-16 16:39:27 +03:00
Markus Lehtonen	38e763e36c	Refresh auto-generated files	2022-08-10 14:24:33 +03:00
Markus Lehtonen	345e9bf72c	apis/nfd: revert the type hack Revert the hack that was a workaround for issues with k8s deepcopy-gen. New deepcopy-gen is able to generate code correctly without issues so this is not needed anymore. Also, removing this hack solves issues with object validation when creating NodeFeatureRules programmatically with nfd go-client. This is needed later with NodeFeatureRules e2e-tests. Logically reverts `f3cc109f99`.	2022-08-10 14:24:33 +03:00
Markus Lehtonen	ac3030ce48	Re-generate files Refresh auto-generated files using the new conainerized approach.	2022-08-10 09:47:23 +03:00
Markus Lehtonen	b7658c25fd	generate: update mockery to latest version In order to be able to run it on Go v1.18.	2022-08-10 09:47:23 +03:00
Markus Lehtonen	136c036d4d	Drop the iommu source It was deprecated in v0.10.0.	2022-06-14 15:00:29 +03:00
Markus Lehtonen	36341bf4c7	apis/nfd: empty match expression set returns no features for templates This patch changes a rare corner case of custom label rules with an empty set of matchexpressions. The patch removes a special case where an empty match expression set matched everything and returned all feature elements for templates to consume. With this patch the match expression set logically evaluates all expressions in the set and returns all matches - if there are no expressions there are no matches and no matched features are returned. However, the overall match result (determining if "non-template" labels will be created) in this special case will be "true" as before as none of the zero match expressions failed. The former behavior was somewhat illogical and counterintuitive: having 1 to N expressions matched and returned 1 to N features (at most), but, having 0 expressions always matched everything and returned all features. This was some leftover proof-of-concept functionality (for some possible future extensions) that should have been removed before merging.	2022-03-24 11:43:42 +02:00
Tuomas Katila	2ceafe83b7	topologyupdater: Prevent crash with incorrect node id It's possible for device plugins to advertise non-existent numa node ids that cause topology updater to crash. Signed-off-by: Tuomas Katila <tuomas.katila@intel.com>	2022-03-15 11:16:02 +02:00
Dipto Chakrabarty	19a57789ad	Additional Lint Fixes in Codebase (#779 ) * fix comments and conditonals to fix lint issues * more linter fixes and spelling fixes * fix linter issues based on feedback	2022-03-02 17:12:46 -08:00
Markus Lehtonen	f9b4ba87a8	tls: require min TLS version 1.3 Deny deprecated TLS versions (1.0 and 1.1). We don't really excpect other clients than NFD itself so we can just request the latest version.	2022-02-25 10:08:37 +02:00
Kubernetes Prow Robot	0c330b1a35	Merge pull request #736 from marquiz/devel/grpc-stop nfd-master: do graceful stop of gRPC server	2022-01-21 03:05:59 -08:00
Markus Lehtonen	e53d053475	nfd-master: do graceful stop of gRPC server	2022-01-21 12:03:07 +02:00
Markus Lehtonen	e95a4dd460	nfd-master: print gRPC server error correctly	2022-01-21 11:56:28 +02:00
Mohammed Naser	cf1bc4a34d	Increase timeout in test setups This patch increases the timeout when setting up the NFD master to 5 seconds instead of 1 second to allow for running tests in slow environments.	2022-01-20 18:59:30 -05:00
Dipto Chakrabarty	755294184c	Fix GoLinter Issues in the files (#711 ) * fix linter issues for few files * fix linter issue of exported const Name should have comment or be unexported * fix name lint issue and resolve lints * add changes to comments	2022-01-18 23:12:06 -08:00
Markus Lehtonen	838a375f85	source/iommu: deprecate and disable by default Deprecate the iommu source and disable it by default.	2021-12-20 10:21:29 +02:00
Markus Lehtonen	a6eddbab4f	source: rename TestSource to SupplementalSource Just widen the scope in terms of naming, to cover deprecated and/or experimental sources too, for example.	2021-12-20 10:05:00 +02:00
Markus Lehtonen	bf01875368	nfd-worker: drop 'custom-' prefix from matchFeatures custom rules Do not prefix label names from the new matchFeatures/matchAny custom rules with "custom-". We want to have the same result (set of labels) from a rule independent of whether it has been specified in worker config or in a NodeFeatureRule CRs. Legacy matchOn rules (not available in NodeFeatureRule CRs) are intact, i.e. still prefixed, in order to retain backwards compatibility.	2021-12-09 21:52:40 +02:00
Markus Lehtonen	82e14300a4	source/fake: implement FeatureSource Makes it possible to create fake features for custom rules, enabling testing.	2021-12-07 10:34:41 +02:00
Markus Lehtonen	58e1461d90	nfd-worker: add -feature-sources command line flag Allows controlling (enable/disable) the "raw" feature detection. Especially useful for development and testing.	2021-12-03 09:42:35 +02:00
Markus Lehtonen	df6909ed5e	nfd-worker: add core.featureSources config option Add a configuration option for controlling the enabled "raw" feature sources. This is useful e.g. in testing and development, plus it also allows fully shutting down discovery of features that are not needed in a deployment. Supplements core.labelSources which controls the enablement of label sources.	2021-12-03 09:42:35 +02:00
Markus Lehtonen	2c3a4d1588	nfd-worker: rename nfdWorker.enabledSources to labelSources Refactoring in head of adding new config option for feature sources.	2021-12-02 21:08:46 +02:00
Markus Lehtonen	8cd58af613	nfd-worker: disable sources more easily Make it easier to disable single sources by prefixing the source name with a dash ('-') in the core.sources config option (or -sources cmdline flag).	2021-12-02 10:36:51 +02:00
Markus Lehtonen	773280de65	nfd-worker: provide deprecated core.sources config option Provide backwards compatibility via a deprecated 'core.sources' config file option. This will override 'core.labelSources'. A warning is printed in the log if this option is detected.	2021-12-01 17:11:49 +02:00
Markus Lehtonen	ad9c7dfa1e	nfd-worker: rename config option 'sources' to 'labelSources' The goal is to make the name more descriptive. Also keeping in mind a possible future addition a 'featureSources' option (or similar) for controlling the feature discovery.	2021-12-01 17:11:49 +02:00
Kubernetes Prow Robot	86bfe74cd7	Merge pull request #671 from marquiz/fixes/single-dash-flags Use single-dash format of cmdline flags	2021-12-01 06:45:15 -08:00
Markus Lehtonen	1765a37c6a	pkg/apis/nfd: drop unnecessary else statements	2021-12-01 10:55:50 +02:00
Markus Lehtonen	3f225be081	pkg/apis/nfd: use consistent receiver name for methods of templateHelper	2021-12-01 10:51:47 +02:00
Markus Lehtonen	d07400206f	pkg/apis/nfd/v1alpha1: document exported symbols Add missing comments and fix some existing ones.	2021-12-01 10:46:56 +02:00
Markus Lehtonen	c4f7ab0abe	pkg/api/feature: document exported functions	2021-12-01 10:30:17 +02:00
Markus Lehtonen	a57a25f63c	Use single-dash format of cmdline flags Use the single-dash (i.e. '-option' instead of '--option') format consistently accross log messages and documentation. This is the format that was mostly used, already, and shown by command line help of the binaries, for example.	2021-11-25 18:03:54 +02:00
Markus Lehtonen	b648d005e1	pkg/apis/nfd: support templating of "vars" Support templating of var names in a similar manner as labels. Add support for a new 'varsTemplate' field to the feature rule spec which is treated similarly to the 'labelsTemplate' field. The value of the field is processed through the golang "text/template" template engine and the expanded value must contain variables in <key>=<value> format, separated by newlines i.e.: - name: <rule-name> varsTemplate: \| <label-1>=<value-1> <label-2>=<value-2> ... Similar rules as for 'labelsTemplate' apply, i.e. 1. In case of matchAny is specified, the template is executed separately against each individual matchFeatures matcher. 2. 'vars' field has priority over 'varsTemplate'	2021-11-25 12:50:47 +02:00
Markus Lehtonen	f75303ce43	pkg/apis/nfd: add variables to rule spec and support backreferences Support backreferencing of output values from previous rules. Enables complex rule setups where custom features are further combined together to form even more sophisticated higher level labels. The labels created by preceding rules are available as a special 'rule.matched' feature (for matchFeatures to use). If referencing rules accross multiple configs/CRDs care must be taken with the ordering. Processing order of rules in nfd-worker: 1. Static rules 2. Files from /etc/kubernetes/node-feature-discovery/custom.d/ in alphabetical order. Subdirectories are processed by reading their files in alphabetical order. 3. Custom rules from main nfd-worker.conf In nfd-master, NodeFeatureRule objects are processed in alphabetical order (based on their metadata.name). This patch also adds new 'vars' fields to the rule spec. Like 'labels', it is a map of key-value pairs but no labels are generated from these. The values specified in 'vars' are only added for backreferencing into the 'rules.matched' feature. This may by desired in schemes where the output of certain rules is only used as intermediate variables for other rules and no labels out of these are wanted. An example setup: - name: "kernel feature" labels: kernel-feature: matchFeatures: - feature: kernel.version matchExpressions: major: {op: Gt, value: ["4"]} - name: "intermediate var feature" vars: nolabel-feature: "true" matchFeatures: - feature: cpu.cpuid matchExpressions: AVX512F: {op: Exists} - feature: pci.device matchExpressions: vendor: {op: In, value: ["8086"]} device: {op: In, value: ["1234", "1235"]} - name: top-level-feature matchFeatures: - feature: rule.matched matchExpressions: kernel-feature: "true" nolabel-feature: "true"	2021-11-25 12:50:47 +02:00
Markus Lehtonen	8a4d3161cf	pkg/apis/nfd: stricter format checking for template labels Require that the expanded LabelsTemplate has values. That is, the (expanded) template must consist of key=value pairs separated by newlines. No default value will be assigned and we now return an error if a (non-empty) line not conforming with the key=value format is encountered. Commit `c8d73666d` described that the value defaults to "true" if not specified. That was not the case and we defaulted to an empty string, instead. An example: - name: "my rule" labelsTemplate: \| my.label.1=foo my.label.2= Would create these labels: "my.label.1": "foo" "my.label.2": "" Further, the following: - name: "my failing rule" labelsTemplate: \| my.label.3 will cause an error in the rule processing.	2021-11-24 21:31:35 +02:00
Markus Lehtonen	c8d73666d6	pkg/apis/nfd: support label name templating Support templating of label names in feature rules. It is available both in NodeFeatureRule CRs and in custom rule configuration of nfd-worker. This patch adds a new 'labelsTemplate' field to the rule spec, making it possible to dynamically generate multiple labels per rule based on the matched features. The feature relies on the golang "text/template" package. When expanded, the template must contain labels in a raw <key>[=<value>] format (where 'value' defaults to "true"), separated by newlines i.e.: - name: <rule-name> labelsTemplate: \| <label-1>[=<value-1>] <label-2>[=<value-2>] ... All the matched features of 'matchFeatures' directives are available for templating engine in a nested data structure that can be described in yaml as: . <domain-1>: <key-feature-1>: - Name: <matched-key> - ... <value-feature-1: - Name: <matched-key> Value: <matched-value> - ... <instance-feature-1>: - <attribute-1-name>: <attribute-1-value> <attribute-2-name>: <attribute-2-value> ... - ... <domain-2>: ... That is, the per-feature data available for matching depends on the type of feature that was matched: - "key features": only 'Name' is available - "value features": 'Name' and 'Value' can be used - "instance features": all attributes of the matched instance are available NOTE: In case of matchAny is specified, the template is executed separately against each individual matchFeatures matcher and the eventual set of labels is a superset of all these expansions. Consider the following: - name: <name> labelsTemplate: <template> matchAny: - matchFeatures: <matcher#1> - matchFeatures: <matcher#2> matchFeatures: <matcher#3> In the example above (assuming the overall result is a match) the template would be executed on matcher#1 and/or matcher#2 (depending on whether both or only one of them match), and finally on matcher#3, and all the labels from these separate expansions would be created (i.e. the end result would be a union of all the individual expansions). NOTE 2: The 'labels' field has priority over 'labelsTemplate', i.e. labels specified in the 'labels' field will override any labels originating from the 'labelsTemplate' field. A special case of an empty match expression set matches everything (i.e. matches/returns all existing keys/values). This makes it simpler to write templates that run over all values. Also, makes it possible to later implement support for templates that run over all _keys_ of a feature. Some example configurations: - name: "my-pci-template-features" labelsTemplate: \| {{ range .pci.device }}intel-{{ .class }}-{{ .device }}=present {{ end }} matchFeatures: - feature: pci.device matchExpressions: class: {op: InRegexp, value: ["^06"]} vendor: ["8086"] - name: "my-system-template-features" labelsTemplate: \| {{ range .system.osrelease }}system-{{ .Name }}={{ .Value }} {{ end }} matchFeatures: - feature: system.osRelease matchExpressions: ID: {op: Exists} VERSION_ID.major: {op: Exists} Imaginative template pipelines are possible, of course, but care must be taken in order to produce understandable and maintainable rule sets.	2021-11-23 21:03:22 +02:00
Markus Lehtonen	085af7c2c7	pkg/apis/nfd: helpers for handling templates in Rule names Implement a private helper type (nameTemplateHelper) for handling (executing and caching) of templated names. DeepCopy methods are manually implemented as controller-gen is not able to help with that.	2021-11-23 15:08:53 +02:00
Markus Lehtonen	33fdf75190	nfd-master: process labeling rules from CRs Enable Custom Resource based label creation in nfd-master. This extends the previously implemented controller stub for watching NodeFeatureRule objects. NFD-master watches NodeFeatureRule objects in the cluster and processes the rules on every incoming labeling request from workers. The functionality relies on the "raw features" (identical to how nfd-worker handles custom rules) submitted by nfd-worker, making it independent of the label source configuration of the worker. This means that the labeling functions as expected even if all sources in the worker are disabled. NOTE: nfd-master is stateless and re-labeling only happens on the reception of SetLabelsRequest from the worker – i.e. on intervals specified by the core.sleepInterval configuration option (or -sleep-interval cmdline flag) of each nfd-worker instance. This means that modification/creation of NodeFeatureRule objects does not automatically update the node labels. Instead, the changes only come visible when workers send their labeling requests.	2021-11-23 09:18:07 +02:00
Markus Lehtonen	e8872462dc	nfd-master: add -featurerules-controller flag Add a new command line flag for disabling/enabling the controller for NodeFeatureRule objects. In practice, disabling the controller disables all labels generated from rules in NodeFeatureRule objects.	2021-11-22 16:57:42 +02:00
Markus Lehtonen	e6e32a88c3	nfd-master: implement controller for NodeFeatureRule CRs Implement a simple controller stub that operates on NodeFeatureRule objects. The controller does not yet have any functionality other than logging changes in the (NodeFeatureRule) objecs it is watching. Also update the documentation on the -no-publish flag to match the new functionality.	2021-11-22 16:57:42 +02:00
Markus Lehtonen	237c4f7824	pkg/apihelpers: split out loading of kubeconfig to a separate function Make kubeconfig loading and parsing re-usable for multiple clients.	2021-11-22 16:57:42 +02:00
NHM Tanveer Hossain Khan	856dfdd8b4	Remove fatal logging to error based on the feedback	2021-11-19 16:57:21 -05:00
Markus Lehtonen	6624ab312b	pkg/generated: add code for interacting with CRD API Add auto-generated code for interfacing our CRD API. On top of this, a CR controller can be implemented. This patch uses k8s/code-generator for code generation. Run "make generate" in order to (re-)generate everything. Path to the code-generator repository may need to be specified: K8S_CODE_GENERATOR=path/to/code-generator make apigen Code-generator version 0.20.7 was used to create this patch. Install k8s code-generator tools and clone the repo with: git clone https://github.com/kubernetes/code-generator -b v0.20.7 <path/to/code-generator> go install k8s.io/code-generator/cmd/...(at)v0.20.7	2021-11-17 18:51:34 +02:00
Markus Lehtonen	b96b86bc6c	pkg/apis/nfd: drop excess field from the CRD Drop stale leftover "LabelsTemplate" field from the rule spec.	2021-11-17 16:40:28 +02:00
Markus Lehtonen	8b9df3cf31	source/custom: move rule matching to pkg/apis/nfd Move the rule processing of matchFeatures and matchAny from source/custom package over to pkg/apis/nfd, aiming for better integrity and re-usability of the code. Does not change the CRD API as such, just adds more supportive functions.	2021-11-17 14:02:00 +02:00
Markus Lehtonen	3765ae24d6	pkg/apis/nfd: specify a dedicated type for regexp cache Having a dedicated type makes it possible to specify deepcopy functions for it. We need to do this manually as deepcopy-gen doesn't know how to create copies of regexps.	2021-11-17 13:40:43 +02:00
Markus Lehtonen	f3cc109f99	pkg/apis/nfd: work around issues with k8s deepcopy-gen Without this hack the generated code does not compile.	2021-11-17 13:40:43 +02:00
Markus Lehtonen	c3e2315834	pkg/apis/nfd: specify CRD for custom labeling rules Add a cluster-scoped Custom Resource Definition for specifying labeling rules. Nodes (node features, node objects) are cluster-level objects and thus the natural and encouraged setup is to only have one NFD deployment per cluster - the set of underlying features of the node stays the same independent of how many parallel NFD deployments you have. Our extension points (hooks, feature files and now CRs) can be be used by multiple actors (depending on us) simultaneously. Having the CRD cluster-scoped hopefully drives deployments in this direction. It also should make deployment of vendor-specific labeling rules easy as there is no need to worry about the namespace. This patch virtually replicates the source.custom.FeatureSpec in a CRD API (located in the pkg/apis/nfd/v1alpha1 package) with the notable exception that "MatchOn" legacy rules are not supported. Legacy rules are left out in order to keep the CRD simple and clean. The duplicate functionality in source/custom will be dropped by upcoming patches. This patch utilizes controller-gen (from sigs.k8s.io/controller-tools) for generating the CRD and deepcopy methods. Code can be (re-)generated with "make generate". Install controller-gen with: go install sigs.k8s.io/controller-tools/cmd/controller-gen@v0.7.0 Update kustomize and helm deployments to deploy the CRD.	2021-11-17 13:40:23 +02:00
Markus Lehtonen	0757248055	source/custom: move rule expressions to pkg/apis/nfd/v1alpha1 Create a new package pkg/apis/nfd/v1alpha1 and migrate the custom rule expressions over there. This is the first step in creating a new CRD API for custom rules.	2021-11-16 18:12:16 +02:00
Markus Lehtonen	47e7c47594	Send raw features over gRPC Enable transfer of raw features between nfd-worker and nfd-master.	2021-11-16 17:32:28 +02:00
Markus Lehtonen	d4d9a03732	grpc: extend the API to send raw features Enable transmitting the discovered "raw" features over the gRPC API. Extend pkg/api/feature with protobuf and gRPC code. In this, utilize go-to-protobuf from k8s code-generator for auto-generating the gRPC interface from golang code. The tool can be Installed with: go install k8s.io/code-generator/cmd/go-to-protobuf@v0.20.7 The auto-generated code is (re-)generated/updated with "make apigen".	2021-11-16 17:32:28 +02:00
Swati Sehgal	b444ef95a8	NFD-Topology-Updater: Bump NRT API to version v0.0.12 The NodeResourceTopology API has been made cluster scoped as in the current context a CR corresponds to a Node and since Node is a cluster scoped resource it makes sense to make NRT cluster scoped as well. Ref: https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/18 Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2021-11-16 13:28:23 +00:00
Markus Lehtonen	dd92c9a9ce	pkg/api/feature: revert back to structs instead of pointers Less error prone, as no chance for a nil pointer dereference.	2021-11-11 17:56:55 +02:00
Markus Lehtonen	9bff4b3185	pkg/api/feature: generator functions with initial values Flavor the generator helper functions with arguments for specifying the set of features to put into the generated objects.	2021-11-09 13:40:35 +02:00
Markus Lehtonen	5de4d8857c	pkg/api/feature: use pointers of structs Make it easier to mutate the feature sets.	2021-11-09 12:15:38 +02:00
Markus Lehtonen	25711799f3	pkg/resourcemonitor: fix typo in comment	2021-11-05 16:42:49 +02:00
Artyom Lukianov	45062754fd	resourcemonitor: aggregate and provide the memory and hugepages information The Kuberenetes pod resource API now exposing the memory and hugepages information for guaranteed pods. We can use this information to update NodeResourceTopology resource with memory and hugepages data. Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-11-04 10:17:10 +02:00
Artyom Lukianov	a93b660f7c	utils: add methods to fetch NUMA nodes hugepages and memory capacity The methods are used during calculation of reserved memory for system workloads. The calcualation is `resourceCapacity - resourceAllocatable`. Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-11-04 10:14:51 +02:00
Markus Lehtonen	0b386981a6	pkg/nfd-master: fix linter errors in tests	2021-10-04 09:51:38 +03:00
Kubernetes Prow Robot	9cf732b64e	Merge pull request #602 from marquiz/devel/go-generate Utilize go generate	2021-09-21 06:16:24 -07:00
Kubernetes Prow Robot	064391f310	Merge pull request #601 from marquiz/devel/feature-source-interface source: introduce FeatureSource interface	2021-09-21 05:48:25 -07:00
Markus Lehtonen	51c0d70383	Update auto-generated code Generated by running "make generate".	2021-09-21 13:37:36 +03:00
Markus Lehtonen	9487fbeb18	Utilize go generate Use 'go generate' for auto-generating code. Drop the old 'mock' and 'apigen' makefile targets. Those are replaced with a single make generate which (re-)generates everything.	2021-09-21 13:36:37 +03:00
Swati Sehgal	a311719d1e	topologyupdater: Updates based on latest changes made to CRD API There have been recent changes made to the noderesourcetopology API storing the proto file generated using go-to-protobuf tool and this code inports the proto generated in the API in the topology-updater.proto The PRs corresponding to the changes are as follows: https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/9 https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/13 Commands used to generate topology-updater.pb.go file: go install github.com/golang/protobuf/protoc-gen-go@v1.4.3 go mod vendor protoc --go_opt=paths=source_relative --go_out=plugins=grpc:. pkg/topologyupdater/topology-updater.proto -I. -Ivendor As part of implmentation of this patch, reserved (non-allocatable) CPUs are evaluated by performing a difference between all the CPUs on a system (determined by using ghw) and allocatable CPUs (determined by querying GetAllocatableResources podResource API endpoint). When aggregator creates the NUMA zones, it will skip the zone creation if there are no allocatable resources. In this update we creates those missing zone with zero allocatable/available resources so we won't have holes in the array of reported zones. Co-Authored-by: Talor Itzhak <titzhak@redhat.com> Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2021-09-21 10:48:10 +01:00
Swati Sehgal	832f82baaa	topologyupdater: Handle pods with devices and integral CPU requests For accounting we should consider all guaranteed pods with integral CPU requests and all the pods with device requests This patch ensures that pods are only considered for accounting disregarding non-guranteed pods without any device request. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2021-09-21 10:48:10 +01:00
Swati Sehgal	aa7ae9265c	topologyupdater: watch/consider only guaranteed pods for accounting - Files obtained after running make mock - Run `go get github.com/vektra/mockery` and make sure that mockery is in your $PATH - run `make mock` Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2021-09-21 10:48:10 +01:00
Francesco Romani	b4c92e4eed	topologyupdater: Bootstrap nfd-topology-updater in NFD - This patch allows to expose Resource Hardware Topology information through CRDs in Node Feature Discovery. - In order to do this we introduce another software component called nfd-topology-updater in addition to the already existing software components nfd-master and nfd-worker. - nfd-master was enhanced to communicate with nfd-topology-updater over gRPC followed by creation of CRs corresponding to the nodes in the cluster exposing resource hardware topology information of that node. - Pin kubernetes dependency to one that include pod resource implementation - This code is responsible for obtaining hardware information from the system as well as pod resource information from the Pod Resource API in order to determine the allocatable resource information for each NUMA zone. This information along with Costs for NUMA zones (obtained by reading NUMA distances) is gathered by nfd-topology-updater running on all the nodes of the cluster and propagate NUMA zone costs to master in order to populate that information in the CRs corresponding to the nodes. - We use GHW facilities for obtaining system information like CPUs, topology, NUMA distances etc. - This also includes updates made to Makefile and Dockerfile and Manifests for deploying nfd-topology-updater. - This patch includes unit tests - As part of the Topology Aware Scheduling work, this patch captures the configured Topology manager scope in addition to the Topology manager policy. Based on the value of both attribues a single string will be populated to the CRD. The string value will be on of the following {SingleNUMANodeContainerLevel, SingleNUMANodePodLevel, BestEffort, Restricted, None} Co-Authored-by: Artyom Lukianov <alukiano@redhat.com> Co-Authored-by: Francesco Romani <fromani@redhat.com> Co-Authored-by: Talor Itzhak <titzhak@redhat.com> Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2021-09-21 10:47:39 +01:00
Francesco Romani	00cc07da76	topologyupdater: gRPC API definition Setup the topologyupdater API for gRPC communication of nfd-topology-updater with master We generate pb.go file to reflect latest dependency changes using github.com/golang/protobuf/protoc-gen-go and generate grpc files via: `protoc pkg/topologyupdater/topology-updater.proto --go_out=plugins=grpc:.` Please refer to: https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/blob/master/pkg/apis/topology/v1alpha1/types.go Co-Authored-by: Artyom Lukianov <alukiano@redhat.com> Co-Authored-by: Francesco Romani <fromani@redhat.com> Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2021-09-21 10:47:39 +01:00
Markus Lehtonen	852cf4b61d	source: introduce FeatureSource interface Specify a new interface for managing "raw" feature data. This is the first step to separate raw feature data from node labels. None of the feature sources implement this interface, yet. This patch unifies the data format of "raw" features by dividing them into three different basic types. - keys, a set of names without any associated values, e.g. CPUID flags or loaded kernel modules - values, a map of key-value pairs, for features with a single value, e.g. kernel config flags or os version - instances, a list of instances each of which has multiple attributes (key-value pairs of their own), e.g. PCI or USB devices The new feature data types are defined in a new "pkg/api/feature" package, catering decoupling and re-usability of code e.g. within future extentions of the NFD gRPC API. Rename the Discover() method of LabelSource interface to GetLabels().	2021-09-20 09:58:07 +03:00
Markus Lehtonen	81378a3235	source: make sources register themselves Implement new registration infrastructure under the "source" package. This change loosens the coupling between label sources and the nfd-worker, making it easier to refactor and move the code around. Also, create a separate interface (ConfigurableSource) for configurable feature sources in order to eliminate boilerplate code. Add safety checks to the sources that they actually implement the interfaces they should. In sake of consistency and predictability (of behavior) change all methods of the sources to use pointer receivers. Add simple unit tests for the new functionality and include source/... into make test target.	2021-09-15 18:41:37 +03:00
Markus Lehtonen	befa7e9796	source: rename FeatureSource to LabelSource Prepare for separating feature detection from label creation.	2021-09-13 22:48:33 +03:00
Kubernetes Prow Robot	189f86bec8	Merge pull request #548 from marquiz/devel/profile-ns nfd-master: allow profile.node.kubernetes.io label ns	2021-08-27 07:24:04 -07:00
Markus Lehtonen	112744bc50	nfd-worker: split out gRPC connection handling Refactor the worker code and split out gRPC client connection handling into a separate base type. The intent is to promote re-usability of code for other NFD clients, too.	2021-08-20 15:29:27 +03:00
Carlos Eduardo Arango Gutierrez	dece85b394	Add livenessProbe via grpc to nfd-master Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>	2021-08-18 10:23:10 -05:00
Markus Lehtonen	55bd633425	nfd-master: allow profile.node.kubernetes.io label ns Add a separate label namespace for profile labels, intended for user-specified higher level "meta features". Also sub-namespaces of this (i.e. <sub-ns>.profile.node.kubernetes.io) are allowed.	2021-08-10 19:39:59 +03:00
Markus Lehtonen	c3760fbbab	nfd-master: rename LabelNs to FeatureLabelNs	2021-08-10 19:13:08 +03:00
Kubernetes Prow Robot	4a22a39928	Merge pull request #536 from marquiz/devel/label-sub-ns nfd-master: allow sub-namespaces of the default label ns	2021-08-10 04:19:18 -07:00
Markus Lehtonen	eb666f521d	nfd-master: allow sub-namespaces of the default label ns Allow <sub-ns>.feature.node.kubernetes.io label namespaces. Makes it possible to have e.g. vendor specific label ns without the need to user -extra-label-ns.	2021-08-10 11:41:52 +03:00
Markus Lehtonen	d12e62b1fe	Makefile: add apigen target For auto-generating api(s). Also, re-generate/refresh the gRPC with `make apigen` (with protoc v3.17.3 and protoc-gen-go from github.com/golang/protobuf v1.5.2) to sync up things.	2021-07-07 16:01:10 +03:00
Markus Lehtonen	a55783d533	Straighten wrinkles in lint fixes Fix small mistakes that slipped through with lint fixes (in `1230945564`).	2021-07-07 14:32:11 +03:00
Carlos Eduardo Arango Gutierrez	1230945564	make golint happy Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>	2021-06-14 12:27:58 -05:00
Carlos Eduardo Arango Gutierrez	894b7901ff	make gofmt happy by running gofmt -s Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>	2021-06-14 12:24:44 -05:00
Markus Lehtonen	99d223b029	utils/dump: do not print empty header line Makes log output cleaner.	2021-06-11 09:29:49 +03:00
robertdavidsmith	77bd4e4cf6	Accept client certs based on SAN, not just CN (#514 ) * first attempt at SAN-based VerifyNodeName * Update docs on verify-node-name	2021-04-20 01:44:32 -07:00
Kubernetes Prow Robot	c0e1000a7d	Merge pull request #474 from marquiz/devel/worker-log-verbosity nfd-worker: don't log labels returned by sources by default	2021-03-15 12:52:34 -07:00
Markus Lehtonen	6c6249a599	nfd-worker: don't log labels returned by sources by default Reduce default log verbosity. Only print out labels if log verbosity is 1 or higher ('core.klog.v: 1' config file option or '-v 1' on command line). Also, dump the labels in a reproducible (sorted) format.	2021-03-15 21:42:33 +02:00
Kubernetes Prow Robot	03f53d85e9	Merge pull request #475 from marquiz/devel/grpc-klog pkg/utils: show correct source file in gRPC logs	2021-03-11 06:20:24 -08:00
Markus Lehtonen	fb67a5027b	pkg/utils: show correct source file in gRPC logs Unwind two call frames so that the source (file:line) of the log message is correctly displayed.	2021-03-11 11:36:55 +02:00
Markus Lehtonen	8d67fc1122	pkg/utils: add dump functions A simple functions for pretty-printing and logging json-marshallable objects.	2021-03-11 07:12:22 +02:00
Markus Lehtonen	2d20a2ff7c	nfd-worker: support certificate rotation Watch for changes in TLS files and re-connect to nfd-master in the event of changes.	2021-03-09 14:40:51 +02:00
Markus Lehtonen	e771a35a21	nfd-master: support certificate rotation Add a helper/wrapper in pkg/utils to handle gRPC server-side certificate rotation.	2021-03-09 14:40:04 +02:00
Markus Lehtonen	dfc2596a22	pkg/utils: generalize file watcher Add the capability to watch multiple files. Move it to a separate package in order to make it reusable.	2021-03-09 14:20:34 +02:00
Markus Lehtonen	8af3a40ca7	logging: set grpc to use klog for logging	2021-03-05 14:44:44 +02:00
Markus Lehtonen	38d493aa67	pkg/utils: fix possible segfault in RegexpVal.Set	2021-03-02 22:46:34 +02:00
Markus Lehtonen	dd7691c486	nfd-worker: improve log messages of config handling	2021-03-02 18:49:58 +02:00
Carlos Eduardo Arango Gutierrez	389a8f87cf	logging: start log messages with lower case Standarize logs to be lower case. Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>	2021-03-01 10:07:21 -05:00
Markus Lehtonen	5e6f0779e9	nfd-worker: stop masking crashes in feature discovery The code should be stable enough. If there are fatal bugs causing the discovery to panic/segfault that should be made visible instead of semi-siently hiding it. Also, this caused one (negative) test case to fail undetected which is now fixed.	2021-03-01 09:14:19 +02:00
Markus Lehtonen	3f18e880b4	nfd-worker: dynamic configuration of klog Make it possible to dynamically (at run-time) alter most of the logging configuration from the config file.	2021-02-25 16:10:43 +02:00
Markus Lehtonen	7da7fde8f6	nfd-worker: switch to klog Greatly expands logging capabilities and flexibility with verbosity options, among other things.	2021-02-25 16:10:43 +02:00
Markus Lehtonen	3ffb7b8fc5	nfd-master: switch to klog	2021-02-25 07:50:37 +02:00
Markus Lehtonen	3fd61eacdb	nfd-worker: switch to flag in command line parsing	2021-02-24 12:06:16 +02:00
Markus Lehtonen	47033db9c1	nfd-master: use flag for command line parsing	2021-02-24 12:06:16 +02:00
Markus Lehtonen	6b744d4179	nfd-worker: extend unit test coverage of config handling Add test cases for verifying the core config. Also, add asynchronous tests for basic verification of dynamic config file updates.	2021-02-17 21:52:25 +02:00
Markus Lehtonen	2b24ed2c18	nfd-worker: implement Stop() method	2021-02-17 21:50:58 +02:00
Markus Lehtonen	278ccdb997	source/fake: make the fake source configurable Enables more flexible testing.	2021-02-17 21:50:58 +02:00
Markus Lehtonen	c2c9dff724	nfd-worker: bail out on invalid config file Changes the behaviour so that if the specified configuration file exists it must be valid. Error out at startup if the config is invalid. Similarly, exit with an error at runtime if the config file becomes invalid. Bailing out, instead of just printing an error, was a deliberate choice in order to make configuration mistakes evident. Having no configuration file is tolerated, however. If the specified configuration file does not exists nfd-worker resorts to default settings.	2021-02-17 21:42:50 +02:00
Markus Lehtonen	7e88f00e05	nfd-worker: add core.sources config option Add a config file option for controlling the enabled feature sources, aimed at replacing the --sources command line flag which is now marked as deprecated. The command line flag takes precedence over the config file option.	2021-02-17 21:36:20 +02:00
Markus Lehtonen	ed177350fc	nfd-worker: add core.labelWhiteList config option Add a config file option for label whitelisting. Deprecate the --label-whitelist command line flag. Note that the command line flag has higher priority than the config file option.	2021-02-17 21:35:44 +02:00
Markus Lehtonen	d1d8de944e	nfd-worker: add core.sleepInterval config option Add a new config file option for (dynamically) controlling the sleep interval. At the same time, deprecate the --sleep-interval command line flag. The command line flag takes precedence over the config file option.	2021-02-17 21:35:13 +02:00
Markus Lehtonen	e6bdc17d8c	nfd-worker: add core config Allows dynamic (re-)configuration of most nfd-worker options. The goal is to have most configuration parameters specified in the configuration file and deprecate most of the command line flags. The priority is intended to be such that command line flags override whatever is specified in the configuration file. Thus, specifying something on the command line effectively disables dynamic configurability of that parameter. This patch adds core.noPublish config file option to demonstrate how the new mechanism is supposed to work. The --no-publish command line flag takes precedence over this config file option.	2021-02-17 21:35:12 +02:00
Kubernetes Prow Robot	85bde7f749	Merge pull request #431 from marquiz/devel/master-instance-flag nfd-master: implement --instance flag	2021-02-11 02:40:15 -08:00
Markus Lehtonen	29910464a0	nfd-worker: always re-label after a re-config event Always do re-discovery and re-labeling after a configuration file change. his way the new config comes into effect immediately, even if the sleep interval is long (or infinite) # Please enter the commit message for your changes. Lines starting	2021-02-10 22:09:27 +02:00
Markus Lehtonen	b6ff514853	nfd-worker: use fsnotify for watching for config file changes Add support for detecting configuration file changes via file system notifications (fsnotify). Watches are added for the whole directory chain (up to root directory) so that all changes (even directory renames) affecting the given configuration file path are captured. Previously dynamic (re-)configuration of nfd-worker was implemented by (re-)reading the configuration file on every labeling pass. This was simple and effective, even if a bit wasteful. However, it didn't provide asynchronous configuration updates that will be required for e.g. controlling the "sleep-interval" parameter dynamically which will be implemented by later patches.	2021-02-10 22:09:27 +02:00
Markus Lehtonen	6958a6677f	nfd-worker: use timer channel for sleep interval	2021-02-10 22:09:27 +02:00
Markus Lehtonen	e52ec3480f	nfd-master: implement --instance flag This can be used to help running multiple parallel NFD deployments in the same cluster. The flag changes the node annotation namespace to <instance>.nfd.node.kubernetes.io allowing different nfd-master intances to store metadata in separate annotations.	2021-02-10 13:48:31 +02:00
Markus Lehtonen	705687192d	nfd-master: make updateNodeFeatures a method of nfdMaster	2021-02-10 13:46:59 +02:00
Markus Lehtonen	cdca6d667a	nfd-master: make nodeName non-global	2021-02-10 13:46:59 +02:00
Markus Lehtonen	b146508e64	nfd-master: drop separate labelerServer type Simplify code by changing nfdMaster to implement LabelerServer interface by itself.	2021-02-10 13:46:59 +02:00
Markus Lehtonen	76b95b6c55	Replace improper usage of filepath.Join with path.Join In JSON and kubernetes API object names we want to use slashes instead of the OS dependent file path separator.	2021-02-10 12:54:31 +02:00
Markus Lehtonen	19b8f2cd3d	nfd-master: more detailed unit testing of extended resources	2020-11-24 12:45:06 +02:00
Markus Lehtonen	d17743a0b9	nfd-master: handle label annotations in the same func Handle both creation and parsing of the "feature-labels" and "extended-resources" annotations in the function. I think this is more logical to keep them together.	2020-11-24 12:45:06 +02:00
Markus Lehtonen	95ff300d74	nfd-master: patch node object instead of rewriting it When updating node labels and annotations use JSON patches instead of doing a read-modify-write on the whole node object. Patching is already being used in managing extended resources so some of the existing code was re-usable. This patch should mitigate the problem of node update failures caused by race conditions (a change in the node object between our read and write) resulting e.g. in errors/restarts in nfd worker pods.	2020-11-24 12:45:06 +02:00
Markus Lehtonen	1ea301d272	nfd-master: change statusOp to a more generalized JSON patch Generalize and rename 'statusOp' type to a more flexible 'JsonPatch'. Move it to the apihelper package.	2020-11-24 12:45:06 +02:00
Markus Lehtonen	bb1e4c60fb	nfd-master: use namespaced label and annotation names internally For historical reasons the labels in the default nfd namespace have been internally represented without the namespace part. I.e. instead of "feature.node.kubernetes.io/foo" we just use "foo". NFD worker uses this representation, too, both internally and over the gRPC requests. The same scheme has been used for annotations. This patch changes NFD master to use fully namespaced label and annotation names internally. This hopefully makes the code a bit more understandable. It also addresses some corner cases making the handling of label names consistent, making it possible to use both "truncated" and fully namespaced names over the gRPC interface (and in the annotations).	2020-11-24 12:45:06 +02:00
Markus Lehtonen	29cbb2429c	nfd-worker: add special handling for --sources=all A new special value 'all' is a shortcut for enabling all feature sources. It should be the only name specified -- if any other names are specified 'all' does not take effect, but, we only enable the listed feature sources. E.g. --sources=all enables all sources, but --sources=all,cpu only enables the cpu source Also, print a warning if unknown sources are specified.	2020-11-20 16:23:53 +02:00
Artyom Lukianov	f363ba0e92	Update e2e test to work with updated dependencies Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2020-11-18 13:09:13 +02:00
Markus Lehtonen	458dd8dc58	nfd-master: add --kubeconfig flag Useful with --prune and for development purposes.	2020-09-07 07:51:42 +03:00
Markus Lehtonen	4669770020	nfd-master: implement --prune flag A new sub-command like flag for cleaning up a cluster. When --prune is specified nfd-master removes all NFD related labels, annotations and extended resources from all nodes of the cluster and exits. This should help undeployment of NFD and be useful for development.	2020-09-07 07:51:42 +03:00
Markus Lehtonen	6869a99ceb	nfd-master: fix one docstring	2020-09-07 07:51:42 +03:00
Markus Lehtonen	9e813a559c	nfd-worker: reload config on each re-discovery pass Dumb re-read/re-parse of the configuration file on every round of discoery. Probably not the most elegant solution to watch for config file changes, but, it works and doesn't cost much overhead.	2020-05-21 00:59:39 +03:00
Markus Lehtonen	a2b9df5cd3	nfd-worker: rework configuration handling Extend the FeatureSource interface with new methods for configuration handling. This enables easier on-the fly reconfiguration of the feature sources. Further, it simplifies adding config support to feature sources in the future. Stub methods are added to sources that do not currently have any configurability. The patch fixes some (corner) cases with the overrides (--options) handling, too: - Overrides were not applied if config file was missing or its parsing failed - Overrides for a certain source did not have effect if an empty config for the source was specified in the config file. This was caused by the first pass of parsing (config file) setting a nil pointer to the source-specific config, effectively detaching it from the main config. The second pass would then create a new instance of the source specific config, but, this was not visible in the feature source, of course.	2020-05-21 00:59:37 +03:00
Markus Lehtonen	c95ad3198c	nfd-worker: refactor handling of enabled sources and labels Make the list of enabled sources and the label whitelist regexp members of the nfdWorker instance. Get rid of the not-that-well-defined configureParameters() function.	2020-05-21 00:48:21 +03:00
Markus Lehtonen	818fc4cc70	nfd-worker: fix --label-whitelist Unify handling of --label-whitelist in nfd-worker and nfd-master. That is, in nfd-worker, apply the regexp filter on non-namespaced part of the label name. Brief history: 1. Originally the whitelist regexp was applied on the full namespaced label name (that would be e.g. 'feature.node.kubernetes.io/cpu-cpuid.AVX' in the current nfd version) 2. Commit `81752b2d` changed the behavior so that the regexp was applied on the non-namespaced part (that would be `cpu-cpuid.AVX`) 3. Commit `40918827` added support for custom label namespaces. With this change, the label whitelist handling diverged between nfd-worker and nfd-master. In nfd-master the whitelist regexp is always applied on the non-namespaced label name. However, in nfd-worker the whitelist handling is two-fold (and inconsistent): for labels in the standard nfd namespace regexp is applied on the non-namespaced part (e.g. `cpu-cpuid.AVX`, but, for labels in custom namespaces the regexp is applied on the full name (e.g. `example.com/my-feature`). This patch changes nfd-worker to behave similarly to nfd-master. The namespace part is now always omitted, which should be easier for the users to comprehend. Also, fixes a bug in the label name prefixing so that the name of the feature source is not prefixed into labels with custom label namespace (effectively mangling the intended namespace). For example, previously a 'example.com/feature' label from the 'custom' feature source would be prefixed with the source name, mangling it to 'custom-example.com/feature'.	2020-05-20 23:07:13 +03:00
Markus Lehtonen	a65d05bd9c	source/panic_fake: rename module to make lint happy	2020-05-20 21:48:06 +03:00
Markus Lehtonen	853609f721	nfd-master: lint fixes	2020-05-20 21:48:06 +03:00
Markus Lehtonen	523aa894a3	pkg/cpuid: lint fixes	2020-05-20 21:48:06 +03:00
Markus Lehtonen	c7b1d67b6b	nfd-worker: drop deprecated grpc.WithTimeout	2020-05-20 21:48:06 +03:00
Markus Lehtonen	91f3ddcc45	nfd-worker: lint fixes	2020-05-20 21:48:06 +03:00
Paul Mundt	c0ea69411b	usb: Add support for USB device discovery This builds on the PCI support to enable the discovery of USB devices. This is primarily intended to be used for the discovery of Edge-based heterogeneous accelerators that are connected via USB, such as the Coral USB Accelerator and the Intel NCS2 - our main motivation for adding this capability to NFD, and as part of our work in the SODALITE H2020 project. USB devices may define their base class at either the device or interface levels. In the case where no device class is set, the per-device interfaces are enumerated instead. USB devices may furthermore have multiple interfaces, which may or may not use the identical class across each interface. We therefore report device existence for each unique class definition to enable more fine-grained labelling and node selection. The default labelling format includes the class, vendor and device (product) IDs, as follows: feature.node.kubernetes.io/usb-fe_1a6e_089a.present=true As with PCI, a subset of device classes are whitelisted for matching. By default, there are only a subset of device classes under which accelerators tend to be mapped, which is used as the basis for the whitelist. These are: - Video - Miscellaneous - Application Specific - Vendor Specific For those interested in matching other classes, this may be extended by using the UsbId rule provided through the custom source. A full list of class codes is provided by the USB-IF at: https://www.usb.org/defined-class-codes For the moment, owing to a lack of a demonstrable use case, neither the subclass nor the protocol information are exposed. If this becomes necessary, support for these attributes can be trivially added. Signed-off-by: Paul Mundt <paul.mundt@adaptant.io>	2020-05-20 16:18:39 +02:00
Markus Lehtonen	409dc11389	Switch to sigs.k8s.io/yaml Replace github.com/ghodss/yaml.	2020-04-23 16:54:14 +03:00
Kubernetes Prow Robot	6d1aa73ca1	Merge pull request #298 from marquiz/devel/version version: allow undefined version	2020-03-24 09:46:48 -07:00
Kubernetes Prow Robot	7c4ff52a3c	Merge pull request #290 from adrianchiris/custom_features Support custom features	2020-03-24 08:26:48 -07:00
Markus Lehtonen	8c964b9daf	version: allow undefined version Just print a warning instead of exiting with an error if no version has been specified at build-time. This was pointless and just annoying at development time when doing builds with go directly.	2020-03-20 07:21:43 +02:00
Ukri Niemimuukko	903a939836	nfd-master: add extended resource support This adds support for making selected labels extended resources. Labels which have integer values, can be promoted to Kubernetes extended resources by listing them to the added command line flag `--resource-labels`. These labels won't then show in the node label section, they will appear only as extended resources. Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>	2020-03-19 13:19:22 +02:00
Adrian Chiris	192b3d7bdd	Add 'custom' feature Source to nfd-worker	2020-03-19 09:32:07 +02:00
Markus Lehtonen	54eaf16871	nfd-master: export label and annotation prefixes In order to be able to use the constants in end-to-end tests.	2020-02-27 14:21:00 +02:00
Markus Lehtonen	500a9e9b1a	apihelpers: use Clientset.CoreV1() Instead of the deprecated Clientset.Core().	2020-02-05 16:25:57 +02:00

... 2 3 4 5 6 ...

368 commits