Change the NFD API handler to retry on node update failures. This works
around transient failures, making sure that failed nodes (i.e. nodes
that we failed to update) don't need to wait for the 1 hour resync
period before being retried.
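As a rough illustration of the retry idea (not the actual handler code;
the helper below and the set of retriable errors are assumptions), a
node update could be wrapped with client-go's retry helper:

  package main

  import (
      apierrors "k8s.io/apimachinery/pkg/api/errors"
      "k8s.io/client-go/util/retry"
  )

  // updateNodeWithRetry retries the given node-update function with the
  // default backoff when the error looks transient. Illustrative sketch only.
  func updateNodeWithRetry(update func() error) error {
      retriable := func(err error) bool {
          return apierrors.IsConflict(err) ||
              apierrors.IsServerTimeout(err) ||
              apierrors.IsTooManyRequests(err)
      }
      return retry.OnError(retry.DefaultBackoff, retriable, update)
  }
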
Increase the NFD API controller resync period from 5 minutes to 1 hour.
The resync causes nfd-master to replay all NodeFeature and
NodeFeatureRule objects, effectively acting as a "big hammer reset all"
button. It should only be needed as insurance to fix labels etc. in case
they have been tampered with manually (outside NFD), and to guard
against certain bugs in NFD itself. NFD is not supposed to manage
anything fast-changing, so 1 hour should be enough.
This change only affects behavior when the NodeFeature API has been
enabled (with -enable-nodefeature-api).
Add support for management of Extended Resources via the
NodeFeatureRule CRD API.
There are usage scenarios where users want to advertise features
as extended resources instead of labels (or annotations).
This patch enables the creation of extended resources via the
NodeFeatureRule API: the resources are recorded in a node annotation and
patched into node.status.capacity and node.status.allocatable.
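As a hedged sketch of the mechanism (the resource name, quantity and
function below are made up for illustration; this is not the actual NFD
code), an extended resource can be written into node status with a JSON
patch against the status subresource:

  package main

  import (
      "context"
      "encoding/json"

      metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
      "k8s.io/apimachinery/pkg/types"
      "k8s.io/client-go/kubernetes"
  )

  // statusPatchOp is a single RFC 6902 JSON patch operation.
  type statusPatchOp struct {
      Op    string `json:"op"`
      Path  string `json:"path"`
      Value string `json:"value,omitempty"`
  }

  // advertiseExtendedResource adds one extended resource to both capacity
  // and allocatable in node.status; the resource name and quantity are
  // illustrative. "~1" escapes "/" in JSON patch paths (RFC 6901).
  func advertiseExtendedResource(cs kubernetes.Interface, nodeName string) error {
      patch := []statusPatchOp{
          {Op: "add", Path: "/status/capacity/vendor.example~1feature-x", Value: "4"},
          {Op: "add", Path: "/status/allocatable/vendor.example~1feature-x", Value: "4"},
      }
      data, err := json.Marshal(patch)
      if err != nil {
          return err
      }
      // Note the "status" subresource argument.
      _, err = cs.CoreV1().Nodes().Patch(context.TODO(), nodeName,
          types.JSONPatchType, data, metav1.PatchOptions{}, "status")
      return err
  }
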
Co-authored-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Update node status before node metadata. This fixes a problem where we
lose track of NFD-managed extended resources in case patching node
status fails. Previously we removed all labels and annotations
(including the one listing our ERs) and only after that updated node
status. If the node status update then failed, we had already lost the
annotation but the extended resources were still there, leaving them
orphaned.
Disallow taints having a key with a "kubernetes.io/" or
"*.kubernetes.io/" prefix. This is a precaution to protect the user from
messing with the "official" well-known taints of Kubernetes itself.
There is one exception: the NFD-specific namespace
"feature.node.kubernetes.io" (and its sub-namespaces) under the
kubernetes.io domain is allowed for NFD-managed taints.
Also disallow unprefixed taint keys. We don't add a default prefix to
unprefixed taints (like we do for labels) from NodeFeatureRules. This is
to prevent unpleasant surprises for users who need to manage matching
tolerations for their workloads.
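A minimal sketch of the kind of key filtering described above
(illustrative only; the allowed-namespace constant and function name are
assumptions, not NFD's actual implementation):

  package main

  import (
      "strings"

      corev1 "k8s.io/api/core/v1"
      "k8s.io/klog/v2"
  )

  // filterTaintKeys drops taints whose keys are unprefixed or use a
  // disallowed kubernetes.io namespace.
  func filterTaintKeys(taints []corev1.Taint) []corev1.Taint {
      const allowedNs = "feature.node.kubernetes.io"

      var out []corev1.Taint
      for _, t := range taints {
          prefix, _, found := strings.Cut(t.Key, "/")
          if !found {
              // Unprefixed keys are not allowed; no default prefix is added.
              klog.Errorf("taint key %q has no prefix, ignoring", t.Key)
              continue
          }
          nfdOwned := prefix == allowedNs || strings.HasSuffix(prefix, "."+allowedNs)
          if !nfdOwned && (prefix == "kubernetes.io" || strings.HasSuffix(prefix, ".kubernetes.io")) {
              klog.Errorf("taint key prefix %q is not allowed, ignoring", prefix)
              continue
          }
          out = append(out, t)
      }
      return out
  }
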
Similar to nfd-worker, this PR adds support for dynamic run-time
configuration of nfd-master through a config file. We use a JSON or YAML
configuration file along with fsnotify in order to watch for changes in
the config file. As a result, we allow dynamic control of logging
params, allowed namespaces, extended resources, label whitelisting, and
denied namespaces.
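A minimal sketch of the config-watching mechanism, assuming a
hypothetical config struct and helper (the real nfd-master config schema
and reload logic differ in detail):

  package main

  import (
      "log"
      "os"

      "github.com/fsnotify/fsnotify"
      "sigs.k8s.io/yaml"
  )

  // masterConfig lists the kind of options made dynamically configurable;
  // the field names here are illustrative, not the authoritative schema.
  type masterConfig struct {
      LabelWhiteList string   `json:"labelWhiteList,omitempty"`
      ExtraLabelNs   []string `json:"extraLabelNs,omitempty"`
      DenyLabelNs    []string `json:"denyLabelNs,omitempty"`
      ResourceLabels []string `json:"resourceLabels,omitempty"`
  }

  // watchConfig re-reads and applies the config file whenever fsnotify
  // reports that it was written or re-created.
  func watchConfig(path string, apply func(*masterConfig)) error {
      w, err := fsnotify.NewWatcher()
      if err != nil {
          return err
      }
      if err := w.Add(path); err != nil {
          return err
      }
      go func() {
          for {
              select {
              case ev := <-w.Events:
                  if ev.Op&(fsnotify.Write|fsnotify.Create) == 0 {
                      continue
                  }
                  data, err := os.ReadFile(path)
                  if err != nil {
                      log.Printf("failed to read config: %v", err)
                      continue
                  }
                  c := &masterConfig{}
                  // sigs.k8s.io/yaml parses both YAML and JSON input.
                  if err := yaml.Unmarshal(data, c); err != nil {
                      log.Printf("failed to parse config: %v", err)
                      continue
                  }
                  apply(c)
              case err := <-w.Errors:
                  log.Printf("fsnotify error: %v", err)
              }
          }
      }()
      return nil
  }
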
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
Correctly handle the case where no NodeFeature objects exist for a
certain node (and the NodeFeature API has been enabled with
-enable-nodefeature-api). In this case all the labels should be removed.
We want to always update all nodes at startup. Without this patch we
don't get any update event from the controller if no NodeFeature or
NodeFeatureRule objects exist in the cluster. Thus all nodes would stay
untouched whereas we really want to remove all labels from all nodes in
this case.
Implement a naive ratelimiter for node update events originating from
the NFD API. We might get a ton of events in a short interval. The
simplest example is startup when we get a separate Add event for every
NodeFeature and NodeFeatureRule object. Without rate limiting we
run "update all nodes" separately for each NodeFeatureRule object, plus,
we would run "update node X" separately for each NodeFeature object
targeting node X. This is a huge amount of wasted work because in
principle just running "update all nodes" once should be enough.
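A naive sketch of the coalescing idea (illustrative only; not the actual
ratelimiter code): the first event arms a timer and any events arriving
within the delay are folded into the same pending "update all nodes" run.

  package main

  import (
      "sync"
      "time"
  )

  // nodeUpdateLimiter coalesces bursts of update requests into a single
  // deferred update.
  type nodeUpdateLimiter struct {
      sync.Mutex
      delay   time.Duration
      pending *time.Timer
      update  func() // e.g. "update all nodes"
  }

  func (l *nodeUpdateLimiter) request() {
      l.Lock()
      defer l.Unlock()
      if l.pending != nil {
          // An update is already scheduled; this event is coalesced into it.
          return
      }
      l.pending = time.AfterFunc(l.delay, func() {
          l.Lock()
          l.pending = nil
          l.Unlock()
          l.update()
      })
  }
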
Implement handling of multiple NodeFeature objects by merging all
objects (targeting a certain node) into one before processing the data.
This patch implements MergeInto() methods for all required data types.
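For illustration, a MergeInto() for a simple attribute-style feature set
could look like the following (the type name is simplified, not the
exact NFD API type):

  package main

  // AttributeFeatureSet is a simplified key-value feature set.
  type AttributeFeatureSet struct {
      Elements map[string]string
  }

  // MergeInto copies the elements of in into out, overriding existing keys.
  func (in *AttributeFeatureSet) MergeInto(out *AttributeFeatureSet) {
      if in.Elements != nil {
          if out.Elements == nil {
              out.Elements = make(map[string]string, len(in.Elements))
          }
          for k, v := range in.Elements {
              out.Elements[k] = v
          }
      }
  }
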
With support for multiple NodeFeature objects per node, the "NFD API
workflow" can be easily demonstrated and tested from the command line.
Creating the following object (assuming node-n exists in the cluster):
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeature
metadata:
  labels:
    nfd.node.kubernetes.io/node-name: node-n
  name: my-features-for-node-n
spec:
  # Features for NodeFeatureRule matching
  features:
    flags:
      vendor.domain-a:
        elements:
          feature-x: {}
    attributes:
      vendor.domain-b:
        elements:
          feature-y: "foo"
          feature-z: "123"
    instances:
      vendor.domain-c:
        elements:
        - attributes:
            name: "elem-1"
            vendor: "acme"
        - attributes:
            name: "elem-2"
            vendor: "acme"
  # Labels to be created
  labels:
    vendor-feature.enabled: "true"
    vendor-setting.value: "100"
will create two feature labels:
feature.node.kubernetes.io/vendor-feature.enabled: "true"
feature.node.kubernetes.io/vendor-setting.value: "100"
In addition it will advertise hidden/raw features that can be used for
custom rules in NodeFeatureRule objects. Now, creating a NodeFeatureRule
object:
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: my-rule
spec:
  rules:
  - name: "my feature rule"
    labels:
      "my-feature": "true"
    matchFeatures:
    - feature: vendor.domain-a
      matchExpressions:
        feature-x: {op: Exists}
    - feature: vendor.domain-c
      matchExpressions:
        vendor: {op: In, value: ["acme"]}
will match the features in the NodeFeature object above and cause one
more label to be created:
feature.node.kubernetes.io/my-feature: "true"
Deprecate the '-featurerules-controller' command line flag as the name
does not describe the functionality anymore: in practice it controls the
CRD controller handling both NodeFeature and NodeFeatureRule objects.
The patch introduces a duplicate, more generally named, flag
'-crd-controller'. A warning is printed in the log if
the '-featurerules-controller' flag is encountered.
Add initial support for handling NodeFeature objects. With this patch
nfd-master watches NodeFeature objects in all namespaces and reacts to
changes in any of these. The node which a certain NodeFeature object
affects is determined by the "nfd.node.kubernetes.io/node-name" label of
the object. When a NodeFeature object targeting a certain node is
changed, nfd-master needs to process all other objects targeting the
same node, too, because there may be dependencies between them.
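For illustration, the per-node lookup can be expressed as a
label-selector list (shown here with the dynamic client for brevity;
nfd-master itself uses generated clients/informers, so treat this as an
illustrative sketch):

  package main

  import (
      "context"

      metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
      "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
      "k8s.io/apimachinery/pkg/runtime/schema"
      "k8s.io/client-go/dynamic"
  )

  // listNodeFeaturesForNode lists NodeFeature objects in all namespaces
  // targeting the given node by filtering on the node-name label.
  func listNodeFeaturesForNode(dc dynamic.Interface, nodeName string) (*unstructured.UnstructuredList, error) {
      gvr := schema.GroupVersionResource{
          Group:    "nfd.k8s-sigs.io",
          Version:  "v1alpha1",
          Resource: "nodefeatures",
      }
      return dc.Resource(gvr).Namespace(metav1.NamespaceAll).List(context.TODO(),
          metav1.ListOptions{
              LabelSelector: "nfd.node.kubernetes.io/node-name=" + nodeName,
          })
  }
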
Add a new command line flag for selecting between gRPC and NodeFeature
CRD API as the source of feature requests. Enabling NodeFeature API
disables the gRPC interface.
  -enable-nodefeature-api    enable NodeFeature CRD API for incoming
                             feature requests, will disable the gRPC
                             interface (defaults to false)
It is not possible to serve gRPC and watch NodeFeature objects at the
same time. This is deliberate to avoid labeling races e.g. by nfd-worker
sending gRPC requests but NodeFeature objects in the cluster
"overriding" those changes (labels from the gRPC requests will get
overridden when NodeFeature objects are processed).
Drop the gRPC communication to nfd-master and connect to the Kubernetes
API server directly when updating NodeResourceTopology objects.
Topology-updater already has connection to the API server for listing
Pods, so this is not that dramatic a change. It also simplifies the code
a lot as there is no need for the NFD gRPC client and no need for
managing TLS certs/keys.
This change aligns nfd-topology-updater with the future direction of
nfd-worker where the gRPC API is being dropped and replaced by a
CRD-based API.
This patch also updates deployment files and documentation to reflect
this change.
This commit extends the NFD master code to support adding node taints
from the NodeFeatureRule CR. We also introduce a new annotation for
taints which helps to identify whether a taint set on a node is owned
by NFD or not. When the user deletes a taint entry from the
NodeFeatureRule CR, NFD will remove the taint from the node. But to
avoid accidental deletion of taints not owned by NFD, it needs to know
the owner. Keeping track of NFD-set taints in the annotation is what
makes this owner filtering possible. Also, an enable-taints flag is
added to allow users to opt in to or out of the node tainting feature.
The flag takes precedence over taints defined in the NodeFeatureRule
CR. In other words, if enable-taints is set to false (disabled) and the
user still defines taints in the CR, NFD will ignore those taints and
skip setting them on the node.
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
Flatten the data structure that stores features, dropping the "domain"
level from the data model. That extra level of hierarchy brought little
benefit but just caused some extra complexity, instead. The new
structure nicely matches what we have in the NodeFeatureRule object
(the matchFeatures field uses the same flat structure, with the
"feature" field having a value of the form <domain>.<feature>, e.g.
"kernel.version").
This is pre-work for introducing a new "node feature" CRD that contains
the raw feature data. It makes the life of both users and developers
easier when both CRDs, plus our internal code, handle feature data in a
similar flat structure.
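A simplified sketch of the flattened layout (type names are
illustrative, not the exact API types): each feature type is a map keyed
by "<domain>.<feature>" with no separate domain level.

  package main

  // Features is a flat store of all discovered features.
  type Features struct {
      Flags      map[string]FlagFeatureSet      // e.g. "cpu.cpuid"
      Attributes map[string]AttributeFeatureSet // e.g. "kernel.version"
      Instances  map[string]InstanceFeatureSet  // e.g. "pci.device"
  }

  // FlagFeatureSet: presence of an element is the information.
  type FlagFeatureSet struct {
      Elements map[string]struct{}
  }

  // AttributeFeatureSet: key-value pairs.
  type AttributeFeatureSet struct {
      Elements map[string]string
  }

  // InstanceFeatureSet: a list of elements, each with its own attributes.
  type InstanceFeatureSet struct {
      Elements []map[string]string
  }
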
Move the previously-protobuf-only internal "feature api" over to the
public "nfd api" package. This is in preparation for introducing a new
CRD API for communicating features.
This patch carries no functional changes. Just moving code around.
Refactor the code so that the initialization and running of the gRPC
server is done in a separate function. The goal is to make the code more
maintainable in terms of disabling (and eventually removing) the gRPC
functionality in the future.
Remove the cleanup code that removes ancient NFD labels with the
node.alpha.kubernetes-incubator.io/ prefix. This label namespace was
deprecated/dropped already in v0.4.0 so it should be safe to drop this
code.
Use the single-dash (i.e. '-option' instead of '--option') format
consistently across log messages and documentation. This is the format
that was mostly used, already, and shown by command line help of the
binaries, for example.
Support backreferencing of output values from previous rules. Enables
complex rule setups where custom features are further combined together
to form even more sophisticated higher level labels. The labels created
by preceding rules are available as a special 'rule.matched' feature
(for matchFeatures to use).
If referencing rules across multiple configs/CRDs, care must be taken
with the ordering. Processing order of rules in nfd-worker:
1. Static rules
2. Files from /etc/kubernetes/node-feature-discovery/custom.d/
in alphabetical order. Subdirectories are processed by reading their
files in alphabetical order.
3. Custom rules from main nfd-worker.conf
In nfd-master, NodeFeatureRule objects are processed in alphabetical
order (based on their metadata.name).
This patch also adds new 'vars' fields to the rule spec. Like 'labels',
it is a map of key-value pairs but no labels are generated from these.
The values specified in 'vars' are only added for backreferencing into
the 'rule.matched' feature. This may be desired in schemes where the
output of certain rules is only used as intermediate variables for other
rules and no labels out of these are wanted.
An example setup:
- name: "kernel feature"
labels:
kernel-feature:
matchFeatures:
- feature: kernel.version
matchExpressions:
major: {op: Gt, value: ["4"]}
- name: "intermediate var feature"
vars:
nolabel-feature: "true"
matchFeatures:
- feature: cpu.cpuid
matchExpressions:
AVX512F: {op: Exists}
- feature: pci.device
matchExpressions:
vendor: {op: In, value: ["8086"]}
device: {op: In, value: ["1234", "1235"]}
- name: top-level-feature
matchFeatures:
- feature: rule.matched
matchExpressions:
kernel-feature: "true"
nolabel-feature: "true"
Enable Custom Resource based label creation in nfd-master. This extends
the previously implemented controller stub for watching NodeFeatureRule
objects. NFD-master watches NodeFeatureRule objects in the cluster and
processes the rules on every incoming labeling request from workers.
The functionality relies on the "raw features" (identical to how
nfd-worker handles custom rules) submitted by nfd-worker, making it
independent of the label source configuration of the worker. This means
that the labeling functions as expected even if all sources in the
worker are disabled.
NOTE: nfd-master is stateless and re-labeling only happens on the
reception of SetLabelsRequest from the worker – i.e. on intervals
specified by the core.sleepInterval configuration option (or
-sleep-interval cmdline flag) of each nfd-worker instance. This means
that modification/creation of NodeFeatureRule objects does not
automatically update the node labels. Instead, the changes only become
visible when workers send their labeling requests.
Add a new command line flag for disabling/enabling the controller for
NodeFeatureRule objects. In practice, disabling the controller disables
all labels generated from rules in NodeFeatureRule objects.
Implement a simple controller stub that operates on NodeFeatureRule
objects. The controller does not yet have any functionality other than
logging changes in the (NodeFeatureRule) objects it is watching.
Also update the documentation on the -no-publish flag to match the new
functionality.
The NodeResourceTopology API has been made cluster scoped. In the
current context a CR corresponds to a Node, and since Node is a
cluster-scoped resource it makes sense to make NRT cluster scoped as
well.
Ref: https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/18
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
There have been recent changes to the noderesourcetopology API that
store the proto file generated using the go-to-protobuf tool, and this
code imports the proto generated in that API into topology-updater.proto.
The PRs corresponding to the changes are as follows:
https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/9
https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/13
Commands used to generate topology-updater.pb.go file:
go install github.com/golang/protobuf/protoc-gen-go@v1.4.3
go mod vendor
protoc --go_opt=paths=source_relative --go_out=plugins=grpc:. pkg/topologyupdater/topology-updater.proto -I. -Ivendor
As part of the implementation of this patch, reserved (non-allocatable)
CPUs are evaluated by taking the difference between all the CPUs on the
system (determined by using ghw) and the allocatable CPUs (determined by
querying the GetAllocatableResources podresources API endpoint).
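The reserved-CPU computation is essentially a set difference. A minimal
sketch of that step, using the k8s.io/utils/cpuset package purely for
illustration (the actual code gets system CPUs from ghw and allocatable
CPUs from the podresources API):

  package main

  import (
      "k8s.io/utils/cpuset"
  )

  // reservedCPUs returns the non-allocatable (reserved) CPUs as the
  // difference between all CPUs on the system and the allocatable CPUs.
  // Both inputs are shown as plain CPU ID lists for simplicity.
  func reservedCPUs(allCPUs, allocatableCPUs []int) cpuset.CPUSet {
      all := cpuset.New(allCPUs...)
      allocatable := cpuset.New(allocatableCPUs...)
      return all.Difference(allocatable)
  }
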
When the aggregator creates the NUMA zones, it will skip zone creation
if there are no allocatable resources. In this update we create those
missing zones with zero allocatable/available resources so that we won't
have holes in the array of reported zones.
Co-Authored-by: Talor Itzhak <titzhak@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
- This patch makes it possible to expose Resource Hardware Topology
information through CRDs in Node Feature Discovery.
- In order to do this we introduce another software component called
nfd-topology-updater in addition to the already existing software
components nfd-master and nfd-worker.
- nfd-master was enhanced to communicate with nfd-topology-updater
over gRPC, followed by the creation of CRs corresponding to the nodes
in the cluster, exposing the resource hardware topology information
of each node.
- Pin the kubernetes dependency to one that includes the pod resources
implementation
- This code is responsible for obtaining hardware information from the system
as well as pod resource information from the Pod Resource API in order to
determine the allocatable resource information for each NUMA zone. This
information, along with costs for NUMA zones (obtained by reading NUMA
distances), is gathered by nfd-topology-updater running on all the nodes
of the cluster and propagated to the master in order to populate that
information in the CRs corresponding to the nodes.
- We use GHW facilities for obtaining system information like CPUs, topology,
NUMA distances etc.
- This also includes updates to the Makefile, Dockerfile and manifests
for deploying nfd-topology-updater.
- This patch includes unit tests
- As part of the Topology Aware Scheduling work, this patch captures
the configured Topology manager scope in addition to the Topology manager policy.
Based on the value of both attributes a single string will be populated
to the CRD. The string value will be one of the following:
{SingleNUMANodeContainerLevel, SingleNUMANodePodLevel, BestEffort,
Restricted, None}
Co-Authored-by: Artyom Lukianov <alukiano@redhat.com>
Co-Authored-by: Francesco Romani <fromani@redhat.com>
Co-Authored-by: Talor Itzhak <titzhak@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Add a separate label namespace for profile labels, intended for
user-specified higher level "meta features". Also sub-namespaces of this
(i.e. <sub-ns>.profile.node.kubernetes.io) are allowed.
Allow <sub-ns>.feature.node.kubernetes.io label namespaces. Makes it
possible to have e.g. a vendor-specific label namespace without the
need to use -extra-label-ns.
This can be used to help running multiple parallel NFD deployments in
the same cluster. The flag changes the node annotation namespace to
<instance>.nfd.node.kubernetes.io, allowing different nfd-master
instances to store metadata in separate annotations.
Handle both creation and parsing of the "feature-labels" and
"extended-resources" annotations in the function. I think this is more
logical to keep them together.
When updating node labels and annotations use JSON patches instead of
doing a read-modify-write on the whole node object. Patching is already
being used in managing extended resources so some of the existing code
was re-usable.
This patch should mitigate the problem of node update failures caused by
race conditions (a change in the node object between our read and write)
resulting e.g. in errors/restarts in nfd worker pods.
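A minimal sketch of such a metadata patch (the label names and helper
are illustrative): adding and removing individual labels with RFC 6902
operations instead of rewriting the whole Node object.

  package main

  import (
      "context"
      "encoding/json"

      metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
      "k8s.io/apimachinery/pkg/types"
      "k8s.io/client-go/kubernetes"
  )

  // patchOp is a single RFC 6902 JSON patch operation.
  type patchOp struct {
      Op    string      `json:"op"`
      Path  string      `json:"path"`
      Value interface{} `json:"value,omitempty"`
  }

  // patchNodeLabels adds one label and removes another with a single JSON
  // patch; "~1" escapes "/" in patch paths (RFC 6901). Note that a
  // "remove" op fails if the path does not exist.
  func patchNodeLabels(cs kubernetes.Interface, nodeName string) error {
      patch := []patchOp{
          {Op: "add", Path: "/metadata/labels/feature.node.kubernetes.io~1my-feature", Value: "true"},
          {Op: "remove", Path: "/metadata/labels/feature.node.kubernetes.io~1stale-feature"},
      }
      data, err := json.Marshal(patch)
      if err != nil {
          return err
      }
      _, err = cs.CoreV1().Nodes().Patch(context.TODO(), nodeName,
          types.JSONPatchType, data, metav1.PatchOptions{})
      return err
  }
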
For historical reasons the labels in the default nfd namespace have been
internally represented without the namespace part. I.e. instead of
"feature.node.kubernetes.io/foo" we just use "foo". NFD worker uses this
representation, too, both internally and over the gRPC requests. The
same scheme has been used for annotations.
This patch changes NFD master to use fully namespaced label and
annotation names internally. This hopefully makes the code a bit more
understandable. It also addresses some corner cases making the handling
of label names consistent, making it possible to use both "truncated"
and fully namespaced names over the gRPC interface (and in the
annotations).
A new sub-command-like flag for cleaning up a cluster. When --prune is
specified nfd-master removes all NFD related labels, annotations and
extended resources from all nodes of the cluster and exits.
This should help undeployment of NFD and be useful for development.
This adds support for making selected labels extended resources.
Labels which have integer values can be promoted to Kubernetes extended
resources by listing them with the added command line flag
`--resource-labels`. These labels will then not show up in the node
label section; they appear only as extended resources.
Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
Also, add a new WaitForReady() method to NfdMaster.
In practice, this quite widely tests nfd-master, too, as the tests
create an instance of NfdMaster and verify that the communication
between master and worker works.
Move most of the code under cmd/nfd-master and cmd/nfd-worker into new
packages pkg/nfd-master and pkg/nfd-worker, respectively. Makes extending
unit tests to "main" functions easier.