mirror of
https://github.com/kubernetes-sigs/node-feature-discovery.git
synced 2025-03-15 21:08:23 +00:00
Merge pull request #526 from k8stopologyawareschedwg/topology-updater-documentation
Documentation capturing enablement of NFD-Topology-Updater in NFD
commit 347b16daea
5 changed files with 526 additions and 9 deletions
|
@ -184,6 +184,8 @@ Usage of nfd-master:
|
||||||
Comma separated list of labels to be exposed as extended resources.
|
Comma separated list of labels to be exposed as extended resources.
|
||||||
-verify-node-name
|
-verify-node-name
|
||||||
Verify worker node name against the worker's TLS certificate. Only takes effect when TLS authentication has been enabled.
|
Verify worker node name against the worker's TLS certificate. Only takes effect when TLS authentication has been enabled.
|
||||||
|
-nrt-namespace
|
||||||
|
Namespace in which Node Resource Topology CR are created. Ensure that the namespace specified already exists
|
||||||
-version
|
-version
|
||||||
Print version and exit.
|
Print version and exit.
|
||||||
```
|
```
|
||||||
|
@ -242,6 +244,95 @@ stand-alone directly with `docker run`. See the
|
||||||
[default deployment](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/components/common/worker-mounts.yaml)
|
[default deployment](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/components/common/worker-mounts.yaml)
|
||||||
for up-to-date information about the required volume mounts.
|
for up-to-date information about the required volume mounts.
|
||||||
|
|
||||||
|
### NFD-Topology-Updater
|
||||||
|
|
||||||
|
In order to run nfd-topology-updater as a "stand-alone" container against your
|
||||||
|
standalone nfd-master you need to run them in the same network namespace:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
$ docker run --rm --network=container:nfd-test ${NFD_CONTAINER_IMAGE} nfd-topology-updater
|
||||||
|
2019/02/01 14:48:56 Node Feature Discovery Topology Updater <NFD_VERSION>
|
||||||
|
...
|
||||||
|
```
|
||||||
|
|
||||||
|
If you just want to try out feature discovery without connecting to nfd-master,
|
||||||
|
pass the `-no-publish` flag to nfd-topology-updater.
|
||||||
|
|
||||||
|
Command line flags of nfd-topology-updater:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
$ docker run --rm ${NFD_CONTAINER_IMAGE} nfd-topology-updater -help
|
||||||
|
Usage of nfd-topology-updater:
|
||||||
|
-add_dir_header
|
||||||
|
If true, adds the file directory to the header of the log messages
|
||||||
|
-alsologtostderr
|
||||||
|
log to standard error as well as files
|
||||||
|
-ca-file string
|
||||||
|
Root certificate for verifying connections
|
||||||
|
-cert-file string
|
||||||
|
Certificate used for authenticating connections
|
||||||
|
-key-file string
|
||||||
|
Private key matching -cert-file
|
||||||
|
-kubeconfig string
|
||||||
|
Kube config file.
|
||||||
|
-kubelet-config-file string
|
||||||
|
Kubelet config file path. (default "/host-var/lib/kubelet/config.yaml")
|
||||||
|
-log_backtrace_at value
|
||||||
|
when logging hits line file:N, emit a stack trace
|
||||||
|
-log_dir string
|
||||||
|
If non-empty, write log files in this directory
|
||||||
|
-log_file string
|
||||||
|
If non-empty, use this log file
|
||||||
|
-log_file_max_size uint
|
||||||
|
Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
|
||||||
|
-logtostderr
|
||||||
|
log to standard error instead of files (default true)
|
||||||
|
-no-publish
|
||||||
|
Do not publish discovered features to the cluster-local Kubernetes API server.
|
||||||
|
-one_output
|
||||||
|
If true, only write logs to their native severity level (vs also writing to each lower severity level)
|
||||||
|
-oneshot
|
||||||
|
Update once and exit
|
||||||
|
-podresources-socket string
|
||||||
|
Pod Resource Socket path to use. (default "/host-var/lib/kubelet/pod-resources/kubelet.sock")
|
||||||
|
-server string
|
||||||
|
NFD server address to connecto to. (default "localhost:8080")
|
||||||
|
-server-name-override string
|
||||||
|
Hostname expected from server certificate, useful in testing
|
||||||
|
-skip_headers
|
||||||
|
If true, avoid header prefixes in the log messages
|
||||||
|
-skip_log_headers
|
||||||
|
If true, avoid headers when opening log files
|
||||||
|
-sleep-interval duration
|
||||||
|
Time to sleep between CR updates. Non-positive value implies no CR updatation (i.e. infinite sleep). [Default: 60s] (default 1m0s)
|
||||||
|
-stderrthreshold value
|
||||||
|
logs at or above this threshold go to stderr (default 2)
|
||||||
|
-v value
|
||||||
|
number for the log level verbosity
|
||||||
|
-version
|
||||||
|
Print version and exit.
|
||||||
|
-vmodule value
|
||||||
|
comma-separated list of pattern=N settings for file-filtered logging
|
||||||
|
-watch-namespace string
|
||||||
|
Namespace to watch pods (for testing/debugging purpose). Use * for all namespaces. (default "*")
|
||||||
|
```
|
||||||
|
|
||||||
|
NOTE:
|
||||||
|
|
||||||
|
NFD topology updater needs certain directories and/or files from the
|
||||||
|
host mounted inside the NFD container. Thus, you need to provide Docker with the
|
||||||
|
correct `--volume` options in order for them to work correctly when run
|
||||||
|
stand-alone directly with `docker run`. See the
|
||||||
|
[template spec](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/components/topology-updater/topologyupdater-mounts.yaml)
|
||||||
|
for up-to-date information about the required volume mounts.
|
||||||
|
|
||||||
|
[PodResource API][podresource-api] is a prerequisite for nfd-topology-updater.
|
||||||
|
Before Kubernetes v1.23, the `kubelet` must be started with the following flag:
|
||||||
|
`--feature-gates=KubeletPodResourcesGetAllocatable=true`.
|
||||||
|
Starting with Kubernetes v1.23, `GetAllocatableResources` is enabled by default
|
||||||
|
through the `KubeletPodResourcesGetAllocatable` [feature gate][feature-gate].
|
||||||
|
|
||||||
## Documentation
|
## Documentation
|
||||||
|
|
||||||
All documentation resides under the
|
All documentation resides under the
|
||||||
|
@ -271,4 +362,6 @@ make site-build
|
||||||
This will generate html documentation under `docs/_site/`.
|
This will generate html documentation under `docs/_site/`.
|
||||||
|
|
||||||
<!-- Links -->
|
<!-- Links -->
|
||||||
[e2e-config-sample]: https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/test/e2e/e2e-test-config.example.yaml
|
[e2e-config-sample]: https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/test/e2e/e2e-test-config.example.yaml
|
||||||
|
[podresource-api]: https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources
|
||||||
|
[feature-gate]: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates
|
||||||
|
|
197
docs/advanced/topology-updater-commandline-reference.md
Normal file
|
@ -0,0 +1,197 @@
|
||||||
|
---
|
||||||
|
title: "Topology Updater Cmdline Reference"
|
||||||
|
layout: default
|
||||||
|
sort: 5
|
||||||
|
---
|
||||||
|
|
||||||
|
# NFD-Topology-Updater Commandline Flags
|
||||||
|
|
||||||
|
{: .no_toc }
|
||||||
|
|
||||||
|
## Table of Contents
|
||||||
|
|
||||||
|
{: .no_toc .text-delta }
|
||||||
|
|
||||||
|
1. TOC
|
||||||
|
{:toc}
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
To quickly view available command line flags, execute `nfd-topology-updater -help`.
|
||||||
|
In a docker container:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
docker run gcr.io/k8s-staging-nfd/node-feature-discovery:master nfd-topology-updater -help
|
||||||
|
```
|
||||||
|
|
||||||
|
### -h, -help
|
||||||
|
|
||||||
|
Print usage and exit.
|
||||||
|
|
||||||
|
### -version
|
||||||
|
|
||||||
|
Print version and exit.
|
||||||
|
|
||||||
|
### -server
|
||||||
|
|
||||||
|
The `-server` flag specifies the address of the nfd-master endpoint to
|
||||||
|
connect to.
|
||||||
|
|
||||||
|
Default: localhost:8080
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
nfd-topology-updater -server=nfd-master.nfd.svc.cluster.local:443
|
||||||
|
```
|
||||||
|
|
||||||
|
### -ca-file
|
||||||
|
|
||||||
|
The `-ca-file` is one of the three flags (together with `-cert-file` and
|
||||||
|
`-key-file`) controlling the mutual TLS authentication on the topology-updater side.
|
||||||
|
This flag specifies the TLS root certificate that is used for verifying the
|
||||||
|
authenticity of nfd-master.
|
||||||
|
|
||||||
|
Default: *empty*
|
||||||
|
|
||||||
|
Note: Must be specified together with `-cert-file` and `-key-file`
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
nfd-topology-updater -ca-file=/opt/nfd/ca.crt -cert-file=/opt/nfd/updater.crt -key-file=/opt/nfd/updater.key
|
||||||
|
```
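Because the three TLS flags must be specified together, a wrapper script can check for a partial configuration before starting the daemon. The following is a minimal sketch, not part of NFD: the `tls_args` helper is hypothetical, and it only prints the arguments instead of launching `nfd-topology-updater`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with NFD): print the TLS arguments for
# nfd-topology-updater only when all three files are given.
tls_args() {
  local ca="$1" cert="$2" key="$3"
  # All three empty: TLS is simply not configured, which is valid.
  if [ -z "$ca" ] && [ -z "$cert" ] && [ -z "$key" ]; then
    return 0
  fi
  # Partial configuration: the flags must be specified together.
  if [ -z "$ca" ] || [ -z "$cert" ] || [ -z "$key" ]; then
    echo "error: -ca-file, -cert-file and -key-file must be set together" >&2
    return 1
  fi
  printf -- '-ca-file=%s -cert-file=%s -key-file=%s\n' "$ca" "$cert" "$key"
}

# Example: print the arguments for the paths used in this reference.
tls_args /opt/nfd/ca.crt /opt/nfd/updater.crt /opt/nfd/updater.key
```

The function returns a non-zero status for a partial configuration, so a caller can stop before invoking the binary with an incomplete set of flags.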
|
||||||
|
|
||||||
|
### -cert-file
|
||||||
|
|
||||||
|
The `-cert-file` is one of the three flags (together with `-ca-file` and
|
||||||
|
`-key-file`) controlling mutual TLS authentication on the topology-updater
|
||||||
|
side. This flag specifies the TLS certificate presented for authenticating
|
||||||
|
outgoing requests.
|
||||||
|
|
||||||
|
Default: *empty*
|
||||||
|
|
||||||
|
Note: Must be specified together with `-ca-file` and `-key-file`
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
nfd-topology-updater -cert-file=/opt/nfd/updater.crt -key-file=/opt/nfd/updater.key -ca-file=/opt/nfd/ca.crt
|
||||||
|
```
|
||||||
|
|
||||||
|
### -key-file
|
||||||
|
|
||||||
|
The `-key-file` is one of the three flags (together with `-ca-file` and
|
||||||
|
`-cert-file`) controlling the mutual TLS authentication on the topology-updater
|
||||||
|
side. This flag specifies the private key corresponding to the given certificate file
|
||||||
|
(`-cert-file`) that is used for authenticating outgoing requests.
|
||||||
|
|
||||||
|
Default: *empty*
|
||||||
|
|
||||||
|
Note: Must be specified together with `-cert-file` and `-ca-file`
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
nfd-topology-updater -key-file=/opt/nfd/updater.key -cert-file=/opt/nfd/updater.crt -ca-file=/opt/nfd/ca.crt
|
||||||
|
```
|
||||||
|
|
||||||
|
### -server-name-override
|
||||||
|
|
||||||
|
The `-server-name-override` flag specifies the common name (CN) to
|
||||||
|
expect from the nfd-master TLS certificate. This flag is mostly intended for
|
||||||
|
development and debugging purposes.
|
||||||
|
|
||||||
|
Default: *empty*
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
nfd-topology-updater -server-name-override=localhost
|
||||||
|
```
|
||||||
|
|
||||||
|
### -no-publish
|
||||||
|
|
||||||
|
The `-no-publish` flag disables all communication with the nfd-master, making
|
||||||
|
it a "dry-run" flag for nfd-topology-updater. NFD-Topology-Updater runs
|
||||||
|
resource hardware topology detection normally, but no CR requests are sent to
|
||||||
|
nfd-master.
|
||||||
|
|
||||||
|
Default: *false*
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
nfd-topology-updater -no-publish
|
||||||
|
```
|
||||||
|
|
||||||
|
### -oneshot
|
||||||
|
|
||||||
|
The `-oneshot` flag causes nfd-topology-updater to exit after one pass of
|
||||||
|
resource hardware topology detection.
|
||||||
|
|
||||||
|
Default: *false*
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
nfd-topology-updater -oneshot -no-publish
|
||||||
|
```
|
||||||
|
|
||||||
|
### -sleep-interval
|
||||||
|
|
||||||
|
The `-sleep-interval` specifies the interval between resource hardware
|
||||||
|
topology re-examination (and CR updates). A non-positive value implies
|
||||||
|
infinite sleep interval, i.e. no re-detection is done.
|
||||||
|
|
||||||
|
Default: 60s
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
nfd-topology-updater -sleep-interval=1h
|
||||||
|
```
|
||||||
|
|
||||||
|
### -watch-namespace
|
||||||
|
|
||||||
|
The `-watch-namespace` specifies the namespace to ensure that resource
|
||||||
|
hardware topology examination only happens for the pods running in the
|
||||||
|
specified namespace. Pods that are not running in the specified namespace
|
||||||
|
are not considered during resource accounting. This is particularly useful
|
||||||
|
for testing and debugging purposes. A value of "*" means that all pods are
|
||||||
|
considered during the accounting process.
|
||||||
|
|
||||||
|
Default: "*"
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
nfd-topology-updater -watch-namespace=rte
|
||||||
|
```
|
||||||
|
|
||||||
|
### -kubelet-config-file
|
||||||
|
|
||||||
|
The `-kubelet-config-file` specifies the path to the Kubelet's configuration
|
||||||
|
file.
|
||||||
|
|
||||||
|
Default: /host-var/lib/kubelet/config.yaml
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
nfd-topology-updater -kubelet-config-file=/var/lib/kubelet/config.yaml
|
||||||
|
```
|
||||||
|
|
||||||
|
### -podresources-socket
|
||||||
|
|
||||||
|
The `-podresources-socket` specifies the path to the Unix socket where kubelet
|
||||||
|
exports a gRPC service to enable discovery of in-use CPUs and devices, and to
|
||||||
|
provide metadata for them.
|
||||||
|
|
||||||
|
Default: /host-var/lib/kubelet/pod-resources/kubelet.sock
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
nfd-topology-updater -podresources-socket=/var/lib/kubelet/pod-resources/kubelet.sock
|
||||||
|
```
|
|
@ -96,7 +96,11 @@ kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deplo
|
||||||
```
|
```
|
||||||
|
|
||||||
This will create the required RBAC rules and deploy nfd-master (as a deployment) and
|
This will create the required RBAC rules and deploy nfd-master (as a deployment) and
|
||||||
nfd-worker (as a daemonset) in the `node-feature-discovery` namespace.
|
nfd-worker (as a daemonset) in the `node-feature-discovery` namespace.
|
||||||
|
|
||||||
|
**NOTE:** nfd-topology-updater is not deployed as part of the `default` overlay.
|
||||||
|
Please refer to the [Master Worker Topologyupdater](#master-worker-topologyupdater)
|
||||||
|
and [Topologyupdater](#topology-updater) below.
|
||||||
|
|
||||||
Alternatively you can clone the repository and customize the deployment by
|
Alternatively you can clone the repository and customize the deployment by
|
||||||
creating your own overlays. For example, to deploy the [minimal](#minimal)
|
creating your own overlays. For example, to deploy the [minimal](#minimal)
|
||||||
|
@ -115,6 +119,10 @@ scenarios under
|
||||||
see [Master-worker pod](#master-worker-pod) below
|
see [Master-worker pod](#master-worker-pod) below
|
||||||
- [`default-job`](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/overlays/default-job):
|
- [`default-job`](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/overlays/default-job):
|
||||||
see [Worker one-shot](#worker-one-shot) below
|
see [Worker one-shot](#worker-one-shot) below
|
||||||
|
- [`master-worker-topologyupdater`](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/overlays/master-worker-topologyupdater):
|
||||||
|
see [Master Worker Topologyupdater](#master-worker-topologyupdater) below
|
||||||
|
- [`topologyupdater`](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/overlays/topologyupdater):
|
||||||
|
see [Topology Updater](#topology-updater) below
|
||||||
- [`prune`](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/overlays/prune):
|
- [`prune`](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/overlays/prune):
|
||||||
clean up the cluster after uninstallation, see
|
clean up the cluster after uninstallation, see
|
||||||
[Removing feature labels](#removing-feature-labels)
|
[Removing feature labels](#removing-feature-labels)
|
||||||
|
@ -138,10 +146,14 @@ kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deplo
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
||||||
This creates a DaemonSet that runs both nfd-worker and nfd-master in the same Pod.
|
This creates a DaemonSet that runs both nfd-worker and nfd-master in the same Pod.
|
||||||
In this case no nfd-master is run on the master node(s), but, the worker nodes
|
In this case no nfd-master is run on the master node(s), but, the worker nodes
|
||||||
are able to label themselves which may be desirable e.g. in single-node setups.
|
are able to label themselves which may be desirable e.g. in single-node setups.
|
||||||
|
|
||||||
|
**NOTE:** nfd-topology-updater is not deployed by the default-combined overlay.
|
||||||
|
To enable nfd-topology-updater in this scenario, users must customize the
|
||||||
|
deployment themselves.
|
||||||
|
|
||||||
#### Worker one-shot
|
#### Worker one-shot
|
||||||
|
|
||||||
Feature discovery can alternatively be configured as a one-shot job.
|
Feature discovery can alternatively be configured as a one-shot job.
|
||||||
|
@ -154,11 +166,44 @@ kubectl kustomize https://github.com/kubernetes-sigs/node-feature-discovery/depl
|
||||||
kubectl apply -f -
|
kubectl apply -f -
|
||||||
```
|
```
|
||||||
|
|
||||||
The example above launches as many jobs as there are non-master nodes. Note that
|
The example above launches as many jobs as there are non-master nodes. Note that
|
||||||
this approach does not guarantee running once on every node. For example,
|
this approach does not guarantee running once on every node. For example,
|
||||||
tainted, non-ready nodes or some other reasons in Job scheduling may cause some
|
tainted, non-ready nodes or some other reasons in Job scheduling may cause some
|
||||||
node(s) to run extra job instance(s) to satisfy the request.
|
node(s) to run extra job instance(s) to satisfy the request.
|
||||||
|
|
||||||
|
#### Master Worker Topologyupdater
|
||||||
|
|
||||||
|
NFD Master, NFD worker and NFD Topologyupdater can be configured to be deployed
|
||||||
|
as separate pods. The `master-worker-topologyupdater` overlay may be used to
|
||||||
|
achieve this:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/master-worker-topologyupdater?ref={{ site.release }}
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Topology Updater
|
||||||
|
|
||||||
|
In order to deploy just NFD master and NFD Topologyupdater (without nfd-worker)
|
||||||
|
use the `topologyupdater` overlay:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref={{ site.release }}
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
|
NFD Topologyupdater can be configured along with the `default` overlay
|
||||||
|
(which deploys NFD worker and NFD master) where all the software components
|
||||||
|
are deployed as separate pods. The `topologyupdater` overlay may be used
|
||||||
|
along with `default` overlay to achieve this:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
|
||||||
|
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref={{ site.release }}
|
||||||
|
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref={{ site.release }}
|
||||||
|
|
||||||
|
```
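The overlay URLs above differ only in the overlay name and the `ref` query parameter, so they can be generated by a small script. The sketch below is illustrative only: `nfd_overlay_url` and `nfd_apply_commands` are hypothetical helper names, the release `v0.10.0` is just an example value standing in for the `site.release` placeholder, and the commands are printed rather than executed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the kustomize URL for an NFD overlay at a given ref.
nfd_overlay_url() {
  local overlay="$1" ref="$2"
  printf 'https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/%s?ref=%s\n' \
    "$overlay" "$ref"
}

# Print (rather than run) the apply commands for the `default` and
# `topologyupdater` overlays, in that order.
nfd_apply_commands() {
  local ref="$1" overlay
  for overlay in default topologyupdater; do
    echo "kubectl apply -k $(nfd_overlay_url "$overlay" "$ref")"
  done
}

# Example value only; substitute the release you actually deploy.
nfd_apply_commands v0.10.0
```

Printing the commands first makes it easy to review which overlays and which release will be applied before piping the output to a shell.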
|
||||||
|
|
||||||
### Deployment with Helm
|
### Deployment with Helm
|
||||||
|
|
||||||
The Node Feature Discovery Helm chart allows you to easily deploy and manage NFD.
|
The Node Feature Discovery Helm chart allows you to easily deploy and manage NFD.
|
||||||
|
@ -350,6 +395,21 @@ The worker configuration file is watched and re-read on every change which
|
||||||
provides a simple mechanism of dynamic run-time reconfiguration. See
|
provides a simple mechanism of dynamic run-time reconfiguration. See
|
||||||
[worker configuration](#worker-configuration) for more details.
|
[worker configuration](#worker-configuration) for more details.
|
||||||
|
|
||||||
|
### NFD-Topology-Updater
|
||||||
|
|
||||||
|
NFD-Topology-Updater is preferably run as a Kubernetes DaemonSet. This assures
|
||||||
|
re-examination (and CR updates) at regular intervals, capturing changes in
|
||||||
|
the allocated resources and hence the allocatable resources on a per zone
|
||||||
|
basis. It makes sure that more CR instances are created as new nodes get
|
||||||
|
added to the cluster. Topology-Updater connects to the nfd-master service
|
||||||
|
to create CR instances corresponding to nodes.
|
||||||
|
|
||||||
|
When run as a daemonset, nodes are re-examined for the allocated resources
|
||||||
|
(to determine the information of the allocatable resources on a per zone basis
|
||||||
|
where a zone can be a NUMA node) at an interval specified using the
|
||||||
|
`-sleep-interval` option. The default sleep interval is set to 60s which is the
|
||||||
|
value used when no `-sleep-interval` is specified.
|
||||||
|
|
||||||
### Communication security with TLS
|
### Communication security with TLS
|
||||||
|
|
||||||
NFD supports mutual TLS authentication between the nfd-master and nfd-worker
|
NFD supports mutual TLS authentication between the nfd-master and nfd-worker
|
||||||
|
|
|
@ -19,10 +19,11 @@ This software enables node feature discovery for Kubernetes. It detects
|
||||||
hardware features available on each node in a Kubernetes cluster, and
|
hardware features available on each node in a Kubernetes cluster, and
|
||||||
advertises those features using node labels.
|
advertises those features using node labels.
|
||||||
|
|
||||||
NFD consists of two software components:
|
NFD consists of three software components:
|
||||||
|
|
||||||
1. nfd-master
|
1. nfd-master
|
||||||
1. nfd-worker
|
1. nfd-worker
|
||||||
|
1. nfd-topology-updater
|
||||||
|
|
||||||
## NFD-Master
|
## NFD-Master
|
||||||
|
|
||||||
|
@ -36,7 +37,17 @@ NFD-Worker is a daemon responsible for feature detection. It then communicates
|
||||||
the information to nfd-master which does the actual node labeling. One
|
the information to nfd-master which does the actual node labeling. One
|
||||||
instance of nfd-worker is supposed to be running on each node of the cluster,
|
instance of nfd-worker is supposed to be running on each node of the cluster,
|
||||||
|
|
||||||
## Feature discovery
|
## NFD-Topology-Updater
|
||||||
|
|
||||||
|
NFD-Topology-Updater is a daemon responsible for examining allocated
|
||||||
|
resources on a worker node to account for resources available to be allocated
|
||||||
|
to new pods on a per-zone basis (where a zone can be a NUMA node). It then
|
||||||
|
communicates the information to nfd-master which does the
|
||||||
|
[NodeResourceTopology CR](#noderesourcetopology-cr) creation corresponding
|
||||||
|
to all the nodes in the cluster. One instance of nfd-topology-updater is
|
||||||
|
supposed to be running on each node of the cluster.
|
||||||
|
|
||||||
|
## Feature Discovery
|
||||||
|
|
||||||
Feature discovery is divided into domain-specific feature sources:
|
Feature discovery is divided into domain-specific feature sources:
|
||||||
|
|
||||||
|
@ -93,4 +104,49 @@ command line flag affects the annotation names
|
||||||
Unapplicable annotations are not created, i.e. for example master.version is
|
Unapplicable annotations are not created, i.e. for example master.version is
|
||||||
only created on nodes running nfd-master.
|
only created on nodes running nfd-master.
|
||||||
|
|
||||||
|
## NodeResourceTopology CR
|
||||||
|
|
||||||
|
When run with NFD-Topology-Updater, NFD creates CR instances corresponding to
|
||||||
|
node resource hardware topology such as:
|
||||||
|
|
||||||
|
```yaml
apiVersion: topology.node.k8s.io/v1alpha1
kind: NodeResourceTopology
metadata:
  name: node1
topologyPolicies: ["SingleNUMANodeContainerLevel"]
zones:
  - name: node-0
    type: Node
    resources:
      - name: cpu
        capacity: 20
        allocatable: 16
        available: 10
      - name: vendor/nic1
        capacity: 3
        allocatable: 3
        available: 3
  - name: node-1
    type: Node
    resources:
      - name: cpu
        capacity: 30
        allocatable: 30
        available: 15
      - name: vendor/nic2
        capacity: 6
        allocatable: 6
        available: 6
  - name: node-2
    type: Node
    resources:
      - name: cpu
        capacity: 30
        allocatable: 30
        available: 15
      - name: vendor/nic1
        capacity: 3
        allocatable: 3
        available: 3
```
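To read the per-zone figures in the example above, it can help to work out how much of each resource is currently in use. The sketch below assumes that `available` is what remains of `allocatable` after current pod allocations, which is an interpretation of the fields rather than something stated by the CR itself; the `in_use` helper is hypothetical.

```bash
#!/usr/bin/env bash
# in_use ALLOCATABLE AVAILABLE
# Hypothetical helper: amount of a resource currently allocated in a zone,
# assuming available = allocatable - allocated.
in_use() {
  local allocatable="$1" available="$2"
  echo $(( allocatable - available ))
}

in_use 16 10   # cpu in zone node-0         -> 6
in_use 30 15   # cpu in zone node-1         -> 15
in_use 3 3     # vendor/nic1 in zone node-0 -> 0
```

Under that assumption, zone node-0 in the example has 6 CPUs allocated and all of its `vendor/nic1` devices still free.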
|
||||||
|
|
|
@ -19,14 +19,16 @@ kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deplo
|
||||||
|
|
||||||
## Verify
|
## Verify
|
||||||
|
|
||||||
Wait until NFD master and worker are running.
|
Wait until NFD master and NFD worker are running.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
$ kubectl -n node-feature-discovery get ds,deploy
|
$ kubectl -n node-feature-discovery get ds,deploy
|
||||||
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
|
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
|
||||||
daemonset.apps/nfd-worker 3 3 3 3 3 <none> 5s
|
daemonset.apps/nfd-worker 2 2 2 2 2 <none> 10s
|
||||||
|
|
||||||
NAME READY UP-TO-DATE AVAILABLE AGE
|
NAME READY UP-TO-DATE AVAILABLE AGE
|
||||||
deployment.apps/nfd-master 1/1 1 1 17s
|
deployment.apps/nfd-master 1/1 1 1 17s
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
||||||
Check that NFD feature labels have been created
|
Check that NFD feature labels have been created
|
||||||
|
@ -71,3 +73,112 @@ $ kubectl get po feature-dependent-pod -o wide
|
||||||
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
|
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
|
||||||
feature-dependent-pod 1/1 Running 0 23s 10.36.0.4 node-2 <none> <none>
|
feature-dependent-pod 1/1 Running 0 23s 10.36.0.4 node-2 <none> <none>
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## Additional Optional Installation Steps
|
||||||
|
|
||||||
|
In order to deploy nfd-master and nfd-topology-updater daemons
|
||||||
|
use the `topologyupdater` overlay.
|
||||||
|
|
||||||
|
Deploy with kustomize -- creates a new namespace, service and required RBAC
|
||||||
|
rules and nfd-master and nfd-topology-updater daemons.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref={{ site.release }}
|
||||||
|
```
|
||||||
|
|
||||||
|
**NOTE:**
|
||||||
|
|
||||||
|
[PodResource API][podresource-api] is a prerequisite for nfd-topology-updater.
|
||||||
|
|
||||||
|
Before Kubernetes v1.23, the `kubelet` must be started with the following flag:
|
||||||
|
|
||||||
|
`--feature-gates=KubeletPodResourcesGetAllocatable=true`
|
||||||
|
|
||||||
|
Starting with Kubernetes v1.23, `GetAllocatableResources` is enabled by default
|
||||||
|
through the `KubeletPodResourcesGetAllocatable` [feature gate][feature-gate].
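
The version rule in the note above can be expressed as a small check. This is a sketch under the assumption that the version is given in the usual `vMAJOR.MINOR[.PATCH]` form; `kubelet_flag_for` is a hypothetical helper, and it follows the note literally (it does not check whether the feature gate exists in very old releases).

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the extra kubelet flag needed for a given
# Kubernetes version (e.g. "v1.22.4"), or nothing for v1.23 and later.
kubelet_flag_for() {
  local version="${1#v}"          # strip a leading "v"
  local major="${version%%.*}"    # text before the first dot
  local rest="${version#*.}"      # text after the first dot
  local minor="${rest%%.*}"       # minor version number
  if [ "$major" -eq 1 ] && [ "$minor" -lt 23 ]; then
    echo "--feature-gates=KubeletPodResourcesGetAllocatable=true"
  fi
}

kubelet_flag_for v1.22.4   # prints the feature-gate flag
kubelet_flag_for v1.23.0   # prints nothing
```

The cluster version to pass in can be taken from the `VERSION` column of `kubectl get nodes`.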
|
||||||
|
|
||||||
|
## Verify
|
||||||
|
|
||||||
|
Wait until NFD master and NFD topologyupdater are running.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
$ kubectl -n node-feature-discovery get ds,deploy
|
||||||
|
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
|
||||||
|
daemonset.apps/nfd-topology-updater 2 2 2 2 2 <none> 5s
|
||||||
|
|
||||||
|
NAME READY UP-TO-DATE AVAILABLE AGE
|
||||||
|
deployment.apps/nfd-master 1/1 1 1 17s
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
|
Check that the NodeResourceTopology CR instances are created
|
||||||
|
|
||||||
|
```bash
|
||||||
|
$ kubectl get noderesourcetopologies.topology.node.k8s.io
|
||||||
|
NAME AGE
|
||||||
|
kind-control-plane 23s
|
||||||
|
kind-worker 23s
|
||||||
|
```
|
||||||
|
|
||||||
|
## Show the CR instances
|
||||||
|
|
||||||
|
```bash
|
||||||
|
$ kubectl describe noderesourcetopologies.topology.node.k8s.io kind-control-plane
|
||||||
|
Name: kind-control-plane
|
||||||
|
Namespace: default
|
||||||
|
Labels: <none>
|
||||||
|
Annotations: <none>
|
||||||
|
API Version: topology.node.k8s.io/v1alpha1
|
||||||
|
Kind: NodeResourceTopology
|
||||||
|
...
|
||||||
|
Topology Policies:
|
||||||
|
SingleNUMANodeContainerLevel
|
||||||
|
Zones:
|
||||||
|
Name: node-0
|
||||||
|
Costs:
|
||||||
|
node-0: 10
|
||||||
|
node-1: 20
|
||||||
|
Resources:
|
||||||
|
Name: Cpu
|
||||||
|
Allocatable: 3
|
||||||
|
Capacity: 3
|
||||||
|
Available: 3
|
||||||
|
Name: vendor/nic1
|
||||||
|
Allocatable: 2
|
||||||
|
Capacity: 2
|
||||||
|
Available: 2
|
||||||
|
Name: vendor/nic2
|
||||||
|
Allocatable: 2
|
||||||
|
Capacity: 2
|
||||||
|
Available: 2
|
||||||
|
Type: Node
|
||||||
|
Name: node-1
|
||||||
|
Costs:
|
||||||
|
node-0: 20
|
||||||
|
node-1: 10
|
||||||
|
Resources:
|
||||||
|
Name: Cpu
|
||||||
|
Allocatable: 4
|
||||||
|
Capacity: 4
|
||||||
|
Available: 4
|
||||||
|
Name: vendor/nic1
|
||||||
|
Allocatable: 2
|
||||||
|
Capacity: 2
|
||||||
|
Available: 2
|
||||||
|
Name: vendor/nic2
|
||||||
|
Allocatable: 2
|
||||||
|
Capacity: 2
|
||||||
|
Available: 2
|
||||||
|
Type: Node
|
||||||
|
Events: <none>
|
||||||
|
```
|
||||||
|
|
||||||
|
The CR instances created can be used to gain insight into the allocatable
|
||||||
|
resources along with the granularity of those resources at a per-zone level
|
||||||
|
(represented by node-0 and node-1 in the above example) or can be used by an
|
||||||
|
external entity (e.g. topology-aware scheduler plugin) to take an action based
|
||||||
|
on the gathered information.
|
||||||
|
|
||||||
|
<!-- Links -->
|
||||||
|
[podresource-api]: https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources
|
||||||
|
[feature-gate]: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates
|
||||||
|
|