mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00

Merge pull request #368 from marquiz/devel/gh-pages

Migrate documentation from README to docs/
Kubernetes Prow Robot 2020-10-29 12:18:06 -07:00 committed by GitHub
commit edebe815c9
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
22 changed files with 2072 additions and 1081 deletions

1061
README.md

File diff suppressed because it is too large.

@ -0,0 +1,21 @@
---
title: "Architecture"
layout: default
sort: 6
published: false
---
# Architecture
{: .no_toc }
## Table of contents
{: .no_toc .text-delta }
1. TOC
{:toc}
---
***WORK IN PROGRESS***
This page gives an architectural overview and describes the principles behind it.


@ -0,0 +1,23 @@
---
title: "Customization Guide"
layout: default
sort: 5
published: false
---
# Customization Guide
{: .no_toc }
## Table of Contents
{: .no_toc .text-delta }
1. TOC
{:toc}
---
***WORK IN PROGRESS***
This document explains with examples how to use hooks, feature files and the
custom feature source.


@ -0,0 +1,304 @@
---
title: "Developer Guide"
layout: default
sort: 1
---
# Developer Guide
{: .no_toc }
## Table of contents
{: .no_toc .text-delta }
1. TOC
{:toc}
---
## Building from Source
### Download the source code
```bash
git clone https://github.com/kubernetes-sigs/node-feature-discovery
cd node-feature-discovery
```
### Docker Build
#### Build the container image
See [customizing the build](#customizing-the-build) below for altering the
container image registry, for example.
```bash
make
```
#### Push the container image
Optional. This example uses Docker.
```bash
docker push <IMAGE_TAG>
```
#### Change the job spec to use your custom image (optional)
To use your published image from the step above instead of the
`k8s.gcr.io/nfd/node-feature-discovery` image, edit the `image`
attribute in the spec template(s) to point to the new location
(`<registry-name>/<image-name>[:<version>]`).
### Deployment
The `yamls` makefile target generates deployment specs matching your locally
built image. See [build customization](#customizing-the-build) below for
configurability, e.g. changing the deployment namespace.
```bash
K8S_NAMESPACE=my-ns make yamls
kubectl apply -f nfd-master.yaml
kubectl apply -f nfd-worker-daemonset.yaml
```
Alternatively, deploying worker and master in the same pod:
```bash
K8S_NAMESPACE=my-ns make yamls
kubectl apply -f nfd-master.yaml
kubectl apply -f nfd-daemonset-combined.yaml
```
Or worker as a one-shot job:
```bash
K8S_NAMESPACE=my-ns make yamls
kubectl apply -f nfd-master.yaml
NUM_NODES=$(kubectl get no -o jsonpath='{.items[*].metadata.name}' | wc -w)
sed s"/NUM_NODES/$NUM_NODES/" nfd-worker-job.yaml | kubectl apply -f -
```
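The `sed` step in the one-shot example replaces the `NUM_NODES` placeholder with the node count before the spec is applied. A minimal sketch of that substitution follows, assuming a two-line stand-in for the template (the real `nfd-worker-job.yaml` contains more fields) and a fixed node count instead of the `kubectl` query:

```bash
# Hypothetical stand-in for the relevant lines of nfd-worker-job.yaml;
# the real generated spec contains many more fields.
template='completions: NUM_NODES
parallelism: NUM_NODES'

# In a real cluster this value comes from the kubectl query shown above.
NUM_NODES=3

# Same substitution as in the example: the placeholder is replaced once per line.
rendered=$(printf '%s\n' "$template" | sed "s/NUM_NODES/$NUM_NODES/")
printf '%s\n' "$rendered"
```

Running the substitution on its own like this is a quick way to inspect the rendered spec before piping it to `kubectl apply -f -`.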
### Building Locally
You can also build the binaries locally:
```bash
make build
```
This will compile the binaries under `bin/`.
### Customizing the Build
There are several Makefile variables that control the build process and the
name of the resulting container image. The following are targeted for
build customization and they can be specified via environment variables or
makefile overrides.
| Variable | Description | Default value
| -------------------------- | ----------------------------------------------------------------- | ----------- |
| HOSTMOUNT_PREFIX | Prefix of system directories for feature discovery (local builds) | / (*local builds*) /host- (*container builds*)
| IMAGE_BUILD_CMD | Command to build the image | docker build
| IMAGE_BUILD_EXTRA_OPTS | Extra options to pass to build command | *empty*
| IMAGE_PUSH_CMD | Command to push the image to remote registry | docker push
| IMAGE_REGISTRY | Container image registry to use | k8s.gcr.io/nfd
| IMAGE_TAG_NAME | Container image tag name | &lt;nfd version&gt;
| IMAGE_EXTRA_TAG_NAMES | Additional container image tag(s) to create when building image | *empty*
| K8S_NAMESPACE | nfd-master and nfd-worker namespace | kube-system
| KUBECONFIG | Kubeconfig for running e2e-tests | *empty*
| E2E_TEST_CONFIG | Parameterization file of e2e-tests (see [example](test/e2e/e2e-test-config.exapmle.yaml)) | *empty*
For example, to use a custom registry:
```bash
make IMAGE_REGISTRY=<my custom registry uri>
```
Or, to specify a build tool other than Docker, you can do it in two ways:
1. via environment
```bash
IMAGE_BUILD_CMD="buildah bud" make
```
1. by overriding the variable value
```bash
make IMAGE_BUILD_CMD="buildah bud"
```
### Testing
Unit tests are automatically run as part of the container image build. You can
also run them manually in the source code tree by simply running:
```bash
make test
```
End-to-end tests are built on top of the e2e test framework of Kubernetes, and
they require a cluster to run on. To run the tests on your test
cluster, you need to specify the kubeconfig to be used:
```bash
make e2e-test KUBECONFIG=$HOME/.kube/config
```
## Running Locally
You can run NFD locally, either directly on your host OS or in containers, for
testing and development purposes. This may be useful e.g. for checking
feature detection.
### NFD-Master
When running as a standalone container, labeling is expected to fail because
the Kubernetes API is not available. Thus, it is recommended to use the
`--no-publish` command line flag. E.g.
```bash
$ export NFD_CONTAINER_IMAGE=gcr.io/k8s-staging-nfd/node-feature-discovery:master
$ docker run --rm --name=nfd-test ${NFD_CONTAINER_IMAGE} nfd-master --no-publish
2019/02/01 14:48:21 Node Feature Discovery Master <NFD_VERSION>
2019/02/01 14:48:21 gRPC server serving on port: 8080
```
Command line flags of nfd-master:
```bash
$ docker run --rm ${NFD_CONTAINER_IMAGE} nfd-master --help
...
Usage:
nfd-master [--prune] [--no-publish] [--label-whitelist=<pattern>] [--port=<port>]
[--ca-file=<path>] [--cert-file=<path>] [--key-file=<path>]
[--verify-node-name] [--extra-label-ns=<list>] [--resource-labels=<list>]
[--kubeconfig=<path>]
nfd-master -h | --help
nfd-master --version
Options:
-h --help Show this screen.
--version Output version and exit.
--prune Prune all NFD related attributes from all nodes
of the cluster and exit.
--kubeconfig=<path> Kubeconfig to use [Default: ]
--port=<port> Port on which to listen for connections.
[Default: 8080]
--ca-file=<path> Root certificate for verifying connections
[Default: ]
--cert-file=<path> Certificate used for authenticating connections
[Default: ]
--key-file=<path> Private key matching --cert-file
[Default: ]
--verify-node-name Verify worker node name against CN from the TLS
certificate. Only has effect when TLS authentication
has been enabled.
--no-publish Do not publish feature labels
--label-whitelist=<pattern> Regular expression to filter label names to
publish to the Kubernetes API server.
NB: the label namespace is omitted i.e. the filter
is only applied to the name part after '/'.
[Default: ]
--extra-label-ns=<list> Comma separated list of allowed extra label namespaces
[Default: ]
--resource-labels=<list> Comma separated list of labels to be exposed as extended resources.
[Default: ]
```
### NFD-Worker
In order to run nfd-worker as a "stand-alone" container against your
standalone nfd-master you need to run them in the same network namespace:
```bash
$ docker run --rm --network=container:nfd-test ${NFD_CONTAINER_IMAGE} nfd-worker
2019/02/01 14:48:56 Node Feature Discovery Worker <NFD_VERSION>
...
```
If you just want to try out feature discovery without connecting to nfd-master,
pass the `--no-publish` flag to nfd-worker.
Command line flags of nfd-worker:
```bash
$ docker run --rm ${NFD_CONTAINER_IMAGE} nfd-worker --help
...
nfd-worker.
Usage:
nfd-worker [--no-publish] [--sources=<sources>] [--label-whitelist=<pattern>]
[--oneshot | --sleep-interval=<seconds>] [--config=<path>]
[--options=<config>] [--server=<server>] [--server-name-override=<name>]
[--ca-file=<path>] [--cert-file=<path>] [--key-file=<path>]
nfd-worker -h | --help
nfd-worker --version
Options:
-h --help Show this screen.
--version Output version and exit.
--config=<path> Config file to use.
[Default: /etc/kubernetes/node-feature-discovery/nfd-worker.conf]
--options=<config> Specify config options from command line. Config
options are specified in the same format as in the
config file (i.e. json or yaml). These options
will override settings read from the config file.
[Default: ]
--ca-file=<path> Root certificate for verifying connections
[Default: ]
--cert-file=<path> Certificate used for authenticating connections
[Default: ]
--key-file=<path> Private key matching --cert-file
[Default: ]
--server=<server> NFD server address to connecto to.
[Default: localhost:8080]
--server-name-override=<name> Name (CN) expect from server certificate, useful
in testing
[Default: ]
--sources=<sources> Comma separated list of feature sources.
[Default: cpu,custom,iommu,kernel,local,memory,network,pci,storage,system,usb]
--no-publish Do not publish discovered features to the
cluster-local Kubernetes API server.
--label-whitelist=<pattern> Regular expression to filter label names to
publish to the Kubernetes API server.
NB: the label namespace is omitted i.e. the filter
is only applied to the name part after '/'.
[Default: ]
--oneshot Label once and exit.
--sleep-interval=<seconds> Time to sleep between re-labeling. Non-positive
value implies no re-labeling (i.e. infinite
sleep). [Default: 60s]
```
**NOTE** Some feature sources need certain directories and/or files from the
host mounted inside the NFD container. Thus, you need to provide Docker with the
correct `--volume` options in order for them to work correctly when run
stand-alone directly with `docker run`. See the
[template spec](https://github.com/kubernetes-sigs/node-feature-discovery/blob/master/nfd-worker-daemonset.yaml.template)
for up-to-date information about the required volume mounts.
## Documentation
All documentation resides under the [docs](/docs) directory in the source tree.
It is designed to be served as a html site by [GitHub
Pages](https://pages.github.com/).
Building the documentation is containerized in order to fix the build
environment. The recommended way for developing documentation is to run:
```bash
make site-serve
```
This will build the documentation in a container and serve it under
[localhost:4000/](http://localhost:4000/) making it easy to verify the results.
Any changes made to the `docs/` will automatically re-trigger a rebuild and are
reflected in the served content and can be inspected with a simple browser
refresh.
In order to just build the html documentation run:
```bash
make site-build
```
This will generate html documentation under `docs/_site/`.


@ -0,0 +1,21 @@
---
title: "E2E-Test Config Reference"
layout: default
sort: 7
published: false
---
# End-to-End Test Configuration File Reference
{: .no_toc }
## Table of contents
{: .no_toc .text-delta }
1. TOC
{:toc}
---
***WORK IN PROGRESS***
This section describes the end-to-end test configuration file.

9
docs/advanced/index.md Normal file

@ -0,0 +1,9 @@
---
title: "Advanced"
layout: default
sort: 2
---
# Advanced
Advanced topics and reference.


@ -0,0 +1,190 @@
---
title: "Master Cmdline Reference"
layout: default
sort: 2
---
# NFD-Master Commandline Flags
{: .no_toc }
## Table of Contents
{: .no_toc .text-delta }
1. TOC
{:toc}
---
To quickly view available command line flags execute `nfd-master --help`.
In a docker container:
```bash
docker run gcr.io/k8s-staging-nfd/node-feature-discovery:master nfd-master --help
```
### -h, --help
Print usage and exit.
### --version
Print version and exit.
### --prune
The `--prune` flag is a sub-command like option for cleaning up the cluster. It
causes nfd-master to remove all NFD related labels, annotations and extended
resources from all Node objects of the cluster and exit.
### --port
The `--port` flag specifies the TCP port on which nfd-master listens for
incoming requests.
Default: 8080
Example:
```bash
nfd-master --port=443
```
### --ca-file
The `--ca-file` is one of the three flags (together with `--cert-file` and
`--key-file`) controlling master-worker mutual TLS authentication on the
nfd-master side. This flag specifies the TLS root certificate that is used for
authenticating incoming connections. NFD-Worker side needs to have matching key
and cert files configured in order for the incoming requests to be accepted.
Default: *empty*
Note: Must be specified together with `--cert-file` and `--key-file`
Example:
```bash
nfd-master --ca-file=/opt/nfd/ca.crt --cert-file=/opt/nfd/master.crt --key-file=/opt/nfd/master.key
```
### --cert-file
The `--cert-file` is one of the three flags (together with `--ca-file` and
`--key-file`) controlling master-worker mutual TLS authentication on the
nfd-master side. This flag specifies the TLS certificate presented for
authenticating outgoing traffic towards nfd-worker.
Default: *empty*
Note: Must be specified together with `--ca-file` and `--key-file`
Example:
```bash
nfd-master --cert-file=/opt/nfd/master.crt --key-file=/opt/nfd/master.key --ca-file=/opt/nfd/ca.crt
```
### --key-file
The `--key-file` is one of the three flags (together with `--ca-file` and
`--cert-file`) controlling master-worker mutual TLS authentication on the
nfd-master side. This flag specifies the private key corresponding to the given
certificate file (`--cert-file`) that is used for authenticating outgoing
traffic.
Default: *empty*
Note: Must be specified together with `--cert-file` and `--ca-file`
Example:
```bash
nfd-master --key-file=/opt/nfd/master.key --cert-file=/opt/nfd/master.crt --ca-file=/opt/nfd/ca.crt
```
### --verify-node-name
The `--verify-node-name` flag controls the NodeName based authorization of
incoming requests and only has effect when mTLS authentication has been enabled
(with `--ca-file`, `--cert-file` and `--key-file`). If enabled, the worker node
name of the incoming request must match the CN in its TLS certificate. Thus,
workers are only able to label the node they are running on (or the node whose
certificate they present), and each worker must have an individual
certificate.
Node Name based authorization is disabled by default and thus it is possible
for all nfd-worker pods in the cluster to use one shared certificate, making
NFD deployment much easier.
Default: *false*
Example:
```bash
nfd-master --verify-node-name --ca-file=/opt/nfd/ca.crt \
--cert-file=/opt/nfd/master.crt --key-file=/opt/nfd/master.key
```
### --no-publish
The `--no-publish` flag disables all communication with the Kubernetes API
server, making it a "dry-run" flag for nfd-master. No Labels, Annotations or
ExtendedResources (or any other properties of any Kubernetes API objects) are
modified.
Default: *false*
Example:
```bash
nfd-master --no-publish
```
### --label-whitelist
The `--label-whitelist` flag specifies a regular expression for filtering
feature labels based on their name. Each label must match the given regular
expression in order to be published.
Note: The regular expression is only matched against the "basename" part of the
label, i.e. the part of the name after '/'. The label namespace is omitted.
Default: *empty*
Example:
```bash
nfd-master --label-whitelist='.*cpuid\.'
```
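The effect of the whitelist pattern on the "basename" part of a label can be sketched in plain shell. This is only an illustration, not nfd-master's implementation: it assumes that `grep -E` behaves like NFD's regular expression matching for this simple pattern, and the list of labels is a hand-picked sample.

```bash
# Pattern from the example above.
pattern='.*cpuid\.'

# Hand-picked sample of fully qualified feature labels (illustrative only).
labels='feature.node.kubernetes.io/cpu-cpuid.AVX
feature.node.kubernetes.io/cpu-cpuid.AESNI
feature.node.kubernetes.io/kernel-version.major'

# Strip the namespace (everything up to the first '/') and match the
# pattern against the remaining name part only.
published=$(printf '%s\n' "$labels" | while IFS= read -r label; do
  name=${label#*/}
  if printf '%s\n' "$name" | grep -Eq "$pattern"; then
    printf '%s\n' "$label"
  fi
done)
printf '%s\n' "$published"
```

Only the two `cpu-cpuid.*` labels pass the filter here; the `kernel-version.major` label is dropped because its name part does not contain `cpuid.`.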
### --extra-label-ns
The `--extra-label-ns` flag specifies a comma-separated list of allowed feature
label namespaces. By default, nfd-master only allows creating labels in the
default `feature.node.kubernetes.io` label namespace. This option can be used
to allow vendor-specific namespaces for custom labels from the local and custom
feature sources.
The same namespace control and this flag apply to Extended Resources (created
with `--resource-labels`), too.
Default: *empty*
Example:
```bash
nfd-master --extra-label-ns=vendor-1.com,vendor-2.io
```
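The namespace check described above can be pictured as a simple allow-list lookup. The sketch below is an illustration under stated assumptions, not nfd-master's actual code: it assumes exact matching of the namespace part, and the function name `ns_allowed` and the sample labels are made up for the example.

```bash
# Default namespace plus the value passed to --extra-label-ns above.
DEFAULT_NS=feature.node.kubernetes.io
EXTRA_LABEL_NS=vendor-1.com,vendor-2.io

# Hypothetical helper: succeeds if the namespace of the given label
# (the part before the first '/') is in the allow-list.
ns_allowed() {
  ns=${1%%/*}
  case ",$DEFAULT_NS,$EXTRA_LABEL_NS," in
    *",$ns,"*) return 0 ;;
    *) return 1 ;;
  esac
}

for label in \
  feature.node.kubernetes.io/cpu-cpuid.AVX \
  vendor-1.com/feature-1 \
  vendor-3.org/feature-3
do
  if ns_allowed "$label"; then
    echo "allowed:  $label"
  else
    echo "rejected: $label"
  fi
done
```

In this sketch the `vendor-3.org` label is rejected because its namespace is neither the default one nor listed in `EXTRA_LABEL_NS`.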
### --resource-labels
The `--resource-labels` flag specifies a comma-separated list of features to be
advertised as extended resources instead of labels. Features that have integer
values can be published as Extended Resources by listing them in this flag.
Default: *empty*
Example:
```bash
nfd-master --resource-labels=vendor-1.com/feature-1,vendor-2.io/feature-2
```


@ -0,0 +1,208 @@
---
title: "Worker Cmdline Reference"
layout: default
sort: 3
---
# NFD-Worker Commandline Flags
{: .no_toc }
## Table of Contents
{: .no_toc .text-delta }
1. TOC
{:toc}
---
To quickly view available command line flags execute `nfd-worker --help`.
In a docker container:
```bash
docker run gcr.io/k8s-staging-nfd/node-feature-discovery:master nfd-worker --help
```
### -h, --help
Print usage and exit.
### --version
Print version and exit.
### --config
The `--config` flag specifies the path of the nfd-worker configuration file to
use.
Default: /etc/kubernetes/node-feature-discovery/nfd-worker.conf
Example:
```bash
nfd-worker --config=/opt/nfd/worker.conf
```
### --options
The `--options` flag may be used to specify and override configuration file
options directly from the command line. The required format is the same as in
the config file, i.e. JSON or YAML. Configuration options specified via this
flag will override those from the configuration file.
Default: *empty*
Example:
```bash
nfd-worker --options='{"sources":{"cpu":{"cpuid":{"attributeWhitelist":["AVX","AVX2"]}}}}'
```
### --server
The `--server` flag specifies the address of the nfd-master endpoint to
connect to.
Default: localhost:8080
Example:
```bash
nfd-worker --server=nfd-master.nfd.svc.cluster.local:443
```
### --ca-file
The `--ca-file` is one of the three flags (together with `--cert-file` and
`--key-file`) controlling the mutual TLS authentication on the worker side.
This flag specifies the TLS root certificate that is used for verifying the
authenticity of nfd-master.
Default: *empty*
Note: Must be specified together with `--cert-file` and `--key-file`
Example:
```bash
nfd-worker --ca-file=/opt/nfd/ca.crt --cert-file=/opt/nfd/worker.crt --key-file=/opt/nfd/worker.key
```
### --cert-file
The `--cert-file` is one of the three flags (together with `--ca-file` and
`--key-file`) controlling mutual TLS authentication on the worker side. This
flag specifies the TLS certificate presented for authenticating outgoing
requests.
Default: *empty*
Note: Must be specified together with `--ca-file` and `--key-file`
Example:
```bash
nfd-worker --cert-file=/opt/nfd/worker.crt --key-file=/opt/nfd/worker.key --ca-file=/opt/nfd/ca.crt
```
### --key-file
The `--key-file` is one of the three flags (together with `--ca-file` and
`--cert-file`) controlling the mutual TLS authentication on the worker side.
This flag specifies the private key corresponding to the given certificate file
(`--cert-file`) that is used for authenticating outgoing requests.
Default: *empty*
Note: Must be specified together with `--cert-file` and `--ca-file`
Example:
```bash
nfd-worker --key-file=/opt/nfd/worker.key --cert-file=/opt/nfd/worker.crt --ca-file=/opt/nfd/ca.crt
```
### --server-name-override
The `--server-name-override` flag specifies the common name (CN) which to
expect from the nfd-master TLS certificate. This flag is mostly intended for
development and debugging purposes.
Default: *empty*
Example:
```bash
nfd-worker --server-name-override=localhost
```
### --sources
The `--sources` flag specifies a comma-separated list of enabled feature
sources.
Default: cpu,custom,iommu,kernel,local,memory,network,pci,storage,system,usb
Example:
```bash
nfd-worker --sources=kernel,system,local
```
### --no-publish
The `--no-publish` flag disables all communication with the nfd-master, making
it a "dry-run" flag for nfd-worker. NFD-Worker runs feature detection normally,
but no labeling requests are sent to nfd-master.
Default: *false*
Example:
```bash
nfd-worker --no-publish
```
### --label-whitelist
The `--label-whitelist` flag specifies a regular expression for filtering
feature labels based on their name. Each label must match the given regular
expression in order to be published.
Note: The regular expression is only matched against the "basename" part of the
label, i.e. the part of the name after '/'. The label namespace is omitted.
Default: *empty*
Example:
```bash
nfd-worker --label-whitelist='.*cpuid\.'
```
### --oneshot
The `--oneshot` flag causes nfd-worker to exit after one pass of feature
detection.
Default: *false*
Example:
```bash
nfd-worker --oneshot --no-publish
```
### --sleep-interval
The `--sleep-interval` specifies the interval between feature re-detection (and
node re-labeling). A non-positive value implies infinite sleep interval, i.e.
no re-detection or re-labeling is done.
Default: 60s
Example:
```bash
nfd-worker --sleep-interval=1h
```


@ -0,0 +1,22 @@
---
title: "Worker Config Reference"
layout: default
sort: 4
published: false
---
# NFD-Worker Configuration File Reference
{: .no_toc }
## Table of contents
{: .no_toc .text-delta }
1. TOC
{:toc}
---
***WORK IN PROGRESS***
This section is a reference to all the configuration settings in the worker
config file.


@ -0,0 +1,34 @@
---
title: "Contributing"
layout: default
sort: 3
---
# Contributing
---
## Community
You can reach us via the following channels:
- [#node-feature-discovery](https://kubernetes.slack.com/messages/node-feature-discovery)
channel in [Kubernetes Slack](https://slack.k8s.io)
- [SIG-Node](https://groups.google.com/g/kubernetes-sig-node) mailing list
- File an
[issue](https://github.com/kubernetes-sigs/node-feature-discovery/issues/new)
in this repository
## Governance
This is a
[SIG-node](https://github.com/kubernetes/community/blob/master/sig-node/README.md)
subproject, hosted under the
[Kubernetes SIGs](https://github.com/kubernetes-sigs) organization in GitHub.
The project was established in 2016 as a
[Kubernetes Incubator](https://github.com/kubernetes/community/blob/master/incubator.md)
project and migrated to Kubernetes SIGs in 2018.
## License
This is open source software released under the [Apache 2.0 License](LICENSE).


@ -0,0 +1,297 @@
---
title: "Deployment and Usage"
layout: default
sort: 3
---
# Deployment and Usage
{: .no_toc }
## Table of Contents
{: .no_toc .text-delta }
1. TOC
{:toc}
---
## Requirements
1. Linux (x86_64/Arm64/Arm)
1. [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl)
(properly set up and configured to work with your Kubernetes cluster)
## Deployment options
### Operator
Deployment using the
[Node Feature Discovery Operator][nfd-operator]
is recommended to be done via
[operatorhub.io](https://operatorhub.io/operator/nfd-operator).
1. You need to have
[OLM][OLM]
installed. If you don't, take a look at the
[latest release](https://github.com/operator-framework/operator-lifecycle-manager/releases/latest)
for detailed instructions.
1. Install the operator:
```bash
kubectl create -f https://operatorhub.io/install/nfd-operator.yaml
```
1. Create NodeFeatureDiscovery resource (in `nfd` namespace here):
```bash
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: nfd
---
apiVersion: nfd.kubernetes.io/v1alpha1
kind: NodeFeatureDiscovery
metadata:
  name: my-nfd-deployment
  namespace: nfd
EOF
```
### Deployment Templates
The template specs provided in the repo can be used directly:
```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/master/nfd-master.yaml.template
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/master/nfd-worker-daemonset.yaml.template
```
This will create the required RBAC rules and deploy nfd-master (as a
deployment) and nfd-worker (as a daemonset) in the `node-feature-discovery`
namespace.
Alternatively you can download the templates and customize the deployment
manually.
#### Master-Worker Pod
You can also run nfd-master and nfd-worker inside the same pod:
```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/master/nfd-daemonset-combined.yaml.template
```
This creates a DaemonSet that runs both nfd-worker and nfd-master in the same
Pod. In this case no nfd-master is run on the master node(s), but the worker
nodes are able to label themselves, which may be desirable e.g. in single-node
setups.
#### Worker One-shot
Feature discovery can alternatively be configured as a one-shot job.
The Job template may be used to achieve this:
```bash
NUM_NODES=$(kubectl get no -o jsonpath='{.items[*].metadata.name}' | wc -w)
curl -fs https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/master/nfd-worker-job.yaml.template | \
sed s"/NUM_NODES/$NUM_NODES/" | \
kubectl apply -f -
```
The example above launches as many jobs as there are non-master nodes. Note that
this approach does not guarantee running once on every node. For example,
tainted or non-ready nodes, or other Job scheduling constraints, may cause some
node(s) to run extra job instance(s) to satisfy the request.
### Build Your Own
If you want to use the latest development version (master branch) you need to
build your own custom image.
See the [Developer Guide](/advanced/developer-guide) for instructions on how to
build images and deploy them on your cluster.
## Usage
### NFD-Master
NFD-Master runs as a deployment (with a replica count of 1). By default,
it prefers running on the cluster's master nodes but will run on worker
nodes if no master nodes are found.
For High Availability, you should simply increase the replica count of
the deployment object. You should also look into adding
[inter-pod](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity)
affinity to prevent masters from running on the same node.
However note that inter-pod affinity is costly and is not recommended
in bigger clusters.
NFD-Master listens for connections from nfd-worker(s) and connects to the
Kubernetes API server to add node labels advertised by them.
If you have RBAC authorization enabled (as is the default e.g. with clusters
initialized with kubeadm) you need to configure the appropriate ClusterRoles,
ClusterRoleBindings and a ServiceAccount in order for NFD to create node
labels. The provided template will configure these for you.
### NFD-Worker
NFD-Worker is preferably run as a Kubernetes DaemonSet. This ensures
re-labeling at regular intervals, capturing changes in the system configuration,
and makes sure that new nodes are labeled as they are added to the cluster.
Worker connects to the nfd-master service to advertise hardware features.
When run as a daemonset, nodes are re-labeled at an interval specified using
the `--sleep-interval` option. In the
[template](https://github.com/kubernetes-sigs/node-feature-discovery/blob/master/nfd-worker-daemonset.yaml.template#L26)
the default interval is set to 60s which is also the default when no
`--sleep-interval` is specified. Also, the configuration file is re-read on
each iteration providing a simple mechanism of run-time reconfiguration.
### TLS authentication
NFD supports mutual TLS authentication between the nfd-master and nfd-worker
instances. That is, nfd-worker and nfd-master both verify that the other end
presents a valid certificate.
TLS authentication is enabled by specifying `--ca-file`, `--key-file` and
`--cert-file` args, on both the nfd-master and nfd-worker instances.
The template specs provided with NFD contain (commented out) example
configuration for enabling TLS authentication.
The Common Name (CN) of the nfd-master certificate must match the DNS name of
the nfd-master Service of the cluster. By default, nfd-master only checks that
the nfd-worker certificate has been signed by the specified root certificate
(`--ca-file`). Additional hardening can be enabled by specifying
`--verify-node-name` in nfd-master args, in which case nfd-master verifies that
the NodeName presented by nfd-worker matches the Common Name (CN) of its
certificate. This means that each nfd-worker requires an individual
node-specific TLS certificate.
## Configuration
NFD-Worker supports a configuration file. The default location is
`/etc/kubernetes/node-feature-discovery/nfd-worker.conf`, but
this can be changed by specifying the `--config` command line flag.
The configuration file is re-read on each labeling pass (determined by
`--sleep-interval`), which makes run-time re-configuration of nfd-worker
possible.
Worker configuration file is read inside the container, and thus, Volumes and
VolumeMounts are needed to make your configuration available for NFD. The
preferred method is to use a ConfigMap which provides easy deployment and
re-configurability. For example, create a config map using the example config
as a template:
```bash
cp nfd-worker.conf.example nfd-worker.conf
vim nfd-worker.conf # edit the configuration
kubectl create configmap nfd-worker-config --from-file=nfd-worker.conf
```
Then, configure Volumes and VolumeMounts in the Pod spec (just the relevant
snippets shown below):
```yaml
...
  containers:
      volumeMounts:
        - name: nfd-worker-config
          mountPath: "/etc/kubernetes/node-feature-discovery/"
...
  volumes:
    - name: nfd-worker-config
      configMap:
        name: nfd-worker-config
...
```
You could also use other types of volumes, of course. For example, a hostPath
volume could be used if different nodes require different configuration.
The (empty-by-default)
[example config](https://github.com/kubernetes-sigs/node-feature-discovery/blob/master/nfd-worker.conf.example)
is used as a config in the NFD Docker image. Thus, this can be used as a default
configuration in custom-built images.
Configuration options can also be specified via the `--options` command line
flag, in which case no mounts need to be used. The same format as in the config
file must be used, i.e. JSON (or YAML). For example:
```
--options='{"sources": { "pci": { "deviceClassWhitelist": ["12"] } } }'
```
Configuration options specified from the command line will override those read
from the config file.
Currently, the only available configuration options are related to the
[CPU](#cpu-features), [PCI](#pci-features) and [Kernel](#kernel-features)
feature sources.
## Using Node Labels
Nodes with specific features can be targeted using the `nodeSelector` field. The
following example shows how to target nodes with Intel TurboBoost enabled.
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    env: test
  name: golang-test
spec:
  containers:
    - image: golang
      name: go1
  nodeSelector:
    feature.node.kubernetes.io/cpu-pstate.turbo: 'true'
```
For more details on targeting nodes, see
[node selection](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/).
## Uninstallation
### Operator Was Used for Deployment
If you followed the deployment instructions above you can simply do:
```bash
kubectl -n nfd delete NodeFeatureDiscovery my-nfd-deployment
```
Optionally, you can also remove the namespace:
```bash
kubectl delete ns nfd
```
See the [node-feature-discovery-operator][nfd-operator] and [OLM][OLM] project
documentation for instructions for uninstalling the operator and operator
lifecycle manager, respectively.
### Manual
```bash
NFD_NS=node-feature-discovery
kubectl -n $NFD_NS delete ds nfd-worker
kubectl -n $NFD_NS delete deploy nfd-master
kubectl -n $NFD_NS delete svc nfd-master
kubectl -n $NFD_NS delete sa nfd-master
kubectl delete clusterrole nfd-master
kubectl delete clusterrolebinding nfd-master
```
### Removing Feature Labels
NFD-Master has a special `--prune` command line flag for removing all
nfd-related node labels, annotations and extended resources from the cluster.
```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/master/nfd-prune.yaml.template
kubectl -n node-feature-discovery wait job.batch/nfd-prune --for=condition=complete && \
kubectl -n node-feature-discovery delete job/nfd-prune
```
**NOTE:** You must run prune before removing the RBAC rules (serviceaccount,
clusterrole and clusterrolebinding).
<!-- Links -->
[nfd-operator]: https://github.com/kubernetes-sigs/node-feature-discovery-operator
[OLM]: https://github.com/operator-framework/operator-lifecycle-manager


@ -0,0 +1,30 @@
---
title: "Examples and Demos"
layout: default
sort: 5
---
# Examples And Demos
{: .no_toc }
## Table of Contents
{: .no_toc .text-delta }
1. TOC
{:toc}
---
This page contains usage examples and demos.
## Demos
### Usage demo
[![asciicast](https://asciinema.org/a/247316.svg)](https://asciinema.org/a/247316)
### Demo Use Case
A demo on the benefits of using node feature discovery can be found in the
source code repository under
[demo/](https://github.com/kubernetes-sigs/node-feature-discovery/tree/master/demo).


@ -0,0 +1,631 @@
---
title: "Feature Discovery"
layout: default
sort: 4
---
# Feature Discovery
{: .no_toc }
## Table of Contents
{: .no_toc .text-delta }
1. TOC
{:toc}
---
Feature discovery in nfd-worker is performed by a set of separate modules
called feature sources. Most of them are responsible for a specific
domain of features (e.g. cpu). In addition, there are two highly customizable
feature sources that work across the system.
## Feature labels
Each discovered feature is advertised as a label in the Kubernetes Node object.
The published node labels encode a few pieces of information:
- The namespace (all built-in labels use `feature.node.kubernetes.io`).
- The source for each label (e.g. `cpu`).
- The name of the discovered feature as it appears in the underlying
source (e.g. `cpuid.AESNI` from cpu).
- The value of the discovered feature.
Feature label names adhere to the following pattern:
```
<namespace>/<source name>-<feature name>[.<attribute name>]
```
The last component (i.e. `<attribute name>`) is optional, and only used if a
feature logically has a sub-hierarchy, e.g. `sriov.capable` and
`sriov.configured` from the `network` source.
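As a rough illustration of the naming pattern, the sketch below assembles a label name from its parts. The `nfd_label_name` helper is hypothetical and not part of NFD; it only mirrors the pattern described above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of NFD): builds a label name following
# <namespace>/<source name>-<feature name>[.<attribute name>]
nfd_label_name() {
    local src="$1" feature="$2" attribute="${3:-}"
    local ns="feature.node.kubernetes.io"   # namespace of all built-in labels
    local name="${ns}/${src}-${feature}"
    # The attribute component is optional.
    if [ -n "$attribute" ]; then
        name="${name}.${attribute}"
    fi
    printf '%s\n' "$name"
}

nfd_label_name cpu cpuid AESNI         # source, feature and attribute
nfd_label_name network sriov capable
nfd_label_name memory numa             # no attribute component
```

The first call yields the `cpu-cpuid.AESNI` label name that also appears in the quick-start output.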
The `--sources` flag controls which sources to use for discovery.
*Note: Consecutive runs of nfd-worker will update the labels on a
given node. If a feature is not discovered on a subsequent run, the corresponding
label will be removed. This also applies to any restrictions placed on that run,
such as restricting discovered features with the `--label-whitelist` option.*
## Feature Sources
### CPU
The **cpu** feature source supports the following labels:
| Feature name | Attribute | Description |
| ----------------------- | ------------------ | ----------------------------- |
| cpuid | &lt;cpuid flag&gt; | CPU capability is supported
| hardware_multithreading | | Hardware multithreading, such as Intel HTT, enabled (number of logical CPUs is greater than physical CPUs)
| power | sst_bf.enabled | Intel SST-BF ([Intel Speed Select Technology][intel-sst] - Base frequency) enabled
| [pstate][intel-pstate] | turbo | Set to 'true' if turbo frequencies are enabled in Intel pstate driver, set to 'false' if they have been disabled.
| [rdt][intel-rdt] | RDTMON | Intel RDT Monitoring Technology
| | RDTCMT | Intel Cache Monitoring (CMT)
| | RDTMBM | Intel Memory Bandwidth Monitoring (MBM)
| | RDTL3CA | Intel L3 Cache Allocation Technology
| | RDTL2CA | Intel L2 Cache Allocation Technology
| | RDTMBA | Intel Memory Bandwidth Allocation (MBA) Technology
The (sub-)set of CPUID attributes to publish is configurable via the
`attributeBlacklist` and `attributeWhitelist` cpuid options of the cpu source.
If a whitelist is specified, only whitelisted attributes will be published. With
a blacklist, only the blacklisted attributes are filtered out. `attributeWhitelist`
has priority over `attributeBlacklist`. For examples and more information
about configurability, see [Configuration Options](#configuration-options).
By default, the following CPUID flags have been blacklisted:
BMI1, BMI2, CLMUL, CMOV, CX16, ERMS, F16C, HTT, LZCNT, MMX, MMXEXT, NX, POPCNT,
RDRAND, RDSEED, RDTSCP, SGX, SSE, SSE2, SSE3, SSE4.1, SSE4.2 and SSSE3.
**NOTE** The cpuid features advertise *supported* CPU capabilities, that is, a
capability might be supported but not enabled.
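The whitelist/blacklist semantics can be sketched as a small filter. This is an illustration only, not NFD's implementation: the `filter_cpuid` function is hypothetical, and it assumes that "priority" means the blacklist is ignored whenever a whitelist is given.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the filtering semantics (not NFD code).
# Usage: filter_cpuid "<whitelist>" "<blacklist>" FLAG...
# Both lists are space-separated; an empty whitelist means "not specified".
filter_cpuid() {
    local whitelist="$1" blacklist="$2" flag
    shift 2
    for flag in "$@"; do
        if [ -n "$whitelist" ]; then
            # Assumption: a whitelist takes priority, so the blacklist is ignored.
            case " $whitelist " in
                *" $flag "*) printf '%s\n' "$flag" ;;
            esac
        else
            case " $blacklist " in
                *" $flag "*) ;;                    # blacklisted: filtered out
                *) printf '%s\n' "$flag" ;;
            esac
        fi
    done
}

# Blacklist only: SSE is dropped, AVX and AVX2 are published.
filter_cpuid "" "SSE SSE2" AVX SSE AVX2
# Whitelist given: only AVX and SSE are published, the blacklist has no effect.
filter_cpuid "AVX SSE" "SSE" AVX SSE AVX2
```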
#### X86 CPUID Attributes (Partial List)
| Attribute | Description |
| --------- | ---------------------------------------------------------------- |
| ADX | Multi-Precision Add-Carry Instruction Extensions (ADX)
| AESNI | Advanced Encryption Standard (AES) New Instructions (AES-NI)
| AVX | Advanced Vector Extensions (AVX)
| AVX2 | Advanced Vector Extensions 2 (AVX2)
#### Arm CPUID Attributes (Partial List)
| Attribute | Description |
| --------- | ---------------------------------------------------------------- |
| IDIVA | Integer divide instructions available in ARM mode
| IDIVT | Integer divide instructions available in Thumb mode
| THUMB | Thumb instructions
| FASTMUL | Fast multiplication
| VFP | Vector floating point instruction extension (VFP)
| VFPv3 | Vector floating point extension v3
| VFPv4 | Vector floating point extension v4
| VFPD32 | VFP with 32 D-registers
| HALF | Half-word loads and stores
| EDSP | DSP extensions
| NEON | NEON SIMD instructions
| LPAE | Large Physical Address Extensions
#### Arm64 CPUID Attributes (Partial List)
| Attribute | Description |
| --------- | ---------------------------------------------------------------- |
| AES | Announcing the Advanced Encryption Standard
| EVSTRM | Event Stream Frequency Features
| FPHP | Half Precision(16bit) Floating Point Data Processing Instructions
| ASIMDHP | Half Precision(16bit) Asimd Data Processing Instructions
| ATOMICS | Atomic Instructions to the A64
| ASIMRDM | Support for Rounding Double Multiply Add/Subtract
| PMULL | Optional Cryptographic and CRC32 Instructions
| JSCVT | Perform Conversion to Match Javascript
| DCPOP | Persistent Memory Support
### Custom
The Custom feature source allows the user to define features based on a mix of
predefined rules. Each rule is given an input which determines how it matches
a defined feature.
To aid in making Custom Features clearer, we define a general and a per rule
nomenclature, keeping things as consistent as possible.
#### General Nomenclature & Definitions
```
Rule :Represents a matching logic that is used to match on a feature.
Rule Input :The input a Rule is provided. This determines how a Rule performs the match operation.
Matcher :A composition of Rules, each Matcher may be composed of at most one instance of each Rule.
```
#### Custom Features Format (using the Nomenclature defined above)
```yaml
- name: <feature name>
  matchOn:
  - <Rule-1>: <Rule-1 Input>
    [<Rule-2>: <Rule-2 Input>]
  - <Matcher-2>
  - ...
  - ...
  - <Matcher-N>
- <custom feature 2>
- ...
- ...
- <custom feature M>
```
#### Matching process
Specifying Rules to match on a feature is done by providing a list of Matchers.
Each Matcher contains one or more Rules.
Logical _OR_ is performed between Matchers and logical _AND_ is performed
between Rules of a given Matcher.
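The OR/AND combination can be sketched as follows. This is a hypothetical illustration, not NFD code: each Rule is represented by a command name that succeeds or fails, and `--` is an assumed separator between Matchers.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: returns success if any Matcher matches (OR), where a
# Matcher matches only if all of its Rules succeed (AND).
# Usage: feature_matches RULE... [-- RULE...]...
feature_matches() {
    local ok=1 n=0 rule
    for rule in "$@" --; do
        if [ "$rule" = "--" ]; then
            # End of a Matcher: it matched if it had Rules and none failed.
            if [ "$n" -gt 0 ] && [ "$ok" -eq 1 ]; then
                return 0
            fi
            ok=1
            n=0
        else
            n=$((n + 1))
            "$rule" || ok=0     # one failed Rule fails the whole Matcher
        fi
    done
    return 1
}

# One Matcher with two Rules: both must succeed.
if feature_matches true false; then echo "match"; else echo "no match"; fi
# Two Matchers with one Rule each: one success is enough.
if feature_matches true -- false; then echo "match"; else echo "no match"; fi
```

Here the shell builtins `true` and `false` stand in for real Rules.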
#### Rules
##### PciId Rule
###### Nomenclature
```
Attribute :A PCI attribute.
Element :An identifier of the PCI attribute.
```
The PciId Rule allows matching the PCI devices in the system on the following
Attributes: `class`, `vendor` and `device`. A list of Elements is provided for
each Attribute.
###### Format
```yaml
pciId :
  class: [<class id>, ...]
  vendor: [<vendor id>, ...]
  device: [<device id>, ...]
```
Matching is done by performing a logical _OR_ between Elements of an Attribute
and logical _AND_ between the specified Attributes for each PCI device in the
system. At least one Attribute must be specified. Missing attributes will not
partake in the matching process.
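The matching logic can be approximated with a shell function that walks the PCI devices exposed in sysfs. This is a sketch under assumptions, not NFD's implementation: the `pci_id_match` name is hypothetical, only the `vendor` and `device` Attributes are handled, and the IDs are read from the `vendor` and `device` files under `/sys/bus/pci/devices/`.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of PciId matching on vendor and device IDs.
# Usage: pci_id_match "<vendor ids>" "<device ids>" [sysfs dir]
# ID lists are space-separated; an empty list means "Attribute not specified".
pci_id_match() {
    local vendors="$1" devices="$2" root="${3:-/sys/bus/pci/devices}"
    local dev vendor device
    # At least one Attribute must be specified.
    [ -n "${vendors}${devices}" ] || return 2
    for dev in "$root"/*/; do
        [ -r "${dev}vendor" ] && [ -r "${dev}device" ] || continue
        vendor=$(cat "${dev}vendor"); vendor=${vendor#0x}
        device=$(cat "${dev}device"); device=${device#0x}
        # OR between the Elements of an Attribute; empty Attributes are skipped.
        if [ -n "$vendors" ]; then
            case " $vendors " in *" $vendor "*) ;; *) continue ;; esac
        fi
        if [ -n "$devices" ]; then
            case " $devices " in *" $device "*) ;; *) continue ;; esac
        fi
        return 0    # AND between Attributes: all specified ones matched
    done
    return 1
}

if pci_id_match "15b3" "1014 1017"; then
    echo "found a matching PCI device"
fi
```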
##### UsbId Rule
###### Nomenclature
```
Attribute :A USB attribute.
Element :An identifier of the USB attribute.
```
The UsbId Rule allows matching the USB devices in the system on the following
Attributes: `class`, `vendor` and `device`. A list of Elements is provided for
each Attribute.
###### Format
```yaml
usbId :
  class: [<class id>, ...]
  vendor: [<vendor id>, ...]
  device: [<device id>, ...]
```
Matching is done by performing a logical _OR_ between Elements of an Attribute
and logical _AND_ between the specified Attributes for each USB device in the
system. At least one Attribute must be specified. Missing attributes will not
partake in the matching process.
##### LoadedKMod Rule
###### Nomenclature
```
Element :A kernel module
```
The LoadedKMod Rule allows matching the loaded kernel modules in the system
against a provided list of Elements.
###### Format
```yaml
loadedKMod : [<kernel module>, ...]
```
Matching is done by performing logical _AND_ for each provided Element, i.e.
the Rule will match if all provided Elements (kernel modules) are loaded in the
system.
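A rough equivalent of this check can be written against `/proc/modules`, where the first field of each line is the module name. The `kmods_loaded` function and the `MODULES_FILE` override are hypothetical names used for illustration; NFD's own implementation may differ.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: succeeds only if every listed module is loaded.
# Usage: kmods_loaded MODULE...
# MODULES_FILE may point to an alternative file (useful for testing).
kmods_loaded() {
    local modules_file="${MODULES_FILE:-/proc/modules}" mod
    for mod in "$@"; do
        # Exit status of awk is 0 only if a line starts with the module name.
        awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' \
            "$modules_file" || return 1
    done
    return 0
}

if kmods_loaded kmod1 kmod2; then
    echo "kmod1 and kmod2 are loaded"
fi
```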
##### CpuId Rule
###### Nomenclature
```
Element :A CPUID flag
```
The Rule allows matching the available CPUID flags in the system against a
provided list of Elements.
###### Format
```yaml
cpuId : [<CPUID flag string>, ...]
```
Matching is done by performing logical _AND_ for each provided Element, i.e. the
Rule will match if all provided Elements (CPUID flag strings) are available in
the system.
##### Kconfig Rule
###### Nomenclature
```
Element :A Kconfig option
```
The Rule allows matching the kconfig options in the system against a provided
list of Elements.
###### Format
```yaml
kConfig: [<kernel config option ('y' or 'm') or '=<value>'>, ...]
```
Matching is done by performing logical _AND_ for each provided Element, i.e. the
Rule will match if all provided Elements (kernel config options) are enabled
(`y` or `m`) or match the given `=<value>` in the kernel.
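The check can be sketched against a plain-text kernel config file. This is an assumption-laden illustration rather than NFD's implementation: `kconfig_match` is a hypothetical helper, it expects the usual `CONFIG_<NAME>=<value>` line format, and it does not handle quoted string values.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of Kconfig matching.
# Usage: kconfig_match <config file> ELEMENT...
# An ELEMENT is either NAME (must be 'y' or 'm') or NAME=<value>.
kconfig_match() {
    local config="$1" elem name want have
    shift
    [ -r "$config" ] || return 2
    for elem in "$@"; do
        name=${elem%%=*}
        # Kernel config files prefix option names with CONFIG_.
        have=$(sed -n "s/^CONFIG_${name}=//p" "$config" | head -n 1)
        case "$elem" in
            *=*)
                want=${elem#*=}
                [ "$have" = "$want" ] || return 1
                ;;
            *)
                [ "$have" = "y" ] || [ "$have" = "m" ] || return 1
                ;;
        esac
    done
    return 0
}
```

A call could look like `kconfig_match "/boot/config-$(uname -r)" KVM_INTEL`, assuming the distribution ships the kernel config at that path.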
#### Example
```yaml
custom:
  - name: "my.kernel.feature"
    matchOn:
      - loadedKMod: ["kmod1", "kmod2"]
  - name: "my.pci.feature"
    matchOn:
      - pciId:
          vendor: ["15b3"]
          device: ["1014", "1017"]
  - name: "my.usb.feature"
    matchOn:
      - usbId:
          vendor: ["1d6b"]
          device: ["0003"]
  - name: "my.combined.feature"
    matchOn:
      - loadedKMod : ["vendor_kmod1", "vendor_kmod2"]
        pciId:
          vendor: ["15b3"]
          device: ["1014", "1017"]
  - name: "my.accumulated.feature"
    matchOn:
      - loadedKMod : ["some_kmod1", "some_kmod2"]
      - pciId:
          vendor: ["15b3"]
          device: ["1014", "1017"]
  - name: "my.kernel.featureneedscpu"
    matchOn:
      - kConfig: ["KVM_INTEL"]
        cpuId: ["VMX"]
  - name: "my.kernel.modulecompiler"
    matchOn:
      - kConfig: ["GCC_VERSION=100101"]
        loadedKMod: ["kmod1"]
```
__In the example above:__
- A node would contain the label:
`feature.node.kubernetes.io/custom-my.kernel.feature=true` if the node has
`kmod1` _AND_ `kmod2` kernel modules loaded.
- A node would contain the label:
`feature.node.kubernetes.io/custom-my.pci.feature=true` if the node contains
a PCI device with a PCI vendor ID of `15b3` _AND_ PCI device ID of `1014` _OR_
`1017`.
- A node would contain the label:
`feature.node.kubernetes.io/custom-my.usb.feature=true` if the node contains
a USB device with a USB vendor ID of `1d6b` _AND_ USB device ID of `0003`.
- A node would contain the label:
`feature.node.kubernetes.io/custom-my.combined.feature=true` if
`vendor_kmod1` _AND_ `vendor_kmod2` kernel modules are loaded __AND__ the node
contains a PCI device with a PCI vendor ID of `15b3` _AND_ PCI device ID of
`1014` _OR_ `1017`.
- A node would contain the label:
`feature.node.kubernetes.io/custom-my.accumulated.feature=true` if
`some_kmod1` _AND_ `some_kmod2` kernel modules are loaded __OR__ the node
contains a PCI device
with a PCI vendor ID of `15b3` _AND_ PCI device ID of `1014` _OR_ `1017`.
- A node would contain the label:
`feature.node.kubernetes.io/custom-my.kernel.featureneedscpu=true` if
`KVM_INTEL` kernel config is enabled __AND__ the node CPU supports `VMX`
virtual machine extensions.
- A node would contain the label:
`feature.node.kubernetes.io/custom-my.kernel.modulecompiler=true` if the
in-tree `kmod1` kernel module is loaded __AND__ it's built with
`GCC_VERSION=100101`.
#### Statically defined features
Some feature labels which are common and generic are defined statically in the
`custom` feature source. A user may add additional Matchers to these feature
labels by defining them in the `nfd-worker` configuration file.
| Feature | Attribute | Description |
| ------- | --------- | -----------|
| rdma | capable | The node has an RDMA capable Network adapter |
| rdma | enabled | The node has the needed RDMA modules loaded to run RDMA traffic |
### IOMMU
The **iommu** feature source supports the following labels:
| Feature name | Description |
| :------------: | :---------------------------------------------------------: |
| enabled | IOMMU is present and enabled in the kernel
### Kernel
The **kernel** feature source supports the following labels:
| Feature | Attribute | Description |
| ------- | ------------------- | -------------------------------------------- |
| config | &lt;option name&gt; | Kernel config option is enabled (set 'y' or 'm').<br> Default options are `NO_HZ`, `NO_HZ_IDLE`, `NO_HZ_FULL` and `PREEMPT`
| selinux | enabled | SELinux is enabled on the node
| version | full | Full kernel version as reported by `/proc/sys/kernel/osrelease` (e.g. '4.5.6-7-g123abcde')
| | major | First component of the kernel version (e.g. '4')
| | minor | Second component of the kernel version (e.g. '5')
| | revision | Third component of the kernel version (e.g. '6')
The kernel config file to use and the set of config options to detect are
configurable.
See [configuration options](#configuration-options) for more information.
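To make the `version` attributes concrete, the sketch below splits a kernel release string into the components listed in the table. The `kernel_version_labels` function is hypothetical and only approximates the behaviour described above; it reads `/proc/sys/kernel/osrelease` when no argument is given.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: derives the kernel version labels from a release string.
# Usage: kernel_version_labels [<release string>]
kernel_version_labels() {
    local full="${1:-$(cat /proc/sys/kernel/osrelease)}"
    local prefix="feature.node.kubernetes.io/kernel-version"
    local numeric major minor revision rest
    # Keep only the leading dotted number, e.g. "4.5.6" of "4.5.6-7-g123abcde".
    numeric=${full%%[!0-9.]*}
    IFS=. read -r major minor revision rest <<< "$numeric"
    printf '%s.full=%s\n' "$prefix" "$full"
    printf '%s.major=%s\n' "$prefix" "$major"
    printf '%s.minor=%s\n' "$prefix" "$minor"
    printf '%s.revision=%s\n' "$prefix" "$revision"
}

kernel_version_labels "4.5.6-7-g123abcde"
```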
### Memory
The **memory** feature source supports the following labels:
| Feature | Attribute | Description |
| ------- | --------- | ------------------------------------------------------ |
| numa | | Multiple memory nodes i.e. NUMA architecture detected
| nv | present | NVDIMM device(s) are present
| nv | dax | NVDIMM region(s) configured in DAX mode are present
### Network
The **network** feature source supports the following labels:
| Feature | Attribute | Description |
| ------- | ---------- | ----------------------------------------------------- |
| sriov | capable | [Single Root Input/Output Virtualization][sriov] (SR-IOV) enabled Network Interface Card(s) present
| | configured | SR-IOV virtual functions have been configured
### PCI
The **pci** feature source supports the following labels:
| Feature | Attribute | Description |
| -------------------- | ------------- | ------------------------------------- |
| &lt;device label&gt; | present | PCI device is detected
| &lt;device label&gt; | sriov.capable | [Single Root Input/Output Virtualization][sriov] (SR-IOV) enabled PCI device present
`<device label>` is composed of raw PCI IDs, separated by underscores. The set
of fields used in `<device label>` is configurable, valid fields being `class`,
`vendor`, `device`, `subsystem_vendor` and `subsystem_device`. Defaults are
`class` and `vendor`. An example label using the default label fields:
```
feature.node.kubernetes.io/pci-1200_8086.present=true
```
The set of PCI device classes that the feature source detects is also
configurable. By default, device classes (0x)03, (0x)0b40 and (0x)12, i.e.
GPUs, co-processors and accelerator cards, are detected.
### USB
The **usb** feature source supports the following labels:
| Feature | Attribute | Description |
| -------------------- | ------------- | ------------------------------------- |
| &lt;device label&gt; | present | USB device is detected
`<device label>` is composed of raw USB IDs, separated by underscores. The set
of fields used in `<device label>` is configurable, valid fields being `class`,
`vendor`, and `device`. Defaults are `class`, `vendor` and `device`. An
example label using the default label fields:
```
feature.node.kubernetes.io/usb-fe_1a6e_089a.present=true
```
See [configuration options](#configuration-options) for more information on NFD
config.
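The way PCI and USB device labels are composed can be illustrated with a small helper that joins the raw IDs with underscores. The `device_label` function is hypothetical; it takes the already extracted ID fields as arguments and does not read them from the system.

```bash
#!/usr/bin/env bash
# Hypothetical helper: composes a "<device label>.present" label from raw IDs.
# Usage: device_label <pci|usb> <id>...
device_label() {
    local src="$1"
    shift
    local IFS=_     # "$*" joins the remaining arguments with the first IFS character
    printf 'feature.node.kubernetes.io/%s-%s.present=true\n' "$src" "$*"
}

device_label pci 1200 8086        # class, vendor (PCI default fields)
device_label usb fe 1a6e 089a     # class, vendor, device (USB default fields)
```

The two calls reproduce the example labels shown for the pci and usb sources.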
### Storage
The **storage** feature source supports the following labels:
| Feature name | Description |
| ------------------ | ------------------------------------------------------- |
| nonrotationaldisk | Non-rotational disk, like SSD, is present in the node
### System
The **system** feature source supports the following labels:
| Feature | Attribute | Description |
| ----------- | ---------------- | --------------------------------------------|
| os_release | ID | Operating system identifier
| | VERSION_ID | Operating system version identifier (e.g. '6.7')
| | VERSION_ID.major | First component of the OS version id (e.g. '6')
| | VERSION_ID.minor | Second component of the OS version id (e.g. '7')
### Local -- User-specific Features
NFD has a special feature source named *local* which is designed for getting
labels from user-specific feature detectors. It provides a mechanism for
users to implement custom feature sources in a pluggable way, without modifying
the NFD source code or Docker images. The local feature source can be used to
advertise new user-specific features and to override labels created by the
other feature sources.
The *local* feature source gets its labels in two different ways:
- It tries to execute files found under the
`/etc/kubernetes/node-feature-discovery/source.d/` directory. The hook files
must be executable and they are expected to print all discovered features to
`stdout`, one per line. With ELF binaries, static linking is recommended as
the selection of system libraries available in the NFD release image is very
limited. Other runtimes currently supported by the NFD stock image are bash
and perl.
- It reads files found under the
`/etc/kubernetes/node-feature-discovery/features.d/` directory. The file
content is expected to be similar to the hook output (described above).
These directories must be available inside the container, so Volumes and
VolumeMounts must be used if the standard NFD images are used. By default, the
provided template files mount the `source.d` and `features.d` directories
from `/etc/kubernetes/node-feature-discovery/source.d/` and
`/etc/kubernetes/node-feature-discovery/features.d/` on the host, respectively.
You should update them to match your needs.
In both cases, the labels can be binary or non-binary, using either the `<name>` or
`<name>=<value>` format.
Unlike with the other feature sources, the name of the file, rather than the name
of the feature source (which would be `local` in this case), is normally used as a
prefix in the label name. However, if the `<name>` of the label starts with a
slash (`/`) it is used as the label name as is, without any additional prefix.
This makes it possible for the user to fully control the feature label names,
e.g. for overriding labels created by other feature sources.
You can also override the default namespace of your labels using this format:
`<namespace>/<name>[=<value>]`. You must whitelist your namespace using the
`--extra-label-ns` option on the master. In this case, the name of the
file will not be added to the label name. For example, if you want to add the
label `my.namespace.org/my-label=value`, your hook output or file must contain
`my.namespace.org/my-label=value` and you must add
`--extra-label-ns=my.namespace.org` on the master command line.
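The naming rules above can be condensed into a short function. This is a hypothetical sketch of the mapping from one line of hook output (or feature file content) to a node label, not NFD's actual code; `local_label` is an illustrative name.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the local source naming rules.
# Usage: local_label <file name> <line>
local_label() {
    local file="$1" line="$2" name value
    case "$line" in
        *=*) name=${line%%=*}; value=${line#*=} ;;
        *)   name=$line; value=true ;;      # binary label
    esac
    case "$name" in
        # Leading slash: used as is, without the file name prefix.
        /*)  name="feature.node.kubernetes.io/${name#/}" ;;
        # Custom namespace: used as is (must be allowed via --extra-label-ns).
        */*) ;;
        # Default: the file name is used as the prefix.
        *)   name="feature.node.kubernetes.io/${file}-${name}" ;;
    esac
    printf '%s=%s\n' "$name" "$value"
}

local_label my-source "MY_FEATURE_1"
local_label my-source "/override_source-OVERRIDE_VALUE=123"
local_label my-source "my.namespace.org/my-label=value"
```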
`stderr` output of the hooks is propagated to NFD log so it can be used for
debugging and logging.
#### Injecting Labels from Other Pods
One use case for the hooks and/or feature files is detecting features in other
Pods outside NFD, e.g. in Kubernetes device plugins. It is possible to mount
the `source.d` and/or `features.d` directories common with the NFD Pod and
deploy the custom hooks/features there. NFD will periodically scan the
directories and run any hooks and read any feature files it finds. The
[example nfd-worker deployment template](https://github.com/kubernetes-sigs/node-feature-discovery/blob/master/nfd-worker-daemonset.yaml.template#L69)
contains `hostPath` mounts for the `source.d` and `features.d` directories. By
using the same mounts in the secondary Pod (e.g. device plugin) you have
created a shared area for delivering hooks and feature files to NFD.
#### A Hook Example
User has a shell script
`/etc/kubernetes/node-feature-discovery/source.d/my-source` which has the
following `stdout` output:
```
MY_FEATURE_1
MY_FEATURE_2=myvalue
/override_source-OVERRIDE_BOOL
/override_source-OVERRIDE_VALUE=123
override.namespace/value=456
```
which, in turn, will translate into the following node labels:
```
feature.node.kubernetes.io/my-source-MY_FEATURE_1=true
feature.node.kubernetes.io/my-source-MY_FEATURE_2=myvalue
feature.node.kubernetes.io/override_source-OVERRIDE_BOOL=true
feature.node.kubernetes.io/override_source-OVERRIDE_VALUE=123
override.namespace/value=456
```
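A hook producing the output above could look roughly like the following. It is a minimal sketch with placeholder feature names taken from the example; a real hook would run its own detection logic, and the file must be executable.

```bash
#!/usr/bin/env bash
# Sketch of a hook such as source.d/my-source (placeholder features only).
my_source_hook() {
    # Diagnostics go to stderr, which NFD propagates to its log.
    echo "my-source: reporting features" >&2
    echo "MY_FEATURE_1"                          # binary label
    echo "MY_FEATURE_2=myvalue"                  # label with a value
    echo "/override_source-OVERRIDE_BOOL"        # leading slash: no file name prefix
    echo "/override_source-OVERRIDE_VALUE=123"
    echo "override.namespace/value=456"          # custom namespace
}

my_source_hook
```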
#### A File Example
User has a file `/etc/kubernetes/node-feature-discovery/features.d/my-source`
which contains the following lines:
```
MY_FEATURE_1
MY_FEATURE_2=myvalue
/override_source-OVERRIDE_BOOL
/override_source-OVERRIDE_VALUE=123
override.namespace/value=456
```
which, in turn, will translate into the following node labels:
```
feature.node.kubernetes.io/my-source-MY_FEATURE_1=true
feature.node.kubernetes.io/my-source-MY_FEATURE_2=myvalue
feature.node.kubernetes.io/override_source-OVERRIDE_BOOL=true
feature.node.kubernetes.io/override_source-OVERRIDE_VALUE=123
override.namespace/value=456
```
NFD tries to run any regular files found in the hooks directory. Any
additional data files your hook might need (e.g. a configuration file) should
be placed in a separate directory in order to avoid NFD unnecessarily trying to
execute these. You can use a subdirectory under the hooks directory, for
example `/etc/kubernetes/node-feature-discovery/source.d/conf/`.
**NOTE!** NFD will blindly run any executables placed/mounted in the hooks
directory. It is the user's responsibility to review the hooks for e.g.
possible security implications.
**NOTE!** Be careful when creating and/or updating hook or feature files while
NFD is running. In order to avoid race conditions, you should write into a
temporary file (outside the `source.d` and `features.d` directories) and then
atomically create/update the original file by doing a filesystem move
operation.
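The write-then-move pattern can be sketched as below. The `publish_features` helper is hypothetical; it assumes the staging directory (by default the parent of the target's directory) is on the same filesystem as the target, since a move is only atomic within one filesystem.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: atomically publish feature file content read from stdin.
# Usage: <detector command> | publish_features <target file> [staging dir]
publish_features() {
    local target="$1"
    # Default staging dir: the parent of features.d (or source.d), so the
    # temporary file never appears inside the scanned directory.
    local staging="${2:-$(dirname "$(dirname "$target")")}"
    local tmp
    tmp=$(mktemp "${staging}/.nfd-feature.XXXXXX") || return 1
    if cat > "$tmp" && chmod 0644 "$tmp" && mv -f "$tmp" "$target"; then
        return 0
    fi
    rm -f "$tmp"    # clean up on failure
    return 1
}
```

For example, `printf 'MY_FEATURE_1\n' | publish_features /etc/kubernetes/node-feature-discovery/features.d/my-source` would replace the feature file in a single move.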
## Extended resources
This feature is experimental and by no means a replacement for the usage of
device plugins.
Labels which have integer values can be promoted to Kubernetes extended
resources by listing them in the master `--resource-labels` command line flag.
These labels then won't show in the node label section; they will appear only
as extended resources.
An example use-case for the extended resources could be based on a hook which
creates a label for the node SGX EPC memory section size. By giving the name of
that label in the `--resource-labels` flag, that value will then turn into an
extended resource of the node, allowing Pods to request that resource and the
Kubernetes scheduler to schedule such Pods to only those nodes which have a
sufficient capacity of said resource left.
Similar to labels, the default namespace `feature.node.kubernetes.io` is
automatically prefixed to the extended resource, if the promoted label doesn't
have a namespace.
Example usage of the command line arguments, using a new namespace:
`nfd-master --resource-labels=my_source-my.feature,sgx.some.ns/epc --extra-label-ns=sgx.some.ns`
The above would result in the following extended resources, provided that the
related labels exist:
```
sgx.some.ns/epc: <label value>
feature.node.kubernetes.io/my_source-my.feature: <label value>
```
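A Pod could then request such a resource through its container resource limits. The sketch below only prints a manifest; the `epc_pod_manifest` helper, the Pod name and the requested amount are hypothetical, and the resource name is the one from the example above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints a Pod manifest that requests <amount> units of
# the sgx.some.ns/epc extended resource.
# Usage: epc_pod_manifest [<amount>]
epc_pod_manifest() {
    local amount="${1:-1}"
    cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: epc-consumer
spec:
  containers:
  - name: pause
    image: k8s.gcr.io/pause
    resources:
      limits:
        sgx.some.ns/epc: ${amount}
EOF
}
```

The output can be piped to the cluster, e.g. `epc_pod_manifest 2 | kubectl apply -f -`.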
<!-- Links -->
[intel-rdt]: http://www.intel.com/content/www/us/en/architecture-and-technology/resource-director-technology.html
[intel-pstate]: https://www.kernel.org/doc/Documentation/cpu-freq/intel-pstate.txt
[intel-sst]: https://www.intel.com/content/www/us/en/architecture-and-technology/speed-select-technology-article.html
[sriov]: http://www.intel.com/content/www/us/en/pci-express/pci-sig-sr-iov-primer-sr-iov-technology-paper.html

44
docs/get-started/index.md Normal file

@ -0,0 +1,44 @@
---
title: "Get started"
layout: default
sort: 1
---
# Node Feature Discovery
Welcome to Node Feature Discovery -- a Kubernetes add-on for detecting hardware
features and system configuration!
Continue to:
- **[Introduction](get-started/introduction.md)** for more details on the
project.
- **[Quick start](get-started/quick-start.md)** for quick step-by-step
instructions on how to get NFD running on your cluster.
## Quick-start -- the short-short version
```bash
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/master/nfd-master.yaml.template
namespace/node-feature-discovery created
...
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/master/nfd-worker-daemonset.yaml.template
daemonset.apps/nfd-worker created
$ kubectl -n node-feature-discovery get all
NAME READY STATUS RESTARTS AGE
pod/nfd-master-555458dbbc-sxg6w 1/1 Running 0 56s
pod/nfd-worker-mjg9f 1/1 Running 0 17s
...
$ kubectl get no -o json | jq .items[].metadata.labels
{
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/os": "linux",
"feature.node.kubernetes.io/cpu-cpuid.ADX": "true",
"feature.node.kubernetes.io/cpu-cpuid.AESNI": "true",
...
```


@ -0,0 +1,91 @@
---
title: "Introduction"
layout: default
sort: 1
---
# Introduction
{: .no_toc }
## Table of Contents
{: .no_toc .text-delta }
1. TOC
{:toc}
---
This software enables node feature discovery for Kubernetes. It detects
hardware features available on each node in a Kubernetes cluster, and
advertises those features using node labels.
NFD consists of two software components:
1. nfd-master
1. nfd-worker
## NFD-Master
NFD-Master is the daemon responsible for communication towards the Kubernetes
API. That is, it receives labeling requests from the worker and modifies node
objects accordingly.
## NFD-Worker
NFD-Worker is a daemon responsible for feature detection. It then communicates
the information to nfd-master which does the actual node labeling. One
instance of nfd-worker is supposed to be running on each node of the cluster.
## Feature Discovery
Feature discovery is divided into domain-specific feature sources:
- CPU
- IOMMU
- Kernel
- Memory
- Network
- PCI
- Storage
- System
- USB
- Custom (rule-based custom features)
- Local (hooks for user-specific features)
Each feature source is responsible for detecting a set of features which, in
turn, are turned into node feature labels. Feature labels are prefixed with
`feature.node.kubernetes.io/` and also contain the name of the feature source.
Non-standard user-specific feature labels can be created with the local and
custom feature sources.
An overview of the default feature labels:
```json
{
"feature.node.kubernetes.io/cpu-<feature-name>": "true",
"feature.node.kubernetes.io/custom-<feature-name>": "true",
"feature.node.kubernetes.io/iommu-<feature-name>": "true",
"feature.node.kubernetes.io/kernel-<feature name>": "<feature value>",
"feature.node.kubernetes.io/memory-<feature-name>": "true",
"feature.node.kubernetes.io/network-<feature-name>": "true",
"feature.node.kubernetes.io/pci-<device label>.present": "true",
"feature.node.kubernetes.io/storage-<feature-name>": "true",
"feature.node.kubernetes.io/system-<feature name>": "<feature value>",
"feature.node.kubernetes.io/usb-<device label>.present": "<feature value>",
"feature.node.kubernetes.io/<file name>-<feature name>": "<feature value>"
}
```
## Node Annotations
NFD also annotates nodes it is running on:
| Annotation | Description
| ----------------------------------------- | -----------
| nfd.node.kubernetes.io/master.version | Version of the nfd-master instance running on the node. Informative use only.
| nfd.node.kubernetes.io/worker.version | Version of the nfd-worker instance running on the node. Informative use only.
| nfd.node.kubernetes.io/feature-labels | Comma-separated list of node labels managed by NFD. NFD uses this internally, so it must not be edited by users.
| nfd.node.kubernetes.io/extended-resources | Comma-separated list of node extended resources managed by NFD. NFD uses this internally, so it must not be edited by users.
Inapplicable annotations are not created, e.g. master.version is only created on nodes running nfd-master.


@ -0,0 +1,78 @@
---
title: "Quick Start"
layout: default
sort: 2
---
# Quick Start
Minimal steps to deploy the latest released version of NFD in your cluster.
## Installation
Deploy nfd-master -- creates a new namespace, service and required RBAC rules
```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/master/nfd-master.yaml.template
```
Deploy nfd-worker as a daemonset
```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/master/nfd-worker-daemonset.yaml.template
```
## Verify
Wait until NFD master and worker are running.
```bash
$ kubectl -n node-feature-discovery get ds,deploy
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/nfd-worker 3 3 3 3 3 <none> 5s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nfd-master 1/1 1 1 17s
```
Check that NFD feature labels have been created
```bash
$ kubectl get no -o json | jq .items[].metadata.labels
{
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/os": "linux",
"feature.node.kubernetes.io/cpu-cpuid.ADX": "true",
"feature.node.kubernetes.io/cpu-cpuid.AESNI": "true",
"feature.node.kubernetes.io/cpu-cpuid.AVX": "true",
...
```
## Use Node Labels
Create a pod targeting a distinguishing feature (select a valid feature from
the list printed on the previous step)
```bash
$ cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: feature-dependent-pod
spec:
  containers:
  - image: k8s.gcr.io/pause
    name: pause
  nodeSelector:
    # Select a valid feature
    feature.node.kubernetes.io/cpu-cpuid.AESNI: 'true'
EOF
pod/feature-dependent-pod created
```
See that the pod is running on a desired node
```bash
$ kubectl get po feature-dependent-pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
feature-dependent-pod 1/1 Running 0 23s 10.36.0.4 node-2 <none> <none>
```

1
docs/index.html Normal file

@ -0,0 +1 @@
<meta http-equiv="refresh" content="0; URL='get-started/index.html'" />


@ -1,5 +0,0 @@
---
title: Node Feature Discovery Documentation
layout: default
---
***UNDER CONSTRUCTION...***


@ -1,34 +0,0 @@
#!/usr/bin/env bash
this=`basename $0`
if [ $# -gt 1 ] || [ "$1" == "-h" ] || [ "$1" == "--help" ]; then
    echo Usage: $this [IMAGE[:TAG]]
    exit 1
fi
IMAGE=$1
if [ -n "$IMAGE" ]; then
    if [ ! -f nfd-worker-job.yaml ]; then
        make IMAGE_TAG=$IMAGE nfd-worker-job.yaml
    else
        # Keep existing nfd-worker-job.yaml, only update image.
        sed -E "s,^(\s*)image:.+$,\1image: $IMAGE," -i nfd-worker-job.yaml
    fi
fi
if [ ! -f nfd-worker-job.yaml ]; then
    # Missing image info for the labeling job.
    echo "nfd-worker-job.yaml missing."
    echo "Run 'make nfd-worker-job.yaml', use the template or provide IMAGE (see --help)."
    exit 2
fi
# Get the number of nodes in Ready state in the Kubernetes cluster
NumNodes=$(kubectl get nodes | grep -i ' ready ' | wc -l)
# We set the .spec.completions and .spec.parallelism to the node count
# We set the NODE_NAME environment variable to get the Kubernetes node object.
sed -e "s/completions:.*$/completions: $NumNodes/" \
    -e "s/parallelism:.*$/parallelism: $NumNodes/" \
    -i nfd-worker-job.yaml
kubectl create -f nfd-worker-job.yaml


@ -25,6 +25,8 @@ rules:
- get
- patch
- update
# List only needed for --prune
- list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding

43
nfd-prune.yaml.template Normal file

@ -0,0 +1,43 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: nfd-prune
  namespace: node-feature-discovery
  labels:
    app: nfd-prune
spec:
  completions: 1
  template:
    metadata:
      labels:
        app: nfd-prune
    spec:
      serviceAccount: nfd-master
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              preference:
                matchExpressions:
                  - key: "node-role.kubernetes.io/master"
                    operator: In
                    values: [""]
      tolerations:
        - key: "node-role.kubernetes.io/master"
          operator: "Equal"
          value: ""
          effect: "NoSchedule"
      containers:
        - image: gcr.io/k8s-staging-nfd/node-feature-discovery:master
          name: nfd-master
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
            readOnlyRootFilesystem: true
            runAsNonRoot: true
          command:
            - "nfd-master"
          args:
            - "--prune"
      restartPolicy: Never


@ -6,8 +6,8 @@ metadata:
name: nfd-worker
namespace: node-feature-discovery
spec:
completions: COMPLETION_COUNT
parallelism: PARALLELISM_COUNT
completions: NUM_NODES
parallelism: NUM_NODES
template:
metadata:
labels: