Mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git (synced 2024-12-14 11:57:51 +00:00)
Node feature discovery for Kubernetes
fb6484fb8d
This patch mitigates inadvertent termination of nfd-master pods by the liveness probe on big clusters. With a recent change, nfd-master started to wait (block) for informer caches to sync before starting the main loop. As a side effect, the gRPC health endpoint does not respond until the caches have synced. In big clusters, syncing the NodeFeature object cache takes a long time because the objects are big and there is at least one per node in the cluster. Thus, in big clusters, the liveness probe kicks in and kills the nfd-master pod before it is ready.
Node Feature Discovery
Welcome to Node Feature Discovery – a Kubernetes add-on for detecting hardware features and system configuration!
See our Documentation for detailed instructions and reference.
Quick-start – the short-short version
$ kubectl apply -k "https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.16.5"
namespace/node-feature-discovery created
customresourcedefinition.apiextensions.k8s.io/nodefeaturerules.nfd.k8s-sigs.io created
customresourcedefinition.apiextensions.k8s.io/nodefeatures.nfd.k8s-sigs.io created
serviceaccount/nfd-gc created
serviceaccount/nfd-master created
serviceaccount/nfd-worker created
role.rbac.authorization.k8s.io/nfd-worker created
clusterrole.rbac.authorization.k8s.io/nfd-gc created
clusterrole.rbac.authorization.k8s.io/nfd-master created
rolebinding.rbac.authorization.k8s.io/nfd-worker created
clusterrolebinding.rbac.authorization.k8s.io/nfd-gc created
clusterrolebinding.rbac.authorization.k8s.io/nfd-master created
configmap/nfd-master-conf created
configmap/nfd-worker-conf created
deployment.apps/nfd-gc created
deployment.apps/nfd-master created
daemonset.apps/nfd-worker created
$ kubectl -n node-feature-discovery get all
NAME                              READY   STATUS    RESTARTS   AGE
pod/nfd-gc-565fc85d9b-94jpj       1/1     Running   0          18s
pod/nfd-master-6796d89d7b-qccrq   1/1     Running   0          18s
pod/nfd-worker-nwdp6              1/1     Running   0          18s
...
$ kubectl get no -o json | jq ".items[].metadata.labels"
{
  "kubernetes.io/arch": "amd64",
  "kubernetes.io/os": "linux",
  "feature.node.kubernetes.io/cpu-cpuid.ADX": "true",
  "feature.node.kubernetes.io/cpu-cpuid.AESNI": "true",
...