Switch to fully statically linked binaries and use scratch as a base
image.
Switching to the virtually empty scratch base image means that the
default/minimal NFD image only supports running hooks that are truly
statically linked (e.g. normal go binaries that are "almost" statically
linked stop working). The documentation has been already stating this
(i.e. that only statically-linked binaries are supported) - i.e. we have
had no promise of supporting other than that. Also, hooks are now
deprecated and even disabled by default so the possibility of real user
impact should be small.
Get away with the jekyll:3.8 image which is already four years old. Use
the ruby instead. The jekyll image did not bring any value (more
problems, if anything) as we install/update jekyll and all other gems
with byndler nevertheless (image was jekyll:3.8 but we use jekyll
v3.9.3).
This PR aims to optimize the process of updating nodes with
corresponding features. In fact, previously, we were updating nodes
sequentially even though they are independent from each other.
Therefore, we integrated new components: LabelersNodePool which is
responsible for spininng a goroutine whenever there's a request for
updating nodes, and a Workqueue which is responsible for holding nodes names
that should be updated.
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
Previously we were using the default, which even if equal to 0, still
means 10 minute timout in practice (with the way we run the tests with
invoking go test directly). With the addition of latest e2e tests we
hit the limit and got bitten by it. Set the timeout to 1 hour which
should be enough for anyone...
Similar to the nfd-worker, in this PR we want to support the
dynamic run-time configurability through a config file for the nfd-master.
We'll use a json or yaml configuration file along with the fsnotify in
order to watch for changes in the config file. As a result, we're
allowing dynamic control of logging params, allowed namespaces,
extended resources, label whitelisting, and denied namespaces.
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
Add code coverage reporting so that the report is then available
to the PR author and reviewers via codecov GitHub app.
Signed-off-by: Muyassarov, Feruzjon <feruzjon.muyassarov@intel.com>
Make distroless/base as the base image for the default image,
effectively making the minimal image as the default. Add a new "full"
image variant that corresponds the previous default image. The
"*-minimal" container image tag is provided for backwards compatibility.
The practical user impact of this change is that hook support is limited
to statically linked ELF binaries. Bash or Perl scripts are not
supported by the default image, anymore, but the new "full" image
variant can be used for backwards compatibility.
- Add a helm template with a config example for the exclude-list.
- Add mount for the topology-updater.conf file
- Update the templates Makefile target
Signed-off-by: Talor Itzhak <titzhak@redhat.com>
Refactor the code, moving the hostpath helper functionality to new
"pkg/utils/hostpath" package. This breaks odd-ish dependency
"pkg/utils" -> "source".
Use the same base image (golang) for auto-generation as for building
container images. This makes sure that we stay in sync in terms of the
golang version used.
In some cases (CI) it is useful to run NFD e2e tests using
ephemeral clusters. To save time and bandwidth, it is also useful
to prime the ephemeral cluster with the images under test.
In these circumstances there is no risk of running a stale image,
and having a `Always` PullPolicy hardcoded actually makes
the whole exercise null.
So we add a new option, disabled by default, to make the e2e
manifest use the `IfNotPresent` pull policy, to effectively
cover this use case.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Run code auto-generation inside a container instead of the host system.
Our auto-generation depends on specific versions of a multitude of tools
(like k8s code-generator, controller-gen, protoc, mockery etc). This
made it really awkward (and error-prone) to run in the host environment,
especially if/when you needed different versions of those tools for
other projects. Making it even more unwieldy, the required versions of
tools were not neatly documented anywhere (except for git commits,
perhaps).
With this patch we have a "fixed environment", as we build a special
auto-generate-builder container which has correct versions of all the
dependencies. Using the container makes auto-generation easy to run
anywhere, independent of the host system, giving reproducibility and
reliability. Also, the patch moves the auto-generation steps out from
the makefile into a separate script, making the makefile cleaner and the
script easier to maintain.
Switch over to the "non-point-release" version of the image. Now we
always use the latest patch version of golang with latest security
fixes, for example, without the need to manually bump the version after
every point release.
This patch also makes the builder image configurable through a Makefile
variable.
For reproducible builds we should used fixed point-release versions in
release-brances.
Add auto-generated code for interfacing our CRD API. On top of this, a
CR controller can be implemented. This patch uses k8s/code-generator
for code generation. Run "make generate" in order to (re-)generate
everything. Path to the code-generator repository may need to be
specified:
K8S_CODE_GENERATOR=path/to/code-generator make apigen
Code-generator version 0.20.7 was used to create this patch. Install
k8s code-generator tools and clone the repo with:
git clone https://github.com/kubernetes/code-generator -b v0.20.7 <path/to/code-generator>
go install k8s.io/code-generator/cmd/...(at)v0.20.7
Add a cluster-scoped Custom Resource Definition for specifying labeling
rules. Nodes (node features, node objects) are cluster-level objects and
thus the natural and encouraged setup is to only have one NFD deployment
per cluster - the set of underlying features of the node stays the same
independent of how many parallel NFD deployments you have. Our extension
points (hooks, feature files and now CRs) can be be used by multiple
actors (depending on us) simultaneously. Having the CRD cluster-scoped
hopefully drives deployments in this direction. It also should make
deployment of vendor-specific labeling rules easy as there is no need to
worry about the namespace.
This patch virtually replicates the source.custom.FeatureSpec in a CRD
API (located in the pkg/apis/nfd/v1alpha1 package) with the notable
exception that "MatchOn" legacy rules are not supported. Legacy rules
are left out in order to keep the CRD simple and clean.
The duplicate functionality in source/custom will be dropped by upcoming
patches.
This patch utilizes controller-gen (from sigs.k8s.io/controller-tools)
for generating the CRD and deepcopy methods. Code can be (re-)generated
with "make generate". Install controller-gen with:
go install sigs.k8s.io/controller-tools/cmd/controller-gen@v0.7.0
Update kustomize and helm deployments to deploy the CRD.
Use 'go generate' for auto-generating code. Drop the old 'mock' and
'apigen' makefile targets. Those are replaced with a single
make generate
which (re-)generates everything.
There have been recent changes made to the noderesourcetopology API
storing the proto file generated using go-to-protobuf tool and
this code inports the proto generated in the API in the topology-updater.proto
The PRs corresponding to the changes are as follows:
https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/9https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/13
Commands used to generate topology-updater.pb.go file:
go install github.com/golang/protobuf/protoc-gen-go@v1.4.3
go mod vendor
protoc --go_opt=paths=source_relative --go_out=plugins=grpc:. pkg/topologyupdater/topology-updater.proto -I. -Ivendor
As part of implmentation of this patch, reserved (non-allocatable) CPUs
are evaluated by performing a difference between all the CPUs on a system
(determined by using ghw) and allocatable CPUs (determined by querying
GetAllocatableResources podResource API endpoint).
When aggregator creates the NUMA zones, it will skip the zone creation if
there are no allocatable resources. In this update we creates those missing
zone with zero allocatable/available resources so we won't have holes in the
array of reported zones.
Co-Authored-by: Talor Itzhak <titzhak@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Setup the topologyupdater API for gRPC communication of
nfd-topology-updater with master
We generate pb.go file to reflect latest dependency changes
using github.com/golang/protobuf/protoc-gen-go and generate
grpc files via:
`protoc pkg/topologyupdater/topology-updater.proto --go_out=plugins=grpc:.`
Please refer to: https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/blob/master/pkg/apis/topology/v1alpha1/types.go
Co-Authored-by: Artyom Lukianov <alukiano@redhat.com>
Co-Authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Implement new registration infrastructure under the "source" package.
This change loosens the coupling between label sources and the
nfd-worker, making it easier to refactor and move the code around.
Also, create a separate interface (ConfigurableSource) for configurable
feature sources in order to eliminate boilerplate code.
Add safety checks to the sources that they actually implement the
interfaces they should.
In sake of consistency and predictability (of behavior) change all
methods of the sources to use pointer receivers.
Add simple unit tests for the new functionality and include source/...
into make test target.
Implement functionality virtually replicating deployment templates for
nfd-master and nfd-worker daemonset (nfd-master.yaml.template and
nfd-worker-daemonset.yaml.template) by adding a kustomize overlay named
"default".
We split the resources into multiple bases (rbac, master and
worker-daemonset) so that relevant parts are re-usable in
other deployment scenarios added later (e.g. "one-shot job", and
"combined daemonset").
This patch adds one component (components/common) doing the required
kustomization for the example deployment.
For auto-generating api(s).
Also, re-generate/refresh the gRPC with `make apigen` (with protoc
v3.17.3 and protoc-gen-go from github.com/golang/protobuf v1.5.2) to
sync up things.
* Add support for configurable runtime full and minimal images.
* Fixups and renamings.
* Change variables *_IMG_* to *_IMAGE_*
* Fix args in Dockerfile also.
cert-manager can be used to automate TLS certificate management for
nfd-master and the nfd-worker pod(s).
Add a template to deploy cert-manager CA Issuer and Certificates and
document steps how to use them.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Build a "minimal" variant of the nfd image based on
gcr.io/distroless/base. The motivations behind the minimal image are
image hardening (security) and reducing the image footprint (from ca.
108MB down to about 40MB).
The practical effect of deploying the minimal image is that no runtimes
for running worker hooks are present, not even a shell. This means that
only statically linked linked hook binaries are supported. Also, because
of the image hardening live debugging of the minimal image by attaching
to the container is not possible, and, the "full" image needs to be used
for that purpose.