mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00
Commit graph

51 commits

Author SHA1 Message Date
Kubernetes Prow Robot
77d869c4f7
Merge pull request #1242 from ArangoGutierrez/metrics
Enable metrics via prometheus operator
2023-07-21 02:26:08 -07:00
Carlos Eduardo Arango Gutierrez
e3aedd33e2
Enable metrics via prometheus operator
Expose metrics via prometheus.monitoring.coreos.com/v1

The exposed metrics are

| Metric        | Type | Meaning |
| --------------- | ---------------- | ---------------- |
|  `nfd_master_build_info`           | Gauge | Version from which nfd-master was built. |
|  `nfd_worker_build_info`           | Gauge | Version from which nfd-worker was built. |
|  `nfd_updated_nodes`           |  Counter | Number of nodes updated |
|  `nfd_crd_processing_time`          |  Gauge | Time taken to process a NodeFeatureRule CRD |
| `nfd_feature_discovery_duration_seconds` |  HistogramVec | Time taken to discover features on a node |

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
2023-07-21 10:59:52 +02:00
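As a rough illustration of what a scrape of such metrics looks like, the sketch below renders a gauge and a second sample in the Prometheus text exposition format using only the Go standard library. The `metric` type, the `render` helper, the label value `v0.0.0-example` and the sample values are hypothetical; the commit itself does not show how NFD registers or serves its metrics.

```go
package main

import (
	"fmt"
	"strings"
)

// metric is a hypothetical, simplified stand-in for one Prometheus sample.
// It is not a type from NFD or from the Prometheus client library.
type metric struct {
	name   string
	kind   string // "counter" or "gauge"
	help   string
	labels string // pre-rendered label set, e.g. `version="v0.0.0-example"`
	value  float64
}

// render formats the samples in the Prometheus text exposition format:
// a HELP line, a TYPE line, then the sample itself.
func render(ms []metric) string {
	var b strings.Builder
	for _, m := range ms {
		fmt.Fprintf(&b, "# HELP %s %s\n", m.name, m.help)
		fmt.Fprintf(&b, "# TYPE %s %s\n", m.name, m.kind)
		if m.labels != "" {
			fmt.Fprintf(&b, "%s{%s} %g\n", m.name, m.labels, m.value)
		} else {
			fmt.Fprintf(&b, "%s %g\n", m.name, m.value)
		}
	}
	return b.String()
}

func main() {
	// Metric names come from the table above; values and labels are made up.
	fmt.Print(render([]metric{
		{
			name:   "nfd_master_build_info",
			kind:   "gauge",
			help:   "Version from which nfd-master was built.",
			labels: `version="v0.0.0-example"`,
			value:  1,
		},
		{
			name:  "nfd_crd_processing_time",
			kind:  "gauge",
			help:  "Time taken to process a NodeFeatureRule CRD",
			value: 0.25,
		},
	}))
}
```

Carrying the version in a label on a constant-valued gauge is a common convention for build-info metrics; whether NFD does exactly this is an assumption here.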
Markus Lehtonen
045eb28dbe go.mod: update kubernetes to v1.27.4 2023-07-20 14:29:03 +03:00
Fabiano Fidêncio
8a65d8f5a1 go.mod: Update cpuid to its v2.2.5 release
Let's update the cpuid to its v2.2.5 release, released on June 2nd,
2023, as it brings in information about TDX guests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-06-02 17:19:27 +02:00
Markus Lehtonen
1cdaa44c03 go.mod: update deps 2023-04-22 10:12:49 +03:00
Markus Lehtonen
ba4b9b3432 go.mod: update kubernetes to v1.27.1 2023-04-18 20:51:51 +03:00
Markus Lehtonen
436f679cb1 go.mod: update kubernetes to v1.26.3 2023-03-31 19:41:18 +03:00
Markus Lehtonen
5e5b1749d9 go.mod: update kubernetes to v1.26.2
Also updates golang.org/x/net to v0.7.0.
2023-03-10 15:31:16 +02:00
AhmedGrati
16abfd7b0e test: implement e2e test of the deny-label-ns flag
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-03-10 11:11:36 +01:00
Jose Luis Ojosnegros Manchón
b340d112a8 topology-updater: compute pod set fingerprint
Add an option to compute the fingerprint of the current pod set on each
node.

Report this new fingerprint using an attribute in NRT object.
2023-02-22 10:22:50 +01:00
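One way to picture a pod set fingerprint is as a hash over the sorted `namespace/name` pairs of the pods on a node, so that the value changes only when the set changes. The sketch below assumes exactly that; the `podRef` type, the `fingerprint` function and the choice of SHA-256 are illustrative and not taken from the commit.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// podRef is a hypothetical minimal pod identity.
type podRef struct {
	Namespace string
	Name      string
}

// fingerprint returns a hex digest that depends only on the set of pods,
// not on the order in which they are listed.
func fingerprint(pods []podRef) string {
	keys := make([]string, 0, len(pods))
	for _, p := range pods {
		keys = append(keys, p.Namespace+"/"+p.Name)
	}
	sort.Strings(keys) // canonical order makes the hash order-independent

	h := sha256.New()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{0}) // separator keeps key boundaries unambiguous
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := []podRef{{"default", "web-0"}, {"kube-system", "dns-1"}}
	b := []podRef{{"kube-system", "dns-1"}, {"default", "web-0"}}
	fmt.Println(fingerprint(a) == fingerprint(b)) // same set, different order
}
```

A consumer can compare the fingerprint it computes from its own view of the pods with the one reported for the node to detect stale data.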
Muyassarov, Feruzjon
0e2f2c4587 go.mod: bump cpuid to v2.2.4
Bump cpuid version to v2.2.4 in the go.mod so that WRMSRNS
(Non-Serializing Write to Model Specific Register) and MSRLIST
(Read/Write List of Model Specific Registers) instructions are
detectable.

Signed-off-by: Muyassarov, Feruzjon <feruzjon.muyassarov@intel.com>
2023-02-20 22:58:59 +02:00
Jose Luis Ojosnegros Manchón
d1d1eda0d2 nrt-api: Update to v0.1.0 to use v1alpha2 2023-02-09 12:03:18 +01:00
PiotrProkop
9356efe811 Upgrade github.com/k8stopologyawareschedwg/noderesourcetopology-api to v0.0.13
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2023-01-09 13:15:59 +01:00
Muyassarov, Feruzjon
d9dc4b09d5 Bump cpuid to v2.2.3
Bump cpuid to v2.2.3 which adds support for detecting Intel Sierra
Forest instructions like AVXIFMA, AVXNECONVERT, AVXVNNIINT8 and
CMPCCXADD.
Signed-off-by: Muyassarov, Feruzjon <feruzjon.muyassarov@intel.com>
2022-12-30 11:42:05 +02:00
Feruzjon Muyassarov
409312e111 Bump go.mod k8s.io to 1.26
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
2022-12-13 12:12:46 +02:00
Markus Lehtonen
0834ec5cbf go.mod: update klauspost/cpuid to v2.2.2
Support detection of Intel TME (Total Memory Encryption) plus AMXFP16
and PREFETCHI.
2022-12-07 13:58:19 +02:00
Feruzjon Muyassarov
bb7e6d7d47 Bump Kubernetes to v1.25.3
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
2022-10-21 15:35:30 +03:00
Markus Lehtonen
449b0b2199 go.mod: update kubernetes to v1.25.0 2022-09-09 10:55:03 +03:00
Mikko Ylinen
026fcb2199 go.mod: update github.com/klauspost/cpuid to v2.1.0
The release relaxes detection of features that have non-AVX512
versions, etc.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2022-08-09 11:25:39 +03:00
Kubernetes Prow Robot
3638304c10
Merge pull request #838 from marquiz/devel/go-cmp
go.mod: update github.com/google/go-cmp to v0.5.8
2022-08-08 13:56:06 -07:00
Markus Lehtonen
22ad55bf4c go.mod: update github.com/google/go-cmp to v0.5.8 2022-06-29 10:41:30 +03:00
Markus Lehtonen
33f0df4ec4 go.mod: update github.com/klauspost/cpuid to v2.0.14
Adds e.g. detection of Intel TME.
2022-06-29 10:27:20 +03:00
Markus Lehtonen
735285e3ef go.mod: update kubernetes to v1.24.2 2022-06-28 15:34:22 +03:00
Carlos Eduardo Arango Gutierrez
87b29f695d
Bump Go to 1.18
Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>
2022-03-21 10:25:32 -04:00
Mikko Ylinen
52a14675ae go.mod: update to klauspost/cpuid/v2@v2.0.11
The new version adds Control-flow Enforcement Technology (CET)
bits detection.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2022-02-08 11:24:00 +02:00
Markus Lehtonen
d1bd603052 go.mod: bump kubernetes to v1.23.1
Update k/k to the latest release and sync all related dependencies.

Align e2e-tests with changes in the k8s e2e test framework.
2022-01-12 16:43:21 +02:00
Swati Sehgal
b444ef95a8 NFD-Topology-Updater: Bump NRT API to version v0.0.12
The NodeResourceTopology API has been made cluster scoped. In the
current context a CR corresponds to a Node, and since Node is a
cluster-scoped resource it makes sense to make NRT cluster scoped
as well.

Ref: https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/18
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-11-16 13:28:23 +00:00
Swati Sehgal
a311719d1e topologyupdater: Updates based on latest changes made to CRD API
Recent changes to the noderesourcetopology API store the proto file
generated with the go-to-protobuf tool in the API repository. This code
imports that generated proto in topology-updater.proto.
The PRs corresponding to the changes are as follows:
https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/9
https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/pull/13

Commands used to generate topology-updater.pb.go file:

go install github.com/golang/protobuf/protoc-gen-go@v1.4.3
go mod vendor
protoc --go_opt=paths=source_relative  --go_out=plugins=grpc:. pkg/topologyupdater/topology-updater.proto -I. -Ivendor

As part of the implementation of this patch, reserved (non-allocatable) CPUs
are evaluated by performing a difference between all the CPUs on a system
(determined by using ghw) and allocatable CPUs (determined by querying
GetAllocatableResources podResource API endpoint).
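The set difference described in the paragraph above can be sketched as follows. The function name `reservedCPUs` and the plain `[]int` CPU IDs are hypothetical simplifications; obtaining the two input lists (from ghw and from the podresources API) is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// reservedCPUs returns the CPU IDs present in all but absent from
// allocatable, i.e. the CPUs that are not available to workloads.
func reservedCPUs(all, allocatable []int) []int {
	alloc := make(map[int]struct{}, len(allocatable))
	for _, id := range allocatable {
		alloc[id] = struct{}{}
	}

	reserved := []int{}
	for _, id := range all {
		if _, ok := alloc[id]; !ok {
			reserved = append(reserved, id)
		}
	}
	sort.Ints(reserved)
	return reserved
}

func main() {
	all := []int{0, 1, 2, 3, 4, 5, 6, 7}     // every CPU on the system
	allocatable := []int{2, 3, 4, 5, 6, 7}   // CPUs reported as allocatable
	fmt.Println(reservedCPUs(all, allocatable)) // prints [0 1]
}
```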

When aggregator creates the NUMA zones, it will skip the zone creation if
there are no allocatable resources. In this update we create those missing
zones with zero allocatable/available resources so we won't have holes in the
array of reported zones.
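A minimal sketch of the hole-filling idea, assuming zones are indexed by NUMA node ID. The `zone` type, the `fillMissingZones` function and the `node-<id>` naming are hypothetical and only illustrate the behaviour described above.

```go
package main

import "fmt"

// zone is a hypothetical, simplified NUMA zone record.
type zone struct {
	Name        string
	Allocatable map[string]int64 // resource name -> allocatable amount
}

// fillMissingZones returns one zone per NUMA node ID in [0, numaCount).
// IDs without an entry in found get an empty zone, so the resulting
// slice has no gaps.
func fillMissingZones(numaCount int, found map[int]zone) []zone {
	zones := make([]zone, 0, numaCount)
	for id := 0; id < numaCount; id++ {
		z, ok := found[id]
		if !ok {
			// No allocatable resources were seen for this node:
			// report it anyway with zero resources.
			z = zone{Name: fmt.Sprintf("node-%d", id), Allocatable: map[string]int64{}}
		}
		zones = append(zones, z)
	}
	return zones
}

func main() {
	found := map[int]zone{
		0: {Name: "node-0", Allocatable: map[string]int64{"cpu": 6}},
		2: {Name: "node-2", Allocatable: map[string]int64{"cpu": 8}},
	}
	for _, z := range fillMissingZones(3, found) {
		fmt.Println(z.Name, z.Allocatable)
	}
}
```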

Co-Authored-by: Talor Itzhak <titzhak@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-09-21 10:48:10 +01:00
Francesco Romani
b4c92e4eed topologyupdater: Bootstrap nfd-topology-updater in NFD
- This patch allows exposing Resource Hardware Topology information
  through CRDs in Node Feature Discovery.
- In order to do this we introduce another software component called
  nfd-topology-updater in addition to the already existing software
  components nfd-master and nfd-worker.
- nfd-master was enhanced to communicate with nfd-topology-updater
  over gRPC followed by creation of CRs corresponding to the nodes
  in the cluster exposing resource hardware topology information
  of that node.
- Pin kubernetes dependency to one that includes the pod resource implementation
- This code is responsible for obtaining hardware information from the system
  as well as pod resource information from the Pod Resource API in order to
  determine the allocatable resource information for each NUMA zone. This
  information along with Costs for NUMA zones (obtained by reading NUMA distances)
  is gathered by nfd-topology-updater running on all the nodes
  of the cluster, which propagates NUMA zone costs to master in order to populate
  that information in the CRs corresponding to the nodes.
- We use GHW facilities for obtaining system information like CPUs, topology,
  NUMA distances etc.
- This also includes updates made to Makefile and Dockerfile and Manifests for
  deploying nfd-topology-updater.
- This patch includes unit tests
- As part of the Topology Aware Scheduling work, this patch captures
  the configured Topology manager scope in addition to the Topology manager policy.
  Based on the value of both attributes a single string will be populated to the CRD.
  The string value will be one of the following {SingleNUMANodeContainerLevel,
  SingleNUMANodePodLevel, BestEffort, Restricted, None}
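The last item above, combining Topology Manager policy and scope into a single string, could look roughly like the sketch below. The input values are the kubelet's policy names (`none`, `best-effort`, `restricted`, `single-numa-node`) and scope names (`container`, `pod`); the function name `topologyPolicy` and the fallback behaviour for unknown inputs are assumptions, not the code from this patch.

```go
package main

import "fmt"

// topologyPolicy folds the kubelet Topology Manager policy and scope into
// one of the five strings listed in the commit message.
func topologyPolicy(policy, scope string) string {
	switch policy {
	case "single-numa-node":
		if scope == "pod" {
			return "SingleNUMANodePodLevel"
		}
		// "container" is the kubelet default scope; any other value is
		// treated the same way here (an assumption of this sketch).
		return "SingleNUMANodeContainerLevel"
	case "restricted":
		return "Restricted"
	case "best-effort":
		return "BestEffort"
	default:
		// Covers "none" and, by assumption, unrecognised policies.
		return "None"
	}
}

func main() {
	fmt.Println(topologyPolicy("single-numa-node", "pod"))       // prints SingleNUMANodePodLevel
	fmt.Println(topologyPolicy("single-numa-node", "container")) // prints SingleNUMANodeContainerLevel
	fmt.Println(topologyPolicy("best-effort", "container"))      // prints BestEffort
}
```

In this sketch the scope only matters for the single-NUMA-node policy, which matches the list of values above: only that policy has separate container-level and pod-level variants.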

Co-Authored-by: Artyom Lukianov <alukiano@redhat.com>
Co-Authored-by: Francesco Romani <fromani@redhat.com>
Co-Authored-by: Talor Itzhak <titzhak@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-09-21 10:47:39 +01:00
Francesco Romani
00cc07da76 topologyupdater: gRPC API definition
Setup the topologyupdater API for gRPC communication of
nfd-topology-updater with master

We generate pb.go file to reflect latest dependency changes
using github.com/golang/protobuf/protoc-gen-go and generate
grpc files via:
`protoc pkg/topologyupdater/topology-updater.proto --go_out=plugins=grpc:.`

Please refer to: https://github.com/k8stopologyawareschedwg/noderesourcetopology-api/blob/master/pkg/apis/topology/v1alpha1/types.go

Co-Authored-by: Artyom Lukianov <alukiano@redhat.com>
Co-Authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-09-21 10:47:39 +01:00
Markus Lehtonen
321d07e338 go.mod: update golang.org/x/net 2021-08-12 10:54:58 +03:00
Markus Lehtonen
136e262799 go.mod: update kubernetes to v1.22.0
Update k/k to v1.22.0 and sync all related dependencies with it.
2021-08-12 10:28:57 +03:00
Carlos Eduardo Arango Gutierrez
824d0d9b29
Merge branch 'master' into devel/makefile-apigen 2021-08-04 11:04:41 -05:00
Mikko Ylinen
c64493cae4 go.mod: update to klauspost/cpuid/v2@v2.0.9
The new version adds AVX512-FP16 (half precision floating-point)
support available on Sapphire Rapids.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-08-02 13:49:25 +03:00
Markus Lehtonen
69f89b249c go.mod: update dependencies 2021-07-07 16:01:10 +03:00
Markus Lehtonen
ad7df36a08 go.mod: update dependencies 2021-07-06 14:40:29 +03:00
Markus Lehtonen
44a385aff5 go.mod: update kubernetes to v1.21.2 2021-07-06 14:02:04 +03:00
Markus Lehtonen
307299f932 go.mod: update and tidy some non-k8s dependencies 2021-03-30 21:51:10 +03:00
Markus Lehtonen
7236759047 go.mod: update k8s to v1.20.5 2021-03-30 21:51:01 +03:00
Carlos Eduardo Arango Gutierrez
89e6a9104f
Update gogo/protobuf and golang.org/x/text
Why:

- CVE-2020-28851
- CVE-2021-3121

Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>
2021-02-10 16:13:51 -05:00
Mikko Ylinen
07bc50d5a8 go.mod: update to klauspost/cpuid/v2@v2.02
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-12-15 15:56:15 +02:00
Mikko Ylinen
94f49b9418 go.mod: update klauspost/cpuid
The latest changes in klauspost/cpuid add detection for Sapphire Rapids
new instructions.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-11-30 19:04:41 +02:00
Markus Lehtonen
95ff300d74 nfd-master: patch node object instead of rewriting it
When updating node labels and annotations use JSON patches instead of
doing a read-modify-write on the whole node object. Patching is already
being used in managing extended resources so some of the existing code
was re-usable.

This patch should mitigate the problem of node update failures caused by
race conditions (a change in the node object between our read and write)
resulting e.g. in errors/restarts in nfd worker pods.
2020-11-24 12:45:06 +02:00
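To illustrate the patch-based update described in the commit above, the sketch below builds a JSON Patch (RFC 6902) that moves a node's labels from a current to a desired state without sending the whole object. The `patchOp` type and the `labelPatch` and `escapePointer` helpers are hypothetical and not taken from nfd-master; submitting the patch to the API server is omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// patchOp is a single JSON Patch operation.
type patchOp struct {
	Op    string  `json:"op"`
	Path  string  `json:"path"`
	Value *string `json:"value,omitempty"` // nil for "remove"
}

// escapePointer escapes a key for use in a JSON Pointer (RFC 6901):
// "~" becomes "~0" and "/" becomes "~1". Label keys usually contain "/".
func escapePointer(s string) string {
	s = strings.ReplaceAll(s, "~", "~0")
	return strings.ReplaceAll(s, "/", "~1")
}

func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// labelPatch returns the operations that turn current into desired:
// removals first, then replacements and additions, in sorted key order.
func labelPatch(current, desired map[string]string) []patchOp {
	const base = "/metadata/labels/"
	ops := []patchOp{}
	for _, k := range sortedKeys(current) {
		if _, ok := desired[k]; !ok {
			ops = append(ops, patchOp{Op: "remove", Path: base + escapePointer(k)})
		}
	}
	for _, k := range sortedKeys(desired) {
		v := desired[k]
		old, ok := current[k]
		switch {
		case !ok:
			ops = append(ops, patchOp{Op: "add", Path: base + escapePointer(k), Value: &v})
		case old != v:
			ops = append(ops, patchOp{Op: "replace", Path: base + escapePointer(k), Value: &v})
		}
	}
	return ops
}

func main() {
	// Label keys and values below are made up for the example.
	current := map[string]string{"example.com/cpu-a": "true", "example.com/stale": "true"}
	desired := map[string]string{"example.com/cpu-a": "false", "example.com/cpu-b": "true"}
	out, err := json.MarshalIndent(labelPatch(current, desired), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because each operation names only the key it touches, a concurrent change to an unrelated field of the node does not invalidate the request, which is the race the commit message refers to. Note that an `add` under `/metadata/labels` requires the `labels` map to already exist on the object.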
Artyom Lukianov
6172c648e2 Update k8s dependencies to 1.19.4
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2020-11-18 13:09:13 +02:00
Carlos Eduardo Arango Gutierrez
aa7e0e1dc9
Update go dep golang.org/x/text

    CVE-2020-14040

golang.org/x/text: possibility to trigger an infinite loop in
encoding/unicode could lead to crash
Reported on Bug 1853652

Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>
2020-08-28 10:27:14 -05:00
Markus Lehtonen
6344656d54 go.mod: tidy 2020-05-19 11:05:02 +03:00
Markus Lehtonen
dbaf057525 go.mod: update test dependencies 2020-02-05 19:35:41 +02:00
Markus Lehtonen
974310251c test/e2e: adapt new wireframe to nfd context
Adapt the end-to-end test wireframe (copied from Kubernetes in the
previous commit) to node-feature-discovery.
2020-02-05 19:35:41 +02:00
Markus Lehtonen
2ad4ab708c go.mod: update kubernetes and its deps to v1.17.2 2020-02-05 17:01:53 +02:00
Antti Kervinen
d3d13347f8 vendor: update klauspost/cpuid
Update cpuid from v1.2.2 to v1.2.3. Brings in SGX improvements and
CPUID leaf 7 feature detection (VBMI2, VPOPCNTDQ, GFNI, VAES,
AVX512BITALG, VPCLMULQDQ, AVX512BF16, AVX512VP2INTERSECT). Blacklist
cpuid-SGX* (issue #130).

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2020-01-29 14:51:44 +02:00