1
0
Fork 0
mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00
Commit graph

48 commits

Author SHA1 Message Date
Kubernetes Prow Robot
4136a69545
Merge pull request #1715 from marquiz/devel/avx10-deprecate
source/cpu: disable AVX10 label
2024-05-24 04:53:59 -07:00
Markus Lehtonen
ece6076dd4 source/cpu: disable AVX10 label
Disable AVX10 as unnecessary as AVX10_LEVEL is better suited for
checking AVX10 compatibility. There is not yet any hardware with the
feature so disabling it shouldn't cause problems for users.
2024-05-24 13:50:46 +03:00
Markus Lehtonen
fa2f008d18 cpu: advertise AVX10 version
Add new cpuid label "feature.node.kubernetes.io/cpu-cpuid.AVX10_VERSION"
that advertises the supported version of AVX10 vector ISA.
Correspondingly, the patch adds AVX10_VERSION to the "cpu.cpuid" feature
for NodeFeatureRules to consume.

This makes cpu.cpuid on amd64 architecture a "multi-type" feature in
that it contains "flags" and potentially also "attributes" (the only
cpuid attribute so far is the AVX10_VERSION).
2024-05-24 13:48:20 +03:00
Carlos Eduardo Arango Gutierrez
3434557d7c
Move NFD api to a separate go mod
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2024-04-05 16:35:47 +02:00
Kubernetes Prow Robot
2694448a7f
Merge pull request #1530 from marquiz/devel/rdt
source/cpu: drop deprecated cpu-rdt labels
2024-01-16 11:08:18 +01:00
Markus Lehtonen
cd18fe8970 source/cpu: drop deprecated cpu-rdt labels
Drop RDT labels that were deprecated in NFD v0.13. The RDT features
remain available for NodeFeatureRules to serve custom labeling.
2023-12-22 17:29:00 +02:00
AhmedGrati
f962698c14 chore: combine cpu count and thread_siblings functions into discover topology function
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-12-18 16:29:38 +01:00
AhmedGrati
ebb08369d3 feat: add cpu socket number in cpu.topology
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
2023-12-15 11:04:42 +01:00
Markus Lehtonen
c126764d7a cpu: drop the deprecated sgx and se labels
Drop the deprecated cpu-sgx.enabled and cpu-se.enabled labels and the
corresponding "raw" features. These have been replaced by
cpu-security.sgx.enabled and cpu-security.se.enabled.
2023-09-08 14:28:04 +03:00
Hairong Chen
e8a00ba7da cpu: Discover TDX guests based on cpuid information
NFD already has the capability to discover whether baremetal / host
machines support Intel TDX.  Now, the next step is to add support for
discovering whether a node is TDX protected (as in, a virtual machine
started using Intel TDX).

In order to do so, we've decided to go for a new `cpu-security.tdx`
property, called `protected` (`cpu-security.tdx.protected`).

Signed-off-by: Hairong Chen <hairong.chen@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-06-05 11:06:28 +02:00
Markus Lehtonen
bf670de68d pkg/utils: migrate KlogDump to structured logging
Drop the KlogDump helper in favor of klog.InfoS. However, that patch
introduces a new DelayedDumper() helper to avoid processing
(marshalling) of object unless really evaluated by the logging function.
2023-05-31 14:43:08 +03:00
Markus Lehtonen
fe267a634b source: migrate to structured logging
The custom.d config file parsing is made a bit less verbose.
2023-05-31 14:43:08 +03:00
Carlos Eduardo Arango Gutierrez
05ef5d4e9d
cpu: expose the total number of AMD SEV ASID and ES
This patch add SEV ASIDs and the related (but distinct) SEV Encrypted State
(SEV-ES) IDs as two quantities to be exposed via extended resources.
In a kernel built with CONFIG_CGROUP_MISC on a suitably equipped AMD CPU, the
root control group will have a misc.capacity file that shows the number of
available IDs in each category.

The added extended resources are:
- sev.asids
- sev.encrypted_state_ids

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
2023-04-17 19:34:39 +02:00
Mikko Ylinen
de1b69a8bf cpu: make SGX EPC resource available to NodeFeatureRules
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2023-04-14 15:31:54 +03:00
Markus Lehtonen
3320c74472 source/cpu: don't create cpu-security.tdx.total_keys label
Just have that as a feature for NodeFeatureRules to consume.
2023-04-14 13:33:13 +03:00
PiotrProkop
0e78eba40e Advertise RDT L3 num_closid
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2023-04-06 11:22:55 +02:00
Carlos Eduardo Arango Gutierrez
7171cfd4eb
cpu: expose AMD SEV support
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
2023-03-30 15:19:43 +02:00
Chandan Abhyankar
d66096a491 cpu: support for detecting nx-gzip coprocessor feature
Nest accelerator gzip support for IBM Power systems.

Signed-off-by: Chandan Abhyankar <Chandan.Abhyankar@ibm.com>
2023-01-17 23:18:16 -08:00
Markus Lehtonen
b907d07d7e apis/nfd: flatten the structure of features data type
Flatten the data structure that stores features, dropping the "domain"
level from the data model. That extra level of hierarchy brought little
benefit but just caused some extra complexity, instead. The new
structure nicely matches what we have in the NodeFeatureRule object (the
matchFeatures field of uses the same flat structure with the "feature"
field having a value <domain>.<feature>, e.g. "kernel.version").

This is pre-work for introducing a new "node feature" CRD that contains
the raw feature data. It makes the life of both users and developers
easier when both CRDs, plus our internal code, handle feature data in a
similar flat structure.
2022-10-18 18:37:28 +03:00
Markus Lehtonen
0e1d4a9046 apis/nfd: migrate pkg/api/feature
Move the previously-protobuf-only internal "feature api" over to the
public "nfd api" package. This is in preparation for introducing a new
CRD API for communicating features.

This patch carries no functional changes. Just moving code around.
2022-10-15 07:42:20 +03:00
Markus Lehtonen
a00cdc2b61 pkg/utils: move hostpath helpers from source to utils
Refactor the code, moving the hostpath helper functionality to new
"pkg/utils/hostpath" package. This breaks odd-ish dependency
"pkg/utils" -> "source".
2022-10-06 14:28:24 +03:00
Markus Lehtonen
abdbd420d1 pkg/api/feature: rename types
Sync type names with NFD documentation. Aims at making the codebase
easier to follow.
2022-10-06 11:25:01 +03:00
Markus Lehtonen
12e859d50c Drop deprecated io/ioutil package
Makes golanci-lint happy.
2022-09-08 14:26:02 +03:00
Markus Lehtonen
f62b057bcd cpu: re-organize security features
Move existing security/trusted-execution related features (i.e. SGX and
SE) under the same "security" feature, deprecating the old features. The
motivation for the change is to keep the source code and user interface
more organized as we experience a constant inflow of similar security
related features. This change will affect the user interface so it is
less painful to do it early on.

New feature labels will be:

  feature.node.kubernetes.io/cpu-security.se.enabled
  feature.node.kubernetes.io/cpu-security.sgx.enabled

and correspondingly new "cpu.security" feature with "se.enabled" and
"sgx.enabled" elements will be available for custom rules, for example:

      - name: "sample sgx rule"
        labels:
          sgx.sample.feature: "true"
        matchFeatures:
          - feature: cpu.security
            matchExpressions:
              "sgx.enabled": {op: IsTrue}

At the same time deprecate old labels "cpu-sgx.enabled" and
"cpu-se.enabled" feature labels and the corresponding features for
custom rules. These will be removed in the future causing an effective
change in NFDs user interface.
2022-06-28 13:38:31 +03:00
Jakob Naucke
9e95dde38b
cpu: Discover IBM Secure Execution
Set `cpu.se-enabled` to `true` when IBM Secure Execution for Linux
(IBM Z & LinuxONE) is available and has been enabled.

Uses `/sys/firmware/uv/prot_virt_host`, which is available in kernels
>=5.12 + backports. For simplicity, skip more complicated facility &
kernel cmdline lookups.
2022-03-28 12:28:07 +02:00
Carlos Eduardo Arango Gutierrez
cb0a6fca53
Add cpu-model feature detection (#792)
* Add cpu-model feature detection

Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>

* Apply suggestions from code review

Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>

Co-authored-by: Markus Lehtonen <markus.lehtonen@intel.com>
2022-03-28 02:51:23 -07:00
Dipto Chakrabarty
19a57789ad
Additional Lint Fixes in Codebase (#779)
* fix comments and conditonals to fix lint issues

* more linter fixes and spelling fixes

* fix linter issues based on feedback
2022-03-02 17:12:46 -08:00
Dipto Chakrabarty
755294184c
Fix GoLinter Issues in the files (#711)
* fix linter issues for few files

* fix linter issue of exported const Name should have comment or be unexported

* fix name lint issue and resolve lints

* add changes to comments
2022-01-18 23:12:06 -08:00
Mikko Ylinen
8a39434659 source/cpu: detect Intel SGX
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-11-23 15:57:31 +02:00
Markus Lehtonen
03bf94a8ad source/cpu: implement FeatureSource
Convert the cpu source to do feature discovery and creation of feature
labels separately.

Move cpuidutils from source/internal to the source/cpu package. Change
the cpuid custom rule to utilize GetFeatures of the cpu source.

Also, add minimalist unit test.
2021-11-11 18:33:40 +02:00
Markus Lehtonen
852cf4b61d source: introduce FeatureSource interface
Specify a new interface for managing "raw" feature data. This is the
first step to separate raw feature data from node labels. None of the
feature sources implement this interface, yet.

This patch unifies the data format of "raw" features by dividing them
into three different basic types.
- keys, a set of names without any associated values, e.g. CPUID flags
  or loaded kernel modules
- values, a map of key-value pairs, for features with a single value,
  e.g. kernel config flags or os version
- instances, a list of instances each of which has multiple attributes
  (key-value pairs of their own), e.g. PCI or USB devices

The new feature data types are defined in a new "pkg/api/feature"
package, catering decoupling and re-usability of code e.g. within future
extentions of the NFD gRPC API.

Rename the Discover() method of LabelSource interface to GetLabels().
2021-09-20 09:58:07 +03:00
Markus Lehtonen
81378a3235 source: make sources register themselves
Implement new registration infrastructure under the "source" package.
This change loosens the coupling between label sources and the
nfd-worker, making it easier to refactor and move the code around.

Also, create a separate interface (ConfigurableSource) for configurable
feature sources in order to eliminate boilerplate code.

Add safety checks to the sources that they actually implement the
interfaces they should.

In sake of consistency and predictability (of behavior) change all
methods of the sources to use pointer receivers.

Add simple unit tests for the new functionality and include source/...
into make test target.
2021-09-15 18:41:37 +03:00
Markus Lehtonen
befa7e9796 source: rename FeatureSource to LabelSource
Prepare for separating feature detection from label creation.
2021-09-13 22:48:33 +03:00
Markus Lehtonen
43e0f83940 source/cpu: better error reporting
Drop confusing errors in the log when intel pstate or cstate driver is
not enabled in the system. However, we still log an error if sysfs is
not available at all, in which case we're not able to detect these
correctly.
2021-08-13 09:16:03 +03:00
Markus Lehtonen
31bd91988f cpuid: correct the name of SSE4* cpuid flags
The naming was changed in when with cpuid v2
(github.com/klauspost/cpuid/v2) and we didn't catch this in NFD. No
issue reports of the inadvertent naming change so let's just adapt to
the updated naming in NFD configuration. The SSE4* labels are disabled
by default so they're not widely used, if at all.
2021-07-06 11:54:55 +03:00
Markus Lehtonen
610b1c696c source: define source names as consts
Paves the way for future work on more general representation of
feature data and looser coupling of the data and feature source
interface.
2021-06-11 15:29:57 +03:00
Bob Fournier
a65f73e834 Support for additional cpu features
This adds additional cpu features:
- pstate status from status of intel_pstate driver
- pstate scaling settings from scaling_governor
- cstate enable from max_cstates in intel_idle driver
2021-03-05 13:15:49 -05:00
Markus Lehtonen
7da7fde8f6 nfd-worker: switch to klog
Greatly expands logging capabilities and flexibility with verbosity
options, among other things.
2021-02-25 16:10:43 +02:00
Mikko Ylinen
109da6c980 source/cpu: move cpuid code to a shared internal package
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2020-09-14 10:32:04 +03:00
Markus Lehtonen
a2b9df5cd3 nfd-worker: rework configuration handling
Extend the FeatureSource interface with new methods for configuration
handling. This enables easier on-the fly reconfiguration of the
feature sources. Further, it simplifies adding config support to feature
sources in the future. Stub methods are added to sources that do not
currently have any configurability.

The patch fixes some (corner) cases with the overrides (--options)
handling, too:
- Overrides were not applied if config file was missing or its parsing
  failed
- Overrides for a certain source did not have effect if an empty config
  for the source was specified in the config file. This was caused by
  the first pass of parsing (config file) setting a nil pointer to the
  source-specific config, effectively detaching it from the main config.
  The second pass would then create a new instance of the source
  specific config, but, this was not visible in the feature source, of
  course.
2020-05-21 00:59:37 +03:00
Markus Lehtonen
67d7887949 source: perform all sysfs discovery under host sysfs
Be consistent and do all sysfs based feature discovery under the same
sysfs directory.
2020-05-20 22:15:41 +03:00
Markus Lehtonen
674c9f71ed source/cpu: drop leftover debug print 2020-04-22 21:29:19 +03:00
Antti Kervinen
d3d13347f8 vendor: update klauspost/cpuid
Update cpuid from v1.2.2 to v1.2.3. Brings in SGX improvements and
CPUID leaf 7 feature detection (VBMI2, VPOPCNTDQ, GFNI, VAES,
AVX512BITALG, VPCLMULQDQ, AVX512BF16, AVX512VP2INTERSECT). Blacklist
cpuid-SGX* (issue #130).

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2020-01-29 14:51:44 +02:00
Markus Lehtonen
882bbeea3f source/cpu: support 'false' status of cpu-pstate.turbo
Some workloads may benefit from Intel Turbo Boost technology being
disabled. This patch sets the
'feature.node.kubernetes.io/cpu-pstate.turbo' label to 'false' if we can
detect that it has been disabled. If detection fails no label is
published.
2019-08-29 16:18:12 +03:00
Markus Lehtonen
7c5f7d600e source/cpu: make cpuid configurable
Add 'cpuid/attributeBlacklist' and 'cpuid/attributeWhitelist' config
options for the cpu feature source. These can be used to filter the set
of cpuid capabilities that get published. The intention is to reduce
clutter in the NFD label space, getting rid of "obvious" or misleading
cpuid labels. Whitelisting has higher priority, i.e.  only whitelist
takes effect if both attributeWhitelist and attributeBlacklist are
specified.
2019-05-13 17:17:02 +03:00
Markus Lehtonen
655f5c5555 sources: move all cpu related features under the cpu source
Remove 'cpuid', 'pstate' and 'rdt' feature sources and move their
functionality under the 'cpu' source. The goal is to have a more
systematic organization of feature sources and labels. After this change
we now basically have one source per type of hw, one for kernel and one
for userspace sw.

Related feature labels are changed, correspondingly, new labels being:
    feature.node.k8s.io/cpu-cpuid.<cpuid flag>
    feature.node.k8s.io/cpu-pstate.turbo
    feature.node.k8s.io/cpu-rdt.<rdt feature>
2019-05-09 20:18:36 +03:00
Markus Lehtonen
ad17e5088b source/cpu: detect SST-BF
Detect of the Intel SST-BF (Speed Select Technology - Base Frequency)
has been enabled.

Adds one new feature label:
  feature.node.kubernetes.io/cpu-power.sst_bf.enabled=true

Based on a patch from kuralamudhan.ramakrishnan@intel.com
2019-04-12 15:11:55 +03:00
Markus Lehtonen
da2cb07c64 Implement cpu feature source
Currently, it only detects one feature, i.e. hardware multithreading
(such as Intel hyper-threading technology). The corresponding feature
label is:
  feature.node.kubernetes.io/cpu-hardware_multithreading=true

However, this (architecture/platform dependent) feature is not detected
directly, and, the heuristics can be mislead. Detection works by
checking the thread siblings of each logical (and online) cpu in the
system. If any cpu has any thread siblings the feature label is set to
true. Thus, hardware multithreading could be effectively disabled e.g.
by putting all sibling cpus offline (even if the technology would be
enabled in hardware).
2018-12-07 16:58:09 +02:00