mirror of https://github.com/kubernetes-sigs/node-feature-discovery.git synced 2024-12-14 11:57:51 +00:00
node-feature-discovery/pkg/nfd-master
Markus Lehtonen a2068f7ce3 nfd-master: tweak list options for NodeFeature informer
Fix cache syncing problems on big clusters with thousands of NodeFeature
objects.

On the initial list (sync) the client-go cache reflector sets the
ResourceVersion to "0" (instead of leaving it empty). This causes
problems in the API server, which logs errors like:

E writers.go:122] apiserver was unable to write a JSON response: http:
                  Handler timeout
E status.go:71] apiserver received an error that is not an
                metav1.Status: &errors.errorString{s:"http: Handler timeout"}:
                http: Handler timeout

On the nfd-master side we see corresponding log snippets like:

W reflector.go:547] failed to list *v1alpha1.NodeFeature: stream error
                    when reading response body, may be caused by closed
                    connection. Please retry. Original error: stream
                    error: stream ID 1521; INTERNAL_ERROR; received from
                    peer
I trace.go:236] "Reflector ListAndWatch" name:*** (***) (total time:
                61126ms): ---"Objects listed" error:stream error when
                reading response body, may be caused by closed
                connection. Please retry. Original error: stream
                error: stream ID 1521; INTERNAL_ERROR; received from
                peer 61126ms (***)

Decreasing the page size (opts.Limit) does not have any effect on the
timeouts. However, setting ResourceVersion to an empty value seems to
get the paging back on track, eliminating the timeouts.
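
The workaround described above can be sketched roughly as follows. This
is a minimal, stdlib-only illustration and not the code from this commit:
the ListOptions struct is a hypothetical stand-in for the two relevant
fields of metav1.ListOptions, and tweakListOptions is an illustrative
name. In a real controller a function of this shape would typically be
handed to the informer factory as its list-options tweak.

```go
package main

import "fmt"

// ListOptions is a hypothetical stand-in for the subset of
// metav1.ListOptions that matters here (assumption: only these two
// fields are relevant to the workaround).
type ListOptions struct {
	ResourceVersion string
	Limit           int64
}

// tweakListOptions clears ResourceVersion when it is "0", which is the
// value the reflector uses on the initial list. Any other value is left
// untouched, and the page size (Limit) is never modified.
func tweakListOptions(opts *ListOptions) {
	if opts.ResourceVersion == "0" {
		opts.ResourceVersion = ""
	}
}

func main() {
	// Initial sync: ResourceVersion "0" is replaced with an empty value.
	// The Limit of 500 is an arbitrary example value.
	initial := ListOptions{ResourceVersion: "0", Limit: 500}
	tweakListOptions(&initial)
	fmt.Printf("initial list: ResourceVersion=%q Limit=%d\n",
		initial.ResourceVersion, initial.Limit)

	// A list at a specific resource version passes through unchanged.
	later := ListOptions{ResourceVersion: "12345"}
	tweakListOptions(&later)
	fmt.Printf("later list: ResourceVersion=%q\n", later.ResourceVersion)
}
```

Only the exact value "0" is rewritten, so lists and watches that resume
from a specific resource version should behave as before.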

TODO: investigate the root cause of the timeouts with
ResourceVersion="0" in upstream Kubernetes.
2024-07-25 16:29:05 +03:00
metrics.go Add NodeFeatureGroup CRD 2024-05-23 16:34:08 +02:00
nfd-api-controller.go nfd-master: tweak list options for NodeFeature informer 2024-07-25 16:29:05 +03:00
nfd-api-controller_test.go Move NFD api to a separate go mod 2024-04-05 16:35:47 +02:00
nfd-master-internal_test.go Add NodeFeatureGroup CRD 2024-05-23 16:34:08 +02:00
nfd-master.go Drop the -enable-nodefeature-api flag 2024-07-10 15:20:07 +03:00
nfd-master_test.go nfd-master: parse kubeconfig even with NoPublish set 2024-04-08 14:25:27 +03:00
updater-pool.go Add NodeFeatureGroup CRD 2024-05-23 16:34:08 +02:00
updater-pool_test.go Add NodeFeatureGroup CRD 2024-05-23 16:34:08 +02:00