Description
Is there an existing issue for this?
- [x] I have searched the existing issues
Version
Equal to or higher than v1.16.0 and lower than v1.17.0
What happened?
Starting from v1.16, each Cilium agent starts two informers for both Services and EndpointSlices (one filtered and one unfiltered), introducing extra load on the Kubernetes API server. The main commits that appear to contribute to the regression are 99268e7, cce4080 and 0e8c5a6: a few consumers leverage the filtered resources, while others rely on the unfiltered ones.
The best fix is probably to drop the filtered resource altogether (assuming that is possible, adapting the consumers accordingly), to prevent future regressions caused by mixed usage of the two resources.
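For context, here is a minimal client-go sketch of the mechanism (not Cilium's actual code): two SharedInformerFactory instances, one unfiltered and one with tweaked list options, each open their own WATCH on the same resource, which is how a filtered and an unfiltered resource per agent double the long-running requests. The label selector below is purely hypothetical.

```go
// Sketch: two informer factories on the same resource open two WATCHes.
package main

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// Unfiltered factory: its Services informer opens one WATCH.
	plain := informers.NewSharedInformerFactory(client, 30*time.Minute)
	_ = plain.Core().V1().Services().Informer()

	// Filtered factory: a separate cache, so its Services informer opens a
	// second, independent WATCH instead of sharing the first one.
	filtered := informers.NewSharedInformerFactoryWithOptions(client, 30*time.Minute,
		informers.WithTweakListOptions(func(o *metav1.ListOptions) {
			o.LabelSelector = "example.io/managed=true" // hypothetical selector
		}))
	_ = filtered.Core().V1().Services().Informer()

	stop := make(chan struct{})
	plain.Start(stop)
	filtered.Start(stop)

	// apiserver_longrunning_requests now counts two WATCHes on services
	// from this single process.
	select {}
}
```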
How can we reproduce the issue?
This issue can be validated by checking the apiserver_longrunning_requests
Kubernetes API server metric. For instance, on a four-node kind cluster:
# Cilium agents are not running
$ kubectl get --raw /metrics | grep apiserver_longrunning_requests | egrep '"services|endpointslices"'
apiserver_longrunning_requests{component="apiserver",group="",resource="services",scope="cluster",subresource="",verb="WATCH",version="v1"} 13
apiserver_longrunning_requests{component="apiserver",group="discovery.k8s.io",resource="endpointslices",scope="cluster",subresource="",verb="WATCH",version="v1"} 7
# Cilium agents are running
$ kubectl get --raw /metrics | grep apiserver_longrunning_requests | egrep '"services|endpointslices"'
apiserver_longrunning_requests{component="apiserver",group="",resource="services",scope="cluster",subresource="",verb="WATCH",version="v1"} 21
apiserver_longrunning_requests{component="apiserver",group="discovery.k8s.io",resource="endpointslices",scope="cluster",subresource="",verb="WATCH",version="v1"} 15
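The deltas match the duplicated informers: both services and endpointslices gain 8 WATCH connections (21 - 13 and 15 - 7 respectively), i.e. 2 per agent on the four-node cluster, whereas a single shared informer per resource would account for only 4.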
Cilium Version
Cilium v1.16 and above
Code of Conduct
- [x] I agree to follow this project's Code of Conduct