Description
Is there an existing issue for this?
- I have searched the existing issues
What happened?
We are currently running Cilium on EKS in chaining mode with aws-cni. We are utilizing clustermesh across 6 clusters, and the configuration and installation are managed by FluxCD. Our application has an auth service that lives in one of the six clusters and a proxy service that lives in all clusters. The proxies connect to each other in a mesh and also initiate a connection to the auth service. The proxies rely on a global service to connect to auth; they dial each other directly via a different discovery mechanism.
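For context, the auth service is exposed across the mesh via Cilium's global service annotation (illustrative only; the service name `auth` and namespace `app` are placeholders for our setup):

```sh
# Cilium treats a service as global across clustermesh once it carries this
# annotation in each cluster. Name and namespace below are placeholders.
kubectl -n app annotate service auth service.cilium.io/global="true"
```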
Single-cluster upgrades don't appear to be an issue, but if multiple clusters upgrade close enough together we observe connection issues between proxies in those clusters. Specifically, we are looking at the metric `hubble_drop_total{reason="Policy denied"}` as well as a metric on our application that indicates the number of other proxies a proxy is connected to. Looking at the logs of the cilium agent on the nodes where affected proxies were running, it isn't uncommon to see a connection to etcd, a disconnect, and a reconnect. It is fairly reproducible by running `kubectl rollout restart deployment clustermesh-apiserver` in cluster1 and `kubectl rollout restart ds cilium` in cluster2 (sketched below).
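A rough reproduction sketch (cluster contexts and the kube-system namespace are assumptions from a default Helm install; adjust to your environment):

```sh
# Restart the clustermesh-apiserver in one cluster while the cilium agents
# in another cluster are also restarting.
kubectl --context cluster1 -n kube-system rollout restart deployment clustermesh-apiserver
kubectl --context cluster2 -n kube-system rollout restart ds cilium

# Watch for the policy-denied drops between proxies, e.g. with the Hubble CLI:
hubble observe --verdict DROPPED --follow

# And check clustermesh connectivity from an agent in the second cluster:
kubectl --context cluster2 -n kube-system exec ds/cilium -- cilium status --verbose
```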
I considered splitting the cilium install and the clustermesh install into separate helm installs, as that would give us control over when each restarts so they don't roll at the same time. Unfortunately, the `cilium-ca` secret is expected to be created every time and would cause a conflict (there may be others, but this was the first roadblock I hit; this could potentially be worked around by allowing users to supply the secret themselves).
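What I tried looked roughly like this (a sketch only; verify the value names against `helm show values cilium/cilium --version 1.12.3` before relying on them):

```sh
# Sketch: split the agent/operator and the clustermesh-apiserver into two
# Helm releases so they roll independently.
helm upgrade --install cilium cilium/cilium -n kube-system \
  --set clustermesh.useAPIServer=false

helm upgrade --install clustermesh cilium/cilium -n kube-system \
  --set agent=false \
  --set operator.enabled=false \
  --set clustermesh.useAPIServer=true
# This naive split is where it breaks for us: both releases try to
# create/own the cilium-ca secret, which is the conflict described above.
```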
We are hoping to better understand the following:
- The upgrade guide states a "Minimal to None" effect on L3/L4 during upgrades; are we hitting that, or something else?
- Is it expected that Cilium upgrades in clustermesh happen one cluster at a time? Our FluxCD reconciles every 10 min and currently has no coordination between clusters, but we are looking at options for this.
- Would we be better off configuring our cilium install to be backed by an external etcd? That seems like it would remove the possibility of etcd connections being broken on cilium startup, since there wouldn't be a `clustermesh-apiserver` pod being recreated (a rough sketch follows this list). Is there documentation for this by chance?
- Possibly separating the `etcd` container (and giving it a PV) from the `apiserver` container so that clustermesh isn't rolled at the same time? Any gotchas with this approach?
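To illustrate the external etcd option (a sketch only; `etcd.enabled`, `etcd.endpoints`, and `etcd.ssl` are existing chart values, but the endpoint and TLS wiring here are assumptions for our environment):

```sh
# Sketch: back the agents with an externally managed etcd instead of the
# etcd bundled into clustermesh-apiserver, so agent kvstore connections
# don't break when the apiserver pod is recreated. Endpoint is a placeholder.
helm upgrade --install cilium cilium/cilium -n kube-system \
  --set etcd.enabled=true \
  --set "etcd.endpoints[0]=https://etcd.example.internal:2379" \
  --set etcd.ssl=true
```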
Thanks for all the work on this awesome CNI!
Cilium Version
1.12.3
Kernel Version
Linux ip-10-16-112-106.us-west-2.compute.internal 5.4.242-155.348.amzn2.x86_64 #1 SMP Mon May 8 12:52:40 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
Server Version: v1.23.17-eks-0a21954
Sysdump
Sysdump file was too large to upload
Relevant log output
level=info msg="Established connection to remote etcd" clusterName=redact config=/var/lib/cilium/clustermesh/redact kvstoreErr="<nil>" kvstoreStatus="etcd: 1/1 connected, lease-ID=0, lock lease-ID=0, has-quorum=timeout while waiting for initial connection, consecutive-errors=1: https://redact:2379 - 3.5.4 (Leader)" subsys=clustermesh
Anything else?
No response
Code of Conduct
- I agree to follow this project's Code of Conduct