Member

@giorio94 giorio94 commented May 31, 2024

The clustermesh-apiserver's etcd sidecar instance is stateless by design: etcd data is stored in an emptyDir Kubernetes volume and is not preserved across restarts. Even so, let's expose the emptyDir medium setting to users, so that the volume can be backed by RAM rather than node storage. This greatly improves etcd read and write performance at the cost of additional memory usage, which counts against the memory limit of the container. Additional information is available in the upstream documentation [1].

[1]: https://kubernetes.io/docs/concepts/storage/volumes/#emptydir
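
For illustration, a RAM-backed emptyDir volume is declared in a pod spec roughly as sketched below. This is a minimal fragment based on the upstream Kubernetes emptyDir documentation, not the chart template changed by this PR; the volume name is hypothetical, and the Helm value that controls the medium is not shown in this conversation.

```yaml
# Pod spec fragment (sketch). The volume name is illustrative only.
volumes:
  - name: etcd-data-dir   # hypothetical name
    emptyDir:
      # "Memory" mounts a tmpfs (RAM-backed) filesystem; files written to it
      # count against the memory limit of the container that writes them.
      # Leaving the field unset uses the node's default storage medium.
      medium: Memory
```

Because the data lives in RAM, the memory limit of the etcd container needs enough headroom for the etcd data on top of the process's own working memory.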

Allow configuring RAM-backed clustermesh-apiserver's etcd storage for improved performance in high-scale/high-churn environments 

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
@giorio94 giorio94 added the release-note/minor, area/clustermesh and area/helm labels May 31, 2024
@giorio94 giorio94 requested review from a team as code owners May 31, 2024 15:37
@giorio94 giorio94 requested review from youngnick, squeed and thorn3r May 31, 2024 15:37
@giorio94
Member Author

/test

Contributor

@thorn3r thorn3r left a comment

I was wondering if we should also add sizeLimit for the volume, but since etcd is critical to cluster operation I can't see a reason to limit it to protect any other workload.

@giorio94
Member Author

giorio94 commented Jun 3, 2024

> I was wondering if we should also add sizeLimit for the volume, but since etcd is critical to cluster operation I can't see a reason to limit it to protect any other workload.

Yep, I don't see a reason either, as etcd by itself already limits the maximum storage size (2GB by default); introducing external limits would most likely lead to very confusing behavior.
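
For reference, the sizeLimit option discussed above is a field of the Kubernetes emptyDir volume source. The fragment below is only a sketch of what such a limit would look like; the volume name and the limit value are hypothetical, and this PR deliberately does not add it.

```yaml
# Pod spec fragment (sketch); NOT part of this PR.
volumes:
  - name: etcd-data-dir   # hypothetical name
    emptyDir:
      medium: Memory
      # Hypothetical cap on the volume size. It would apply independently of
      # etcd's own storage quota, which is the overlap the reply above warns
      # could lead to confusing behavior.
      sizeLimit: 2Gi
```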

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge label Jun 4, 2024
@julianwiedmann julianwiedmann added this pull request to the merge queue Jun 4, 2024
Merged via the queue into cilium:main with commit a64ec19 Jun 4, 2024