Skip to content

Conversation

giorio94
Copy link
Member

Reduce the number of allocations (and consequently CPU time) performed by the correlateEndpoints function removing a duplicated DeepCopy of all endpoints. Indeed, GetEndpoints already returns a DeepCopy, and we can subsequently mutate the endpoint information in-place, without needing to copy them again. Additionally, let's rework a bit the logic
to avoid unnecessarily creating a second map and copy all entries there.

Considering BenchmarkCorrelateEndpoints which calls correlateEndpoints for a service associated with 10 epslices, each with 100 endpoints (2 ports each), these changes lead to approximately a 50% CPU and memory allocations reduction:

goos: linux
goarch: amd64
pkg: github.com/cilium/cilium/pkg/k8s
cpu: 13th Gen Intel(R) Core(TM) i7-13800H
                      │   old.txt    │               new.txt               │
                      │    sec/op    │   sec/op     vs base                │
CorrelateEndpoints-20   1140.8µ ± 4%   605.0µ ± 8%  -46.96% (p=0.000 n=10)

                      │    old.txt    │               new.txt                │
                      │     B/op      │     B/op      vs base                │
CorrelateEndpoints-20   1177.9Ki ± 0%   589.0Ki ± 0%  -50.00% (p=0.000 n=10)

                      │   old.txt    │               new.txt               │
                      │  allocs/op   │  allocs/op   vs base                │
CorrelateEndpoints-20   10.062k ± 0%   5.031k ± 0%  -50.00% (p=0.000 n=10)

In preparation for the subsequent changes, let's introduce a dedicated
benchmark for correlateEndpoints, so that we can measure the improvements.
Specifically, the benchmark initializes a service associated with 10
epslices, each including 100 endpoints (2 ports each), and then repeatedly
calls correlateEndpoints measuring CPU time and memory allocations.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Reduce the number of allocations (and consequently CPU time) performed
by the correlateEndpoints function removing a duplicated DeepCopy of
all endpoints. Indeed, GetEndpoints already returns a DeepCopy, and we
can subsequently mutate the endpoint information in-place, without
needing to copy them again. Additionally, let's rework a bit the logic
to avoid unnecessarily creating a second map and copy all entries there.

Considering BenchmarkCorrelateEndpoints which calls correlateEndpoints
for a service associated with 10 epslices, each with 100 endpoints (2
ports each), these changes lead to approximately a 50% CPU and memory
allocations reduction:

goos: linux
goarch: amd64
pkg: github.com/cilium/cilium/pkg/k8s
cpu: 13th Gen Intel(R) Core(TM) i7-13800H
                      │   old.txt    │               new.txt               │
                      │    sec/op    │   sec/op     vs base                │
CorrelateEndpoints-20   1140.8µ ± 4%   605.0µ ± 8%  -46.96% (p=0.000 n=10)

                      │    old.txt    │               new.txt                │
                      │     B/op      │     B/op      vs base                │
CorrelateEndpoints-20   1177.9Ki ± 0%   589.0Ki ± 0%  -50.00% (p=0.000 n=10)

                      │   old.txt    │               new.txt               │
                      │  allocs/op   │  allocs/op   vs base                │
CorrelateEndpoints-20   10.062k ± 0%   5.031k ± 0%  -50.00% (p=0.000 n=10)

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
@giorio94 giorio94 added area/daemon Impacts operation of the Cilium daemon. kind/performance There is a performance impact of this. release-note/misc This PR makes changes that have no direct user impact. area/loadbalancing Impacts load-balancing and Kubernetes service implementations labels Oct 29, 2024
@giorio94 giorio94 changed the title Mio/correlate endpoints Improve the performance of endpoints correlation in service cache Oct 29, 2024
@giorio94
Copy link
Member Author

/test

@giorio94 giorio94 marked this pull request as ready for review October 29, 2024 09:14
@giorio94 giorio94 requested a review from a team as a code owner October 29, 2024 09:14
@giorio94 giorio94 requested a review from tommyp1ckles October 29, 2024 09:14
Copy link
Contributor

@tommyp1ckles tommyp1ckles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice 🚀

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 30, 2024
@julianwiedmann julianwiedmann added this pull request to the merge queue Oct 30, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Oct 30, 2024
@giorio94
Copy link
Member Author

Apparently kicked out of the merge queue due to the ongoing GitHub Actions incident: https://www.githubstatus.com/

@julianwiedmann julianwiedmann added this pull request to the merge queue Oct 30, 2024
Merged via the queue into cilium:main with commit 3b59c26 Oct 30, 2024
70 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/daemon Impacts operation of the Cilium daemon. area/loadbalancing Impacts load-balancing and Kubernetes service implementations kind/performance There is a performance impact of this. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants