giorio94
Member

@giorio94 giorio94 commented Mar 21, 2025

Extend the connectivity perf command with an option to automatically capture kernel profiles while each performance test runs, for later inspection. The resulting profiles are saved into the result directory and can be converted into an interactive SVG flame graph as follows (see [1] for more info):

  stackcollapse-perf.pl $file | flamegraph.pl > out.svg
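
When a run produces several profiles, the same pipeline can be applied to each file in turn. The following is only a sketch: the `*.perf` file pattern and the `convert_profiles` helper are hypothetical (the actual file names written by the CLI are not stated here), and the FlameGraph scripts are assumed to be on `PATH`. The `STACKCOLLAPSE` and `FLAMEGRAPH` variables are likewise illustrative overrides, not part of any tool.

```shell
#!/bin/sh
# Convert every captured profile in a directory into a flame graph SVG.
# Usage: convert_profiles <result-dir> [glob-pattern]
convert_profiles() {
    dir=$1
    # Hypothetical naming: adjust the pattern to the files actually produced.
    pattern=${2:-*.perf}

    for file in "$dir"/$pattern; do
        # Skip the literal pattern when nothing matches.
        [ -f "$file" ] || continue
        out="$file.svg"
        # Same pipeline as above, with overridable script locations.
        "${STACKCOLLAPSE:-stackcollapse-perf.pl}" "$file" \
            | "${FLAMEGRAPH:-flamegraph.pl}" > "$out"
        echo "wrote $out"
    done
}
```

For example, `convert_profiles ./results` would write one `.svg` next to each matching profile.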

Note that kernel profile capture should be enabled only when running against disposable nodes dedicated to this purpose. The capture uses privileged pods to escape into the underlying host, install the necessary additional software, and finally run the appropriate perf commands. perf needs to be installed on the host, rather than in a container, because its version is specific to the underlying kernel.
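
To illustrate the host-side dependency, here is a minimal pre-flight sketch that reports the running kernel release and whether a perf binary is reachable on the host. The `have_tool` and `check_perf_host` helpers are hypothetical and are not what the CLI runs; they only show the kind of check one could do manually on a disposable node.

```shell
#!/bin/sh
# Return success if the named tool is available on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Print the kernel release and the location of perf, if any.
# Returns non-zero when perf is missing.
check_perf_host() {
    echo "kernel: $(uname -r)"
    if have_tool perf; then
        echo "perf: $(command -v perf)"
    else
        echo "perf: not found"
        return 1
    fi
}
```

Running `check_perf_host` directly on the node shows which kernel release a perf installation would need to match.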

Additionally, enable this new feature in the net-perf-gke and scale-egw workflows.

Currently builds on top of #38245

[1]: https://github.com/brendangregg/FlameGraph

@giorio94 giorio94 added release-note/minor This PR changes functionality that users may find relevant to operating Cilium. cilium-cli This PR contains changes related with cilium-cli labels Mar 21, 2025
@giorio94
Member Author

/scale-egw

@giorio94
Member Author

/net-perf-gke

@giorio94 giorio94 force-pushed the pr/giorio94/main/cli-connectivity-perf-kernel-profiles branch from 5d7b2ee to 19130f2 Compare March 21, 2025 18:04
@giorio94
Member Author

/test

@giorio94 giorio94 marked this pull request as ready for review March 24, 2025 07:50
@giorio94 giorio94 requested review from a team as code owners March 24, 2025 07:50
Contributor

@marseel marseel left a comment

Looks great, thanks!

Member

@pchaigno pchaigno left a comment

Nice! ❤️

Tested on an EKS cluster with various Cilium configurations.

@pchaigno pchaigno removed the request for review from viktor-kurchenko March 24, 2025 11:26
@giorio94 giorio94 enabled auto-merge March 24, 2025 13:02
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 26, 2025
Automatically create the report directory (when specified) if it does
not already exist, to ensure that the results can be stored.
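
The create-if-missing behaviour can be illustrated in shell as below. This is only an analogy under stated assumptions: the CLI is a Go program and its actual implementation is not shown here, and `ensure_report_dir` is a hypothetical helper name.

```shell
#!/bin/sh
# Create the report directory, including parents, unless it already exists.
# An empty argument means no report directory was specified: do nothing.
ensure_report_dir() {
    dir=$1
    [ -n "$dir" ] || return 0
    mkdir -p -- "$dir"
}
```

Calling `ensure_report_dir ./results/run-1` twice is harmless, since `mkdir -p` succeeds when the directory is already there.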

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
The port configured there is never accessed, and it doesn't seem to
match the one used by netperf. Let's just remove it to prevent
confusion.

Related: 24caafb ("cli/perf: fix command and port of server deployment")
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
In preparation for the subsequent introduction of new types of pods,
let's slightly rework the categorization by introducing an explicit
role label.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Extend the connectivity perf command with an option to
automatically capture kernel profiles while each performance
test runs, for later inspection. The resulting profiles are
saved into the result directory and can be converted into an
interactive SVG flame graph as follows (see [1] for more info):

  stackcollapse-perf.pl $file | flamegraph.pl > out.svg

Note that kernel profile capture should be enabled only when
running against disposable nodes dedicated to this purpose. The
capture uses privileged pods to escape into the underlying host,
install the necessary additional software and finally run the
appropriate `perf` commands. `perf` needs to be installed on the
host, rather than in a container, because its version is specific
to the underlying kernel.

[1]: https://github.com/brendangregg/FlameGraph

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Enable the new Cilium CLI option to automatically capture kernel
profiles during performance tests in the net-perf-gke workflow.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Enable the new Cilium CLI option to automatically capture kernel
profiles during performance tests in the scale-egw workflow.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
@giorio94 giorio94 force-pushed the pr/giorio94/main/cli-connectivity-perf-kernel-profiles branch from 19130f2 to 3f3e85e Compare March 27, 2025 09:23
@giorio94 giorio94 removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 27, 2025
@giorio94
Member Author

/test

@giorio94 giorio94 added this pull request to the merge queue Mar 27, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 27, 2025
Merged via the queue into main with commit aeff822 Mar 27, 2025
228 of 230 checks passed
@giorio94 giorio94 deleted the pr/giorio94/main/cli-connectivity-perf-kernel-profiles branch March 27, 2025 12:23
4 participants