Skip to content

Port bpftool probing and usage metrics collection to Go #39557

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 20, 2025

Conversation

ti-mo
Copy link
Contributor

@ti-mo ti-mo commented May 15, 2025

This commit converts the last bpftool shellout of the Cilium agent to a
Go-based equivalent. See doc comments for more details. The new implementation
makes collecting bpf metrics an order of magnitude faster, especially on
smaller deployments where the startup cost of two bpftool invocations doesn't
amortize well.

`bpftool {map,prog} show` does a lot of work behind the scenes: it iterates all
objects, reads and parses associated btf, gets object info, fetches some
associated data (pid comm) and then pushes everything to stdout. After all this,
the agent needs to read and decode the output.

The new implementation takes a different approach: it iterates all programs
and considers only the ones with a name starting with cil_ or tail_ (all prog
names will be made consistent in a follow-up commit) and recursively opens
associated maps, taking care not to visit the same object more than once during
a single collection round.

An initial attempt at discovering tail calls by iterating associated prog
arrays proved much too slow and unscalable, so this idea was shelved.

Fixes: #38979

Collect bpf-related metrics using native Go code instead of bpftool. Users should see a significant speedup in /metrics scraping as well as a significant decrease in CPU usage related to bpf metrics collection on both large and small Cilium deployments.

@ti-mo ti-mo requested review from a team as code owners May 15, 2025 13:49
@ti-mo ti-mo added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label May 15, 2025
@ti-mo
Copy link
Contributor Author

ti-mo commented May 15, 2025

/test

ti-mo added a commit to ti-mo/cilium that referenced this pull request May 15, 2025
Since 7b92700 and e9438c2, egress policy programs are inserted
unconditionally, even when the L7 proxy is disabled. Unfortunately, the
teardown logic only runs when the proxy is enabled, causing all
handle_policy_egress programs to leak.

Double unfortunately, since this program doesn't reference any Cilium maps in
its attached-but-disabled state (it's an empty program), it doesn't show up in
Cilium's maps metrics, making the issue hard to spot. The latter is being
addressed in cilium#39557.

This commit makes the removal unconditional, since the attachment is
also unconditional. I've decided against making both attachment and detachment
conditional, since that would still cause orphaned programs if the l7 proxy
was disabled on a running cluster. Endpoint teardown should always attempt to
remove all known policy programs for robustness.

Signed-off-by: Timo Beckers <timo@isovalent.com>
@ti-mo ti-mo force-pushed the tb/remove-bpftool-metrics branch from 9965054 to bb2b1a2 Compare May 15, 2025 15:35
@ti-mo
Copy link
Contributor Author

ti-mo commented May 15, 2025

/test

ti-mo added 3 commits May 19, 2025 14:50
A follow-up commit needs a new API for accessing program memlock info.

Signed-off-by: Timo Beckers <timo@isovalent.com>
This commit converts the last bpftool shellout of the Cilium agent to a
Go-based equivalent. See doc comments for more details. The new implementation
makes collecting bpf metrics an order of magnitude faster, especially on
smaller deployments where the startup cost of two bpftool invocations doesn't
amortize well.

`bpftool {map,prog} show` does a lot of work behind the scenes: it iterates all
objects, reads and parses associated btf, gets object info, fetches some
associated data (pid comm) and then pushes everything to stdout. After all this,
the agent needs to read and decode the output.

The new implementation takes a different approach: it iterates all programs
and considers only the ones with a name starting with cil_ or tail_ (all prog
names will be made consistent in a follow-up commit) and recursively opens
associated maps, taking care not to visit the same object more than once during
a single collection round.

An initial attempt at discovering tail calls by iterating associated prog
arrays proved much too slow and unscalable, so this idea was shelved.

Signed-off-by: Timo Beckers <timo@isovalent.com>
Due to a prior commit, all Cilium programs now need to either have the cil_ or
tail_ prefixes for entry points and tail calls respectively. This enables
straightforward metrics collection.

Signed-off-by: Timo Beckers <timo@isovalent.com>
@ti-mo ti-mo force-pushed the tb/remove-bpftool-metrics branch from bb2b1a2 to 0482b98 Compare May 19, 2025 12:52
@ti-mo
Copy link
Contributor Author

ti-mo commented May 19, 2025

/test

Copy link
Contributor

@derailed derailed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ti-mo Nice work!

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 20, 2025
github-merge-queue bot pushed a commit that referenced this pull request May 20, 2025
Since 7b92700 and e9438c2, egress policy programs are inserted
unconditionally, even when the L7 proxy is disabled. Unfortunately, the
teardown logic only runs when the proxy is enabled, causing all
handle_policy_egress programs to leak.

Double unfortunately, since this program doesn't reference any Cilium maps in
its attached-but-disabled state (it's an empty program), it doesn't show up in
Cilium's maps metrics, making the issue hard to spot. The latter is being
addressed in #39557.

This commit makes the removal unconditional, since the attachment is
also unconditional. I've decided against making both attachment and detachment
conditional, since that would still cause orphaned programs if the l7 proxy
was disabled on a running cluster. Endpoint teardown should always attempt to
remove all known policy programs for robustness.

Signed-off-by: Timo Beckers <timo@isovalent.com>
@ti-mo ti-mo added this pull request to the merge queue May 20, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 20, 2025
@ti-mo ti-mo added this pull request to the merge queue May 20, 2025
Merged via the queue into cilium:main with commit 91a9a61 May 20, 2025
67 checks passed
@ti-mo ti-mo deleted the tb/remove-bpftool-metrics branch May 20, 2025 23:02
joamaki pushed a commit that referenced this pull request May 23, 2025
[ upstream commit 2d8769d ]
[ backporter's note: fixed minor conflict due to missing slog changes ]

Since 7b92700 and e9438c2, egress policy programs are inserted
unconditionally, even when the L7 proxy is disabled. Unfortunately, the
teardown logic only runs when the proxy is enabled, causing all
handle_policy_egress programs to leak.

Double unfortunately, since this program doesn't reference any Cilium maps in
its attached-but-disabled state (it's an empty program), it doesn't show up in
Cilium's maps metrics, making the issue hard to spot. The latter is being
addressed in #39557.

This commit makes the removal unconditional, since the attachment is
also unconditional. I've decided against making both attachment and detachment
conditional, since that would still cause orphaned programs if the l7 proxy
was disabled on a running cluster. Endpoint teardown should always attempt to
remove all known policy programs for robustness.

Signed-off-by: Timo Beckers <timo@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
github-merge-queue bot pushed a commit that referenced this pull request May 27, 2025
[ upstream commit 2d8769d ]
[ backporter's note: fixed minor conflict due to missing slog changes ]

Since 7b92700 and e9438c2, egress policy programs are inserted
unconditionally, even when the L7 proxy is disabled. Unfortunately, the
teardown logic only runs when the proxy is enabled, causing all
handle_policy_egress programs to leak.

Double unfortunately, since this program doesn't reference any Cilium maps in
its attached-but-disabled state (it's an empty program), it doesn't show up in
Cilium's maps metrics, making the issue hard to spot. The latter is being
addressed in #39557.

This commit makes the removal unconditional, since the attachment is
also unconditional. I've decided against making both attachment and detachment
conditional, since that would still cause orphaned programs if the l7 proxy
was disabled on a running cluster. Endpoint teardown should always attempt to
remove all known policy programs for robustness.

Signed-off-by: Timo Beckers <timo@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Port bpftool probing and usage metrics collection to Go
5 participants