Skip to content

Conversation

pippolo84
Copy link
Member

gops and pprof do not use neither the same protocol nor the same port to collect profile data.

When cilium-bugtool queries the gops server embedded in the agent, it collects the data calling the gops cli from the shell, without passing any extra port argument: https://github.com/cilium/cilium/blob/master/bugtool/cmd/root.go#L499-L506

Instead, when querying the pprof debug endpoints, cilium-bugtool should use the proper agent default pprof port: https://github.com/cilium/cilium/blob/master/bugtool/cmd/root.go#L478-L497

This PR changes the default port for pprof endpoints in cilium-bugtool to be the same default port set in the agent, even if pprof support is disabled by default in the agent.

Besides, clustermesh-apiserver does not support pprof yet, but only gops. Thus, the help message for the pprof port option in cilium-bugtool is fixed accordingly.

Fixes: #17004

@pippolo84 pippolo84 added release-note/bug This PR fixes an issue in a previous release of Cilium. area/bugtool Impacts gathering of data for debugging purposes. needs-backport/1.11 labels Sep 29, 2022
@pippolo84 pippolo84 requested review from a team as code owners September 29, 2022 09:43
@pippolo84
Copy link
Member Author

Forgot to update command reference, I'll fix it.

gops and pprof do not use the same protocol to collect profile data.
Thus, the default port for pprof debug endpoints in `cilium-bugtool`
should not be the one used for gops, but the default one for pprof
itself.

Besides, clustermesh-apiserver does not support pprof yet, but only
gops. Thus, the help message for the pprof port option in cilium-bugtool
is fixed accordingly.

Fixes: #416319b1cd (bugtool: Default to the agent's gops port)

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
@pippolo84 pippolo84 force-pushed the pr/pippolo84/fix-bugtool-pprof-default-port branch from d015a7f to 9e1917d Compare September 29, 2022 11:27
@pippolo84
Copy link
Member Author

/test

@pippolo84
Copy link
Member Author

pippolo84 commented Sep 29, 2022

/test-1.16-4.9

Job 'Cilium-PR-K8s-1.25-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathConfig Check BPF masquerading with ip-masq-agent DirectRouting

Failure Output

FAIL: Failed to add ip route

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.25-kernel-net-next so I can create one.

Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find. I always get confused between gops and pprof for some reason.

@pippolo84
Copy link
Member Author

pippolo84 commented Sep 29, 2022

/test-1.25-net-next

Job 'Cilium-PR-K8s-1.25-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathEgressGatewayTest tunnel disabled with endpointRoutes enabled and XDP no egress gw policy connectivity works

Failure Output

FAIL: Expected command: kubectl exec -n kube-system log-gatherer-m85rr -- curl --path-as-is -s -D /dev/stderr --fail --connect-timeout 5 --max-time 20 http://192.168.56.11:20080 -w "time-> DNS: '%{time_namelookup}(%{remote_ip})', Connect: '%{time_connect}',Transfer '%{time_starttransfer}', total '%{time_total}'" 

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.25-kernel-net-next so I can create one.

@pippolo84
Copy link
Member Author

Test failures are unrelated, reviews are in, marking ready to merge.

@pippolo84 pippolo84 added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 4, 2022
@ti-mo ti-mo merged commit b33adeb into cilium:master Oct 4, 2022
@sayboras sayboras mentioned this pull request Oct 8, 2022
8 tasks
@sayboras sayboras mentioned this pull request Oct 8, 2022
3 tasks
@sayboras sayboras added backport-pending/1.11 backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. and removed needs-backport/1.11 labels Oct 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/bugtool Impacts gathering of data for debugging purposes. backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants