Skip to content

Conversation

ti-mo
Copy link
Contributor

@ti-mo ti-mo commented Oct 23, 2023

Note: this is obviously a hack. There's a long-standing issue with Cilium's
loader infrastructure where program arrays like cilium_calls_* are reused if
the map's properties in the ELF are compatible with the pinned version present
on the system. This has been the case since the early iproute2 days, and
ebpf-go has the same behaviour by default. This is unsound, but doesn't always
cause issues in practice.

However, as Cilium's eBPF code base grew, along with its contributor base, and
increasing code churn and the need to backport fixes, this method started
falling apart. Pinned tail call maps are actively used for handling traffic,
and should be considered fixed, as they are an integral part of the program's
control flow. When another BPF object needs to be attached to a kernel hook,
it's incorrect to replace entries in an existing program array used by a prior
version of the BPF object, since logic in some tail calls may have been enabled
or disabled in the new one, or may have been removed altogether.

A prog array's contents cannot be replaced atomically, meaning that while the
prog array is being repopulated by a new ELF, a packet may be processed by a
chain of programs that are partially old and partially new. This is bad.

This patch fixes the specific case of cilium_calls_*, since it's the only
prog array across all ELFs that consistently acts as the central tail call map.
The main motivation for this approach is backportability as far back as 1.12,
since we're seeing increased connectivity interruptions (leading to test flakes)
during up/downgrades and restarts in that branch. This is caused by other
backports creating code churn, as well as the stark difference in the contents
of tail calls between major Cilium versions.

This code will be replaced by a 'commit' system that does away with the re-
pinning behaviour and keeps new map fd's in memory; only committing them to
bpffs when the necessary endpoint prog(s) were successfully replaced.

Fixes: #20691

Always replace the cilium_call_* tail call map during upgrade/restart to avoid "Missed tail call" errors

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 23, 2023
@ti-mo ti-mo added kind/bug This is a bug in the Cilium logic. needs-backport/1.12 backport/author The backport will be carried out by the author of the PR. labels Oct 23, 2023
@ti-mo ti-mo changed the title Tb/always migrate cilium calls Always migrate cilium_calls_* during ELF load Oct 23, 2023
@ti-mo
Copy link
Contributor Author

ti-mo commented Oct 23, 2023

/test

@ti-mo ti-mo added release-note/bug This PR fixes an issue in a previous release of Cilium. and removed kind/bug This is a bug in the Cilium logic. labels Oct 23, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Oct 23, 2023
@margamanterola
Copy link
Contributor

I have no opinions on the code, but regarding the release notes, remember that the readers of the release notes are our users, that don't necessarily understand the inner workings of Cilium. Particularly since this is tagged "bug", it's important to let users know what kind of bug is getting fixed and when they would be affected by it.

I suggest something like:
Avoid "Missed tail call" errors on upgrades and downgrades by replacing the cilium_call_* tail call map instead of modifying the existing one.

@ti-mo
Copy link
Contributor Author

ti-mo commented Oct 23, 2023

I suggest something like: Avoid "Missed tail call" errors on upgrades and downgrades by replacing the cilium_call_* tail call map instead of modifying the existing one.

Good idea, thanks 👍

@ti-mo ti-mo force-pushed the tb/always-migrate-cilium-calls branch 2 times, most recently from 143cada to 0e7762c Compare October 24, 2023 08:55
@ti-mo
Copy link
Contributor Author

ti-mo commented Oct 24, 2023

/test

@ti-mo ti-mo force-pushed the tb/always-migrate-cilium-calls branch from 0e7762c to 66287a3 Compare October 25, 2023 11:51
@ti-mo
Copy link
Contributor Author

ti-mo commented Oct 25, 2023

/test

@ti-mo ti-mo force-pushed the tb/always-migrate-cilium-calls branch from 66287a3 to b0985dc Compare October 25, 2023 12:46
@ti-mo
Copy link
Contributor Author

ti-mo commented Oct 25, 2023

/test

@ti-mo ti-mo force-pushed the tb/always-migrate-cilium-calls branch from b0985dc to 8734d8b Compare October 26, 2023 11:40
@ti-mo
Copy link
Contributor Author

ti-mo commented Oct 26, 2023

/test

See code comments, as the change is subtle. This sets things up for subsequent
commits. As a minor benefit, make the code slightly less unreadable.

Signed-off-by: Timo Beckers <timo@isovalent.com>
@ti-mo
Copy link
Contributor Author

ti-mo commented Oct 26, 2023

/test

@ti-mo ti-mo force-pushed the tb/always-migrate-cilium-calls branch from 8734d8b to c66d06a Compare October 26, 2023 12:27
@ti-mo ti-mo marked this pull request as ready for review October 26, 2023 12:27
@ti-mo ti-mo requested a review from a team as a code owner October 26, 2023 12:27
@ti-mo ti-mo requested a review from rgo3 October 26, 2023 12:27
Note: this is obviously a hack. There's a long-standing issue with Cilium's
loader infrastructure where program arrays like cilium_calls_* are reused if
the map's properties in the ELF are compatible with the pinned version present
on the system. This has been the case since the early iproute2 days, and
ebpf-go has the same behaviour by default. This is unsound, but doesn't always
cause issues in practice.

However, as Cilium's eBPF code base grew, along with its contributor base,
increasing code churn and the need to backport fixes, this method started
falling apart. Pinned tail call maps are actively used for handling traffic,
and should be considered fixed, as they are an integral part of the program's
control flow. When another BPF object needs to be attached to a kernel hook,
it's incorrect to replace entries in an existing program array used by a prior
version of the BPF object, since logic in some tail calls may have been enabled
or disabled in the new one, or may have been removed altogether.

A prog array's contents cannot be replaced atomically, meaning that while the
prog array is being repopulated by a new ELF, a packet may be processed by a
chain of programs that are partially old and partially new. This is bad.

This patch fixes the specific case of cilium_calls_*, since it's the only
prog array across all ELFs that consistently acts as the central tail call map.
The main motivation for this approach is backportability as far back as 1.12,
since we're seeing increased connectivity interruptions (leading to test flakes)
during up/downgrades and restarts in that branch. This is caused by other
backports creating code churn, as well as the stark difference in the contents
of tail calls between major Cilium versions.

This code will be replaced by a 'commit' system that does away with the re-
pinning behaviour and keeps new map fd's in memory; only committing them to
bpffs when the necessary endpoint prog(s) were successfully replaced.

Fixes: cilium#20691

Signed-off-by: Timo Beckers <timo@isovalent.com>
@ti-mo ti-mo force-pushed the tb/always-migrate-cilium-calls branch from c66d06a to a137ea6 Compare October 26, 2023 14:34
@ti-mo
Copy link
Contributor Author

ti-mo commented Oct 26, 2023

/test

Copy link
Contributor

@rgo3 rgo3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for the code walk :)

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 26, 2023
@ti-mo ti-mo merged commit 94ec45a into cilium:main Oct 26, 2023
@ti-mo ti-mo deleted the tb/always-migrate-cilium-calls branch October 27, 2023 08:38
@ti-mo ti-mo added backport-pending/1.13 backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.13 labels Nov 7, 2023
@julianwiedmann julianwiedmann added backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.13 backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Nov 17, 2023
@pchaigno
Copy link
Member

Discussed offline with Timo. We won't backport to v1.12 as it's too big of a backport and thus risky. Instead, we'll skip this specific error in v1.12 CI (and v1.13 downgrade tests).

@julianwiedmann julianwiedmann added the affects/v1.12 This issue affects v1.12 branch label Jan 11, 2024
@margamanterola margamanterola removed the affects/v1.12 This issue affects v1.12 branch label Jan 12, 2024
skmatti pushed a commit to skmatti/cilium that referenced this pull request Jul 24, 2024
The following upstream change unconditionally repin the map breaks the
functionality for L2 interfaces of Kubevirt.

cilium#28740

The bpf map for L2 interfaces are manually created in
pkg/datapath/connector/utils.go and remain static, making repinning
unnecessary.

This change make sure the tail calls are not disturbed during graft
process. It should not affect other endpoints since only L2 endpoints
are calling graftDatapath.

BUG: 333792124, 334106213
Change-Id: Icf2a2bf9629d79761782bd3c2ce3c9fa89d5f42f
Reviewed-on: https://gke-internal-review.googlesource.com/c/third_party/cilium/+/990344
Tested-by: Prow_Bot_V2 <425329972751-compute@developer.gserviceaccount.com>
Reviewed-by: Zang Li <zangli@google.com>
Lint: Lint 🤖 <android-build-ayeaye@system.gserviceaccount.com>
Reviewed-by: Prow_Bot_V2 <425329972751-compute@developer.gserviceaccount.com>
Reviewed-by: Sugang Li <sugangli@google.com>
Unit-Verified: Prow_Bot_V2 <425329972751-compute@developer.gserviceaccount.com>
skmatti pushed a commit to skmatti/cilium that referenced this pull request Jul 24, 2024
There was an upstream PR (cilium#28740)
which changed the way tail call map re-pin works during datapath reload.
As part of this they updated the interface to the code that initiates
reload, so that instead of calling reloadDatapath() twice for each
interface (once per direction), the new code now requires calling
reloadDatapath() once and have the single call handle both directions.

Our multi-nic datapath reloader was using the old model so was exposed
to tail call map issues now that the underlying function implementation
has changed.

This change updates the google_loader code to match the new upstream
model.

Bug: b/339324159, b/338122169

Change-Id: I54b0324f1be61f07cc4b2e2d69232f0c9bceb584
Reviewed-on: https://gke-internal-review.googlesource.com/c/third_party/cilium/+/1018651
Lint: Lint 🤖 <android-build-ayeaye@system.gserviceaccount.com>
Reviewed-by: Sugang Li <sugangli@google.com>
Reviewed-by: Yifeng Shen <yfshen@google.com>
Unit-Verified: Prow_Bot_V2 <425329972751-compute@developer.gserviceaccount.com>
Tested-by: Prow_Bot_V2 <425329972751-compute@developer.gserviceaccount.com>
Reviewed-by: Prow_Bot_V2 <425329972751-compute@developer.gserviceaccount.com>
Reviewed-by: Maciej Skrocki <maciejskrocki@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/author The backport will be carried out by the author of the PR. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Always migrate tail call maps
6 participants