Skip to content

Conversation

ti-mo
Copy link
Contributor

@ti-mo ti-mo commented Jun 11, 2024

Fixes: #29333
Fixes: #29334

Addressing both these issues in a single PR since they're relatively co-dependent.
See individual commits for details.

Remove bpf map migration mechanism to minimize bpf file system operations during endpoint regeneration

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 11, 2024
@ti-mo ti-mo changed the title Tb/map pin replace bpf: remove map migrations, add custom map flag Jun 11, 2024
@ti-mo ti-mo force-pushed the tb/map-pin-replace branch from 9277ae8 to aadfe1b Compare June 11, 2024 16:03
@ti-mo ti-mo added the release-note/misc This PR makes changes that have no direct user impact. label Jun 11, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 11, 2024
@ti-mo ti-mo added area/loader Impacts the loading of BPF programs into the kernel. area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Jun 11, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Jun 11, 2024
@ti-mo
Copy link
Contributor Author

ti-mo commented Jun 11, 2024

/test

@ti-mo ti-mo force-pushed the tb/map-pin-replace branch from aadfe1b to 002f195 Compare June 11, 2024 16:18
@ti-mo
Copy link
Contributor Author

ti-mo commented Jun 11, 2024

/test

@ti-mo ti-mo force-pushed the tb/map-pin-replace branch 2 times, most recently from f4bbf92 to 8da6247 Compare June 12, 2024 13:13
@ti-mo ti-mo marked this pull request as ready for review June 12, 2024 13:13
@ti-mo ti-mo requested review from a team as code owners June 12, 2024 13:13
@ti-mo ti-mo requested review from rgo3, nebril and gentoo-root June 12, 2024 13:13
@ti-mo
Copy link
Contributor Author

ti-mo commented Jun 12, 2024

/test

Copy link
Contributor

@lmb lmb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice cleanup!

@ti-mo ti-mo marked this pull request as draft June 13, 2024 08:14
@ti-mo
Copy link
Contributor Author

ti-mo commented Jun 13, 2024

/ci-verifier

@ti-mo ti-mo force-pushed the tb/map-pin-replace branch from 74d2c49 to 6236b44 Compare June 13, 2024 09:31
ti-mo added 3 commits June 13, 2024 11:32
Treat the bpf_elf_map.pinning field just like BTF map definitions do,
and replace PIN_GLOBAL_NS with LIBBPF_PIN_BY_NAME. Pass the values through
directly when parsing MapSpec.Extra on the Go side.

A future commit will assign meaning to a value higher than LIBBPF_PIN_BY_NAME.

Signed-off-by: Timo Beckers <timo@isovalent.com>
At its inception, Cilium had an external ebpf loader (iproute2) that didn't
deal with changes to map properties (type/k/v/maxentries/flags). To allow
the agent to upgrade/downgrade maps, a 'map migration' system was introduced
that would take the new ELF and compare its maps against their pinned
counterparts on the system's bpffs. Incompatible maps were renamed using
a ':pending' suffix to allow the loader to create and pin a new instance of
the map at its old path. If all went well, the :pending map was removed.

Even though it served us for many years, this system wasn't without its
drawbacks, primarily the many moving parts (files) to manage on bpffs, as well
as its obscuring of subtle bugs in managing tail call map lifecycle.

This commit replaces the map migration system with a commit-based system that
doesn't modify any bpffs-related resources until all of an ELF's entrypoints
are attached and all cross-ELF tail calls (policy progs) have been inserted.

After commit() has run for a Collection, only one copy of each map pin will be
present on bpffs. This removes all possibility of previous ELF generations
being partially attached somewhere, still handling traffic using an old tail
call map. Such cases will now fail loudly with the 'missed tail call' metric
increasing due to the old tail call map pins being removed.

Signed-off-by: Timo Beckers <timo@isovalent.com>
This commit adds support for the custom CILIUM_PIN_REPLACE pinning flag.
It signals to the loader that a map should be pinned without being reused
in subsequent ELF loads.

This replaces bespoke cilium_calls_-specific logic in the loader with a
generic map flag, opening it up for other use cases as well. The reasons
for this behaviour are widely documented and present in the code for
posterity.

Also, give each netdev its own instance of cilium_calls. Sharing a tail call
map across all XDP entry points causes multiple netdevs' programs to clobber
the shared cilium_calls_xdp bpffs pin.

Signed-off-by: Timo Beckers <timo@isovalent.com>
@ti-mo ti-mo force-pushed the tb/map-pin-replace branch from 6236b44 to c0b9b7c Compare June 13, 2024 09:32
@ti-mo ti-mo marked this pull request as ready for review June 13, 2024 09:32
@ti-mo
Copy link
Contributor Author

ti-mo commented Jun 13, 2024

/test

Copy link
Member

@nebril nebril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ci-structure LGTM

Copy link
Contributor

@rgo3 rgo3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice change! One nit, but approving because it only impacts readability AFAICT.

@ti-mo ti-mo enabled auto-merge June 13, 2024 11:34
@ti-mo ti-mo added this pull request to the merge queue Jun 13, 2024
Merged via the queue into cilium:main with commit 2fd9150 Jun 13, 2024
@ti-mo ti-mo deleted the tb/map-pin-replace branch June 13, 2024 14:05
@ti-mo ti-mo added the backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. label Nov 21, 2024
@giorio94
Copy link
Member

Updating the backport labels, as partial backports have been merged into v1.14 and v1.15.

@giorio94 giorio94 added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Jan 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. area/loader Impacts the loading of BPF programs into the kernel. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add custom BPF map pinning flag for 'always-recreated' maps Remove 'map migration' mechanism
6 participants