-
Notifications
You must be signed in to change notification settings - Fork 3.4k
bpf: remove map migrations, add custom map flag #33067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9277ae8
to
aadfe1b
Compare
/test |
aadfe1b
to
002f195
Compare
/test |
f4bbf92
to
8da6247
Compare
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice cleanup!
/ci-verifier |
74d2c49
to
6236b44
Compare
Treat the bpf_elf_map.pinning field just like BTF map definitions do, and replace PIN_GLOBAL_NS with LIBBPF_PIN_BY_NAME. Pass the values through directly when parsing MapSpec.Extra on the Go side. A future commit will assign meaning to a value higher than LIBBPF_PIN_BY_NAME. Signed-off-by: Timo Beckers <timo@isovalent.com>
At its inception, Cilium had an external ebpf loader (iproute2) that didn't deal with changes to map properties (type/k/v/maxentries/flags). To allow the agent to upgrade/downgrade maps, a 'map migration' system was introduced that would take the new ELF and compare its maps against their pinned counterparts on the system's bpffs. Incompatible maps were renamed using a ':pending' suffix to allow the loader to create and pin a new instance of the map at its old path. If all went well, the :pending map was removed. Even though it served us for many years, this system wasn't without its drawbacks, primarily the many moving parts (files) to manage on bpffs, as well as its obscuring of subtle bugs in managing tail call map lifecycle. This commit replaces the map migration system with a commit-based system that doesn't modify any bpffs-related resources until all of an ELF's entrypoints are attached and all cross-ELF tail calls (policy progs) have been inserted. After commit() has run for a Collection, only one copy of each map pin will be present on bpffs. This removes all possibility of previous ELF generations being partially attached somewhere, still handling traffic using an old tail call map. Such cases will now fail loudly with the 'missed tail call' metric increasing due to the old tail call map pins being removed. Signed-off-by: Timo Beckers <timo@isovalent.com>
This commit adds support for the custom CILIUM_PIN_REPLACE pinning flag. It signals to the loader that a map should be pinned without being reused in subsequent ELF loads. This replaces bespoke cilium_calls_-specific logic in the loader with a generic map flag, opening it up for other use cases as well. The reasons for this behaviour are widely documented and present in the code for posterity. Also, give each netdev its own instance of cilium_calls. Sharing a tail call map across all XDP entry points causes multiple netdevs' programs to clobber the shared cilium_calls_xdp bpffs pin. Signed-off-by: Timo Beckers <timo@isovalent.com>
6236b44
to
c0b9b7c
Compare
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ci-structure LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice change! One nit, but approving because it only impacts readability AFAICT.
Updating the backport labels, as partial backports have been merged into v1.14 and v1.15. |
Fixes: #29333
Fixes: #29334
Addressing both these issues in a single PR since they're relatively co-dependent.
See individual commits for details.