kfunc: poison all weak kfunc calls if the kernel lacks BTF #1774

bogushevich · 2025-05-06T03:32:55Z

KFuncs fixing up loads only system BTF, so using custom (non-system) BTF specification with KFuncs leads to errors.

ti-mo

Thanks!

linker.go

dylandreimerink · 2025-05-06T11:39:54Z

The CI is unhappy with this change. This should take module BTF into account.

The error is fixing up kfuncs: find target in modules: parse types for module bpf_testmod: lookup type name for id 1: offset 1489950 is out of bounds of string table

I think that makes sense if we think about it. Kernel modules use split BTF, they essentially extend the vmlinux the kernel ships with. Strings in the module BTF are encoded as {length of base string table} + offset in module string table. So when we provide a custom BTF for the base in this way, we are unable to understand the module BTF.

I see two ways of approaching this. First, If we are in a situation where a custom BTF is provided, simply do not attempt to use the module BTF. If the user needs type from a module, he has to make sure that the custom BTF contains the types from the module as well.

Second, is we could allow users to provide a set of module BTFs which we would consult instead of the module BTFs from the kernel itself (if any exist). It would be up to the user to ensure they are generated with the provided vmlinux BTF as base. This can be sometime we add later as addition to the first approach.

ti-mo · 2025-05-06T12:38:25Z

Second, is we could allow users to provide a set of module BTFs which we would consult instead of the module BTFs from the kernel itself (if any exist). It would be up to the user to ensure they are generated with the provided vmlinux BTF as base. This can be sometime we add later as addition to the first approach.

To add a data point to this discussion, BTFHub ships a single blob per kernel and seems to disregard kernel modules entirely. (I could be wrong, of course!) libbpf also accepts only a single blob in .btf_custom_path: https://www.mankier.com/8/bpftool-gen#Examples-min_core_btf:

LIBBPF_OPTS(bpf_object_open_opts, opts, .btf_custom_path = "5.4.0-smaller.btf");
struct bpf_object *obj;

obj = bpf_object__open_file("one.bpf.o", &opts);

Purely for doing CO-RE, all kmods could technically be merged into a single blob, but it doesn't seem like the btfhub tooling currently supports that. And since these kernels aren't getting any younger, I doubt there's much incentive to do so.

Can't find the source, but if .btf_custom_path is set, libbpf ignores all other sources of BTF (like kmods). I think that only makes sense.

lmb · 2025-05-06T15:58:15Z

I don't think it's possible to use external kmod BTF for kfuncs in the first place. See the code which adds the module handle to fdArray: that handle needs to point at a kmod BTF, you can't just load external BTF into the kernel.

I forget how kfuncs targeting the kernel work exactly. Is it even possible without having BTF built in? AKA, not sure this change makes sense at all.

ti-mo · 2025-05-07T09:53:18Z

I don't think it's possible to use external kmod BTF for kfuncs in the first place. See the code which adds the module handle to fdArray: that handle needs to point at a kmod BTF, you can't just load external BTF into the kernel.

I forget how kfuncs targeting the kernel work exactly. Is it even possible without having BTF built in? AKA, not sure this change makes sense at all.

Good points! I assume it may be possible to invoke vmlinux kfuncs using external BTF if a distro/admin decided to ship BTF separately instead of in /sys/kernel/btf as a 'security' measure? Iirc kfuncs are referred to by btf id. We probably shouldn't take kmod BTF into account during resolving in any case.

lmb · 2025-05-07T10:19:04Z

Iirc kfuncs are referred to by btf id.

Exactly, so how does the kernel figure out what to attach to if it doesn't have BTF included? It also needs to do the id -> kfunc lookup, no? And if it needs BTF included, why is it not available in /sys/kernel/btf?

@bogushevich can you tell us a bit more when / how this is useful / necessary?

dylandreimerink · 2025-05-07T10:40:35Z

Exactly, so how does the kernel figure out what to attach to if it doesn't have BTF included? It also needs to do the id -> kfunc lookup, no? And if it needs BTF included, why is it not available in /sys/kernel/btf?

ACK. From what I understand from the kconfigs and header files is that when compiling with BTF, the kfunc BTF ids are encoded in the ".BTF_ids" ELF section of the kernel, which are then read at runtime to know which ids are kfuncs with what flags. If you don't compile the kernel with BTF, these are not included, thus no kfuncs. Even if you were to create BTF from the DWARF like those on BTFHub.

lmb · 2025-05-07T10:55:31Z

@bogushevich how about we document this on fixupKfuncs so that future generations don't go down the same rabbit hole?

linker.go

bogushevich · 2025-05-12T05:32:42Z

Sorry, it was my mistake. You are right, after this commit my program is loaded successfully, but doesn't work properly.
My code is like this:

volatile const __u8 conntrack_enabled = 0;

extern struct nf_conn *bpf_skb_ct_lookup(
    struct __sk_buff *,
    struct bpf_sock_tuple *, u32,
    struct bpf_ct_opts___local *, u32
) __ksym __weak;

if (conntrack_enabled) {
   struct nf_conn *ct = bpf_skb_ct_lookup(/*...*/)
   if (ct) {/*...*/}
}

So if flag conntrack_enabled is not set from user space, program is successfully loaded even on kernels that don’t support necessary kfuncs. But if there is no system BTF, program fallback to custom BTF and loading is failed on fixupKfuncs step regardless of flag value.
Unfortunately some modern Linux distributions still don’t enable BTF for some reason. I just dont’t want to have two similar BPF program: one with conntrack support and one without.

ti-mo · 2025-05-12T08:33:21Z

So if flag conntrack_enabled is not set from user space, program is successfully loaded even on kernels that don’t support necessary kfuncs. But if there is no system BTF, program fallback to custom BTF and loading is failed on fixupKfuncs step regardless of flag value. Unfortunately some modern Linux distributions still don’t enable BTF for some reason. I just dont’t want to have two similar BPF program: one with conntrack support and one without.

I think this makes a lot of sense to support. @dylandreimerink @lmb Should we treat an error from btf.LoadKernelSpec in fixupKfuncs as recoverable and poison all weak kfunc calls? IMO that's minimally invasive.

dylandreimerink · 2025-05-12T09:15:22Z

Unfortunately some modern Linux distributions still don’t enable BTF for some reason. I just dont’t want to have two similar BPF program: one with conntrack support and one without.

I think this makes a lot of sense to support. @dylandreimerink @lmb Should we treat an error from btf.LoadKernelSpec in fixupKfuncs as recoverable and poison all weak kfunc calls? IMO that's minimally invasive.

I think so. I suggest we tolerate the ErrNotSupported error, then short-circuit findTargetInKernel. So when no system BTF is found we essentially act as if kfuncs can never be found. This allows use cases such as these to mark their kfunc as
__weak and allow you to write code that checks if the kfunc is available and fall back if not using CO-RE. If we find strong kfuncs, we will throw the usual error.

extern struct nf_conn *bpf_skb_ct_lookup(
   struct __sk_buff *,
   struct bpf_sock_tuple *, u32,
   struct bpf_ct_opts___local *, u32
) __ksym __weak;

if (bpf_ksym_exists(bpf_skb_ct_lookup)) {
  struct nf_conn *ct = bpf_skb_ct_lookup(/*...*/)
  if (ct) {/*...*/}
}

In order to enable portably running a single BPF object with kfuncs on kernels with and without BTF, fixupKfuncs needs to tolerate BTF not being available. This commit poisons all kfunc calls if (a) they were marked weak and (b) kernel BTF is not available. When guarded with the appropriate CO-RE conditions, this should make it possible to run BPF objects with kfunc calls on kernels without BTF. Signed-off-by: Timo Beckers <timo@isovalent.com> Reported-by: Maxim Bogushevich <boga1@mail.ru>

bogushevich requested a review from a team as a code owner May 6, 2025 03:32

ti-mo approved these changes May 6, 2025

View reviewed changes

chenhengqi reviewed May 6, 2025

View reviewed changes

linker.go Outdated Show resolved Hide resolved

ti-mo reviewed May 7, 2025

View reviewed changes

linker.go Outdated Show resolved Hide resolved

bogushevich requested review from ti-mo and chenhengqi May 15, 2025 03:05

ti-mo force-pushed the fix-kfuncs-custom-btf branch from 2a587ab to d9170bd Compare May 20, 2025 15:28

ti-mo changed the title ~~Fix using KFuncs with non-system BTF~~ kfunc: poison all weak kfunc calls if the kernel lacks BTF May 20, 2025

ti-mo merged commit 5708109 into cilium:main May 20, 2025
17 of 18 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

kfunc: poison all weak kfunc calls if the kernel lacks BTF #1774

kfunc: poison all weak kfunc calls if the kernel lacks BTF #1774

Uh oh!

bogushevich commented May 6, 2025

Uh oh!

ti-mo left a comment

Uh oh!

Uh oh!

dylandreimerink commented May 6, 2025

Uh oh!

ti-mo commented May 6, 2025

Uh oh!

lmb commented May 6, 2025

Uh oh!

ti-mo commented May 7, 2025

Uh oh!

lmb commented May 7, 2025

Uh oh!

dylandreimerink commented May 7, 2025 •

edited

Loading

Uh oh!

lmb commented May 7, 2025

Uh oh!

Uh oh!

bogushevich commented May 12, 2025

Uh oh!

ti-mo commented May 12, 2025

Uh oh!

dylandreimerink commented May 12, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

kfunc: poison all weak kfunc calls if the kernel lacks BTF #1774

kfunc: poison all weak kfunc calls if the kernel lacks BTF #1774

Uh oh!

Conversation

bogushevich commented May 6, 2025

Uh oh!

ti-mo left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

dylandreimerink commented May 6, 2025

Uh oh!

ti-mo commented May 6, 2025

Uh oh!

lmb commented May 6, 2025

Uh oh!

ti-mo commented May 7, 2025

Uh oh!

lmb commented May 7, 2025

Uh oh!

dylandreimerink commented May 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lmb commented May 7, 2025

Uh oh!

Uh oh!

bogushevich commented May 12, 2025

Uh oh!

ti-mo commented May 12, 2025

Uh oh!

dylandreimerink commented May 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

dylandreimerink commented May 7, 2025 •

edited

Loading

dylandreimerink commented May 12, 2025 •

edited

Loading