-
Notifications
You must be signed in to change notification settings - Fork 768
btf: lazy decoding #1763
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
btf: lazy decoding #1763
Conversation
@brb could you try this in pwru and see how bad things are? |
Signed-off-by: Martynas Pumputis <m@lambda.lt>
@lmb Are you interested in perf benchmarks, or correctness? If the latter, then all good - cilium/pwru#548. Unfortunately, we do not have any benchmarking suites. |
def29e7
to
afa4ee7
Compare
I added a crude benchmark which just iterates all of vmlinux to simulate pwru. Even that has gotten faster! |
3ba613d
to
f9c4803
Compare
Add a benchmark which replicates the types used by Inspektor Gadget for a common configuration. Also add a benchmark which explicitly iterates all types in vmlinux, which is similar to what pwru does. See cilium#1755 (comment) Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Stop relying on readAndInflateTypes by creating a full BTF blob in the test. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
mutableTypes is going to be removed when switching to lazy BTF parsing. Move the implementation of AnyTypesByName to Spec. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Lazy decoding of BTF means that iterating over a Spec may produce errors. The current iterator interface doesn't allow returning errors. Replace Iterate() with a real iterator which calls into Spec.TypeByID instead of going via an internal fast path. This also allows moving Datasec fixups in the future. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Some Datasec are not correctly populated when parsing BTF from an ELF file (vs. raw vmlinux BTF). To address this we currently do a full pass over all types to find Datasec that need fixing up. Instead, perform the fixup when retrieving a Type from a Spec. Spec loaded from a raw BTF blob are kept unchanged. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Move the implementation of readAndInflateTypes into a new file. No other changes intended. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Switch BTF decoding from operating on a reader to operating on a byte slice. This simplifies the code and removes the need for a bunch of intermediate buffers. It's also necessary to be able to lazily decode BTF in the future. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Break up readAndInflateTypes into a dedicated decoder type. Constructing the decoder does a single pass over the BTF which collects the offsets of all types in the blob. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Refactor the decoder so that it is possible to unmarshal one type at a time. The main challenges are: * Unmarshaling a type requires unmarshaling all descendants. This is solved by recursively inflating types until reaching a leaf. Care needs to be taken to avoid infinite recursion when dealing with self-referential types like linked list headers. The trick here is that we store a partially inflated Type in d.types so that later function calls return the cached value. The partially inflated Type is removed from d.types on error to avoid polluting the decoder with "broken" types. * Unmarshaling a type may require decoding declTags pointing at that type. This is handled by maintaining a lookaside index which maps type IDs to declTag typeIDs. * Unmarshaling a legacy bitfield also requires a lookaside data structure. Even though the decoder is now capable of lazy decoding we still construct a full []Type when parsing BTF to keep the diff manageable. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Use the lazy decoder to avoid decoding all types upfront. This requires adding an index from essentialName to TypeID to be able to implement AnyTypeByName. decoder has to be safe for concurrent use because we want to share the raw BTF blob and so on when copying a Spec. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
It turns out that doing two passes over the encoded BTF is basically free. Use this approach to pre-calculate the correct size for the data structures required by the decoder. This reduces allocations and copying by a substantial amount. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
We still construct full slices of various raw BTF types, only to discard them afterwards. Instead, decode only a single raw type and accumulate the inflated equivalents. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Appending to a nil slice allocates a slice with an 8 item capacity as of Go 1.24. This is wasteful since most entries will only ever need a single ID. Do some syntax gymnastics to avoid this waste. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
f9c4803
to
ef22608
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
Prompted by #1755 another stab at lazy decoding of BTF types. IterateVmlinux is roughly the pwru workload (IIRC) and InspectorGadget does what it says on the tin.