@ti-mo ti-mo commented Jun 2, 2025

This PR addresses some performance concerns around opening many BPF objects from fd/id/pin, such as when iterating all objects on the system or walking bpffs. I've introduced a few benchmarks targeting the worst offenders (string allocations in fdinfo scanning and slice allocations in new{Map,Program}Info*).

  • Initially, the idea was to cache MapInfo and ProgramInfo, but this was scrapped in favor of making the initial object opening as cheap as possible.
  • Program.Stats() was added, lifting highly-variable statistics fields out of ProgramInfo, which is otherwise prohibitively expensive to call repeatedly for gathering metrics. ProgramInfo.RunCount, .Runtime and .RecursionMisses were moved to ProgramStats.
  • This raises the minimum required Linux version for opening objects from 4.10 to 4.13, a worthwhile trade-off since 4.13 has been EOL since Nov 2017.

Total gains made by this optimization pass:

goos: linux
goarch: amd64
pkg: github.com/cilium/ebpf
cpu: AMD Ryzen 7 3700X 8-Core Processor
                    │    old.txt     │              new.txt               │
                    │     sec/op     │   sec/op     vs base               │
NewMapFromFD-16       15038.0n ± 20%   674.0n ± 0%  -95.52% (p=0.002 n=6)
MapInfo-16              14.78µ ±  0%   13.79µ ± 1%   -6.70% (p=0.002 n=6)
NewProgramFromFD-16    16.449µ ±  1%   1.139µ ± 1%  -93.08% (p=0.002 n=6)
ProgramInfo-16          16.42µ ±  1%   15.32µ ± 0%   -6.69% (p=0.002 n=6)
ScanFdInfoReader-16     2.085µ ±  5%   1.309µ ± 5%  -37.19% (p=0.002 n=6)
Stats-16                394.0µ ±  1%   347.5µ ± 7%  -11.82% (p=0.002 n=6)
geomean                 19.15µ         6.477µ       -66.18%

                    │   old.txt    │               new.txt               │
                    │     B/op     │     B/op      vs base               │
NewMapFromFD-16        5992.0 ± 0%     256.0 ± 0%  -95.73% (p=0.002 n=6)
MapInfo-16            5.773Ki ± 0%   5.198Ki ± 0%   -9.96% (p=0.002 n=6)
NewProgramFromFD-16    6048.0 ± 0%     628.0 ± 0%  -89.62% (p=0.002 n=6)
ProgramInfo-16        5.844Ki ± 0%   5.179Ki ± 0%  -11.38% (p=0.002 n=6)
ScanFdInfoReader-16   4.534Ki ± 0%   4.071Ki ± 0%  -10.21% (p=0.002 n=6)
Stats-16                             3.279Ki ± 0%
geomean               5.554Ki        1.951Ki       -68.35%

                    │   old.txt   │              new.txt              │
                    │  allocs/op  │ allocs/op   vs base               │
NewMapFromFD-16       48.000 ± 0%   3.000 ± 0%  -93.75% (p=0.002 n=6)
MapInfo-16             47.00 ± 0%   23.00 ± 0%  -51.06% (p=0.002 n=6)
NewProgramFromFD-16   49.000 ± 0%   4.000 ± 0%  -91.84% (p=0.002 n=6)
ProgramInfo-16         48.00 ± 0%   22.00 ± 0%  -54.17% (p=0.002 n=6)
ScanFdInfoReader-16   24.000 ± 0%   5.000 ± 0%  -79.17% (p=0.002 n=6)
Stats-16                            64.00 ± 0%
geomean                41.78        11.17       -81.14%
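For anyone reading the tables above, here is a minimal sketch of the arithmetic behind the "vs base" column and the geomean row. It assumes the column is the plain relative change between the two reported values and that the geomean row is the geometric mean over the benchmarks that have values on both sides. The helper names `percentChange` and `geomean` are made up for illustration and are not part of benchstat or this library.

```go
package main

import (
	"fmt"
	"math"
)

// percentChange returns the relative change from base to cur in percent.
// Negative values mean cur is smaller (faster, or fewer bytes/allocs).
func percentChange(base, cur float64) float64 {
	return (cur - base) / base * 100
}

// geomean returns the geometric mean of xs. It sums logarithms instead of
// multiplying the values directly. All values must be positive.
func geomean(xs []float64) float64 {
	var sum float64
	for _, x := range xs {
		sum += math.Log(x)
	}
	return math.Exp(sum / float64(len(xs)))
}

func main() {
	// sec/op values from the first table, all converted to microseconds.
	before := []float64{15.038, 14.78, 16.449, 16.42, 2.085, 394.0}
	after := []float64{0.674, 13.79, 1.139, 15.32, 1.309, 347.5}

	fmt.Printf("NewMapFromFD: %.2f%%\n", percentChange(before[0], after[0]))
	fmt.Printf("geomean: %.2fµ -> %.3fµ (%.2f%%)\n",
		geomean(before), geomean(after),
		percentChange(geomean(before), geomean(after)))
}
```

Because the table values are rounded for display, results recomputed this way can differ from the printed percentages in the last digit.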

In terms of real-world performance in Cilium, this is a benchmark creating 1000 maps and 2000 programs, then iterating them all, opening them as *Map and *Program, filtering by name, then calling .Info():

goos: linux
goarch: amd64
pkg: github.com/cilium/cilium/pkg/metrics
cpu: AMD Ryzen 7 3700X 8-Core Processor
               │   old.txt    │               new.txt               │
               │    sec/op    │    sec/op     vs base               │
GetBPFUsage-16   144.50m ± 1%   90.43m ± 11%  -37.42% (p=0.002 n=6)

               │   old.txt    │               new.txt               │
               │     B/op     │     B/op      vs base               │
GetBPFUsage-16   50.38Mi ± 0%   24.71Mi ± 0%  -50.96% (p=0.002 n=6)

               │   old.txt   │              new.txt               │
               │  allocs/op  │  allocs/op   vs base               │
GetBPFUsage-16   358.1k ± 0%   104.7k ± 0%  -70.75% (p=0.002 n=6)

This code runs on every call to the /metrics endpoint and is by far the largest contributor to scrape time. With this change, roughly another 40% is shaved off the time spent blocking the scrape.

@github-actions github-actions bot added the breaking-change Changes exported API label Jun 2, 2025
@ti-mo ti-mo force-pushed the tb/cache-info branch 3 times, most recently from 4dbd574 to 6dd2a24 Compare June 2, 2025 15:08
@ti-mo ti-mo changed the title Cache ProgramInfo and MapInfo Optimize New{Map,Program}From{ID,FD} and LoadPinned{Map,Program} Jun 3, 2025
@ti-mo ti-mo marked this pull request as ready for review June 3, 2025 11:35
@ti-mo ti-mo requested a review from a team as a code owner June 3, 2025 11:35
@ti-mo ti-mo force-pushed the tb/cache-info branch 2 times, most recently from 29801e3 to 1237ae6 Compare June 3, 2025 14:06
ti-mo added 2 commits June 3, 2025 16:16
Previously, we didn't really have a test that covers all of Map.Info(), only the
procfs-based fallback. This patch fixes that and requires fdinfo to be present
from now on. fdinfo was added to the kernel in 4.9.

Signed-off-by: Timo Beckers <timo@isovalent.com>
A follow-up commit will aim to improve the performance and reduce allocations
of the fdinfo reader.

Signed-off-by: Timo Beckers <timo@isovalent.com>
@ti-mo ti-mo force-pushed the tb/cache-info branch 2 times, most recently from 9b3d99c to d52c5bc Compare June 3, 2025 14:17
ti-mo added 3 commits June 3, 2025 16:03
core: 1
goos: linux
goarch: amd64
pkg: github.com/cilium/ebpf
cpu: 13th Gen Intel(R) Core(TM) i7-1365U
                 │  base.txt   │              opt.txt               │
                 │   sec/op    │   sec/op     vs base               │
ScanFdInfoReader   3.212µ ± 1%   1.983µ ± 1%  -38.25% (p=0.002 n=6)

                 │   base.txt   │               opt.txt               │
                 │     B/op     │     B/op      vs base               │
ScanFdInfoReader   4.531Ki ± 0%   4.050Ki ± 0%  -10.62% (p=0.002 n=6)

                 │  base.txt   │              opt.txt              │
                 │  allocs/op  │ allocs/op   vs base               │
ScanFdInfoReader   24.000 ± 0%   3.000 ± 0%  -87.50% (p=0.002 n=6)

Signed-off-by: Timo Beckers <timo@isovalent.com>
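To illustrate the kind of change that reduces allocations in an fdinfo-style reader, here is a minimal sketch. It is not the code from this commit. It assumes input made of `key:\tvalue` lines and numeric values. The names `scanFields` and `scanString` are hypothetical, and the sample input only imitates the shape of `/proc/<pid>/fdinfo/<fd>` for a map. The idea is to work on the scanner's byte slices rather than building a new string for every line.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// scanFields reads "key:\tvalue" lines from r and stores the values of the
// keys present in want. It operates on the scanner's byte slices instead of
// converting every line to a string. It returns the number of fields found.
func scanFields(r io.Reader, want map[string]*uint64) (int, error) {
	found := 0
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		key, val, ok := bytes.Cut(sc.Bytes(), []byte(":"))
		if !ok {
			continue // not a key/value line
		}
		// Looking up a map with string(key) is a conversion the Go
		// compiler performs without allocating a new string.
		dst, ok := want[string(key)]
		if !ok {
			continue // a field the caller did not ask for
		}
		n, err := strconv.ParseUint(string(bytes.TrimSpace(val)), 10, 64)
		if err != nil {
			return found, fmt.Errorf("field %q: %w", key, err)
		}
		*dst = n
		found++
	}
	return found, sc.Err()
}

// scanString is a convenience wrapper for in-memory input.
func scanString(s string, want map[string]*uint64) (int, error) {
	return scanFields(strings.NewReader(s), want)
}

func main() {
	// Sample input (assumed shape, for illustration only).
	const fdinfo = "pos:\t0\nflags:\t02000002\nmnt_id:\t15\n" +
		"map_type:\t1\nkey_size:\t4\nvalue_size:\t8\nmax_entries:\t1024\n"

	var mapType, maxEntries uint64
	n, err := scanString(fdinfo, map[string]*uint64{
		"map_type":    &mapType,
		"max_entries": &maxEntries,
	})
	fmt.Println(n, mapType, maxEntries, err) // prints: 2 1 1024 <nil>
}
```

Passing pointers to the destination fields lets the caller pick only the keys it needs, so unrelated lines are skipped without being parsed.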
This commit lifts runtime statistics out of ProgramInfo to allow them to be
queried without fetching all of ProgramInfo, which would otherwise require
multiple calls to OBJ_INFO, multiple allocations, as well as parsing fdinfo.

Signed-off-by: Timo Beckers <timo@isovalent.com>
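As a rough sketch of why a cheap stats call matters for metrics, the snippet below samples statistics twice and derives the average runtime per invocation. The `ProgramStats` struct here is a local stand-in that only borrows the three field names from the PR description; its field types are assumptions. The `statser` interface and `fakeProgram` type are hypothetical and exist purely so the example runs without a kernel.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ProgramStats is a local stand-in with the three fields named in the PR
// description. The field types are assumptions made for this sketch.
type ProgramStats struct {
	RunCount        uint64
	Runtime         time.Duration
	RecursionMisses uint64
}

// statser is a hypothetical interface for anything that can report stats.
type statser interface {
	Stats() (*ProgramStats, error)
}

// avgRuntime returns the average time per run between two samples. It
// returns zero if the program did not run, or if the counter went backwards.
func avgRuntime(prev, cur *ProgramStats) time.Duration {
	if cur.RunCount <= prev.RunCount {
		return 0
	}
	runs := cur.RunCount - prev.RunCount
	return (cur.Runtime - prev.Runtime) / time.Duration(runs)
}

// sampleAvg takes two consecutive samples from p and averages them. A real
// collector would wait for a scrape interval between the two calls.
func sampleAvg(p statser) (time.Duration, error) {
	prev, err := p.Stats()
	if err != nil {
		return 0, err
	}
	cur, err := p.Stats()
	if err != nil {
		return 0, err
	}
	return avgRuntime(prev, cur), nil
}

// fakeProgram replays canned samples in place of a kernel query.
type fakeProgram struct {
	samples []ProgramStats
	next    int
}

func (f *fakeProgram) Stats() (*ProgramStats, error) {
	if f.next >= len(f.samples) {
		return nil, errors.New("no more samples")
	}
	s := f.samples[f.next]
	f.next++
	return &s, nil
}

func main() {
	p := &fakeProgram{samples: []ProgramStats{
		{RunCount: 100, Runtime: 2 * time.Millisecond},
		{RunCount: 300, Runtime: 5 * time.Millisecond},
	}}
	avg, err := sampleAvg(p)
	if err != nil {
		panic(err)
	}
	fmt.Println(avg) // 3ms over 200 runs, prints: 15µs
}
```

A collector that only needs these counters can call the stats method on every scrape and keep the previous sample around, without paying for the full object info each time.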
Over the years, a lot of code was added to new{Map,Program}InfoFromFd as the
kernel started exposing more object information. However, for the sake of
opening a Map or Program from an fd/id/pin, not much information is needed;
certainly not some of the extended info like the program bytecode and other
arrays of associated object ids.

This commit introduces a 'minimal' version of the object info retrieval to
speed up the process of opening many objects in sequence, like when iterating
them with {Map,Program}GetNextID or walking a bpffs directory.

```
goos: linux
goarch: amd64
pkg: github.com/cilium/ebpf
cpu: AMD Ryzen 7 3700X 8-Core Processor
                    │    old.txt    │              new.txt               │
                    │    sec/op     │   sec/op     vs base               │
NewMapFromFD-16       13934.0n ± 1%   663.4n ± 1%  -95.24% (p=0.002 n=6)
NewProgramFromFD-16    15.403µ ± 1%   1.139µ ± 1%  -92.61% (p=0.002 n=6)
geomean                 16.70µ        6.487µ       -61.16%

                    │   old.txt    │                new.txt                │
                    │     B/op     │     B/op      vs base                 │
NewMapFromFD-16        5403.0 ± 0%     256.0 ± 0%  -95.26% (p=0.002 n=6)
NewProgramFromFD-16    5367.0 ± 0%     628.0 ± 0%  -88.30% (p=0.002 n=6)
geomean               4.637Ki        1.951Ki       -57.93%
¹ all samples are equal

                    │   old.txt   │               new.txt               │
                    │  allocs/op  │ allocs/op   vs base                 │
NewMapFromFD-16       24.000 ± 0%   3.000 ± 0%  -87.50% (p=0.002 n=6)
NewProgramFromFD-16   23.000 ± 0%   4.000 ± 0%  -82.61% (p=0.002 n=6)
geomean                21.14        11.17       -47.17%
¹ all samples are equal
```

Signed-off-by: Timo Beckers <timo@isovalent.com>
@lmb lmb force-pushed the tb/cache-info branch from d52c5bc to 7ad731c Compare June 3, 2025 15:07
@ti-mo ti-mo merged commit df9ebe8 into cilium:main Jun 3, 2025
18 checks passed
@ti-mo ti-mo deleted the tb/cache-info branch June 3, 2025 15:17
ti-mo added a commit to ti-mo/cilium that referenced this pull request Jun 3, 2025
cilium/ebpf#1791 optimized opening bpf objects and calling obj_info,
since it was allocating quite heavily before.

This is the effect on the GetBPFUsage benchmark:

goos: linux
goarch: amd64
pkg: github.com/cilium/cilium/pkg/metrics
cpu: AMD Ryzen 7 3700X 8-Core Processor
               │   old.txt    │              new.txt               │
               │    sec/op    │   sec/op     vs base               │
GetBPFUsage-16   145.43m ± 1%   77.51m ± 1%  -46.70% (p=0.002 n=6)

               │   old.txt    │               new.txt               │
               │     B/op     │     B/op      vs base               │
GetBPFUsage-16   50.38Mi ± 0%   24.24Mi ± 0%  -51.89% (p=0.002 n=6)

               │   old.txt    │              new.txt               │
               │  allocs/op   │  allocs/op   vs base               │
GetBPFUsage-16   358.11k ± 0%   82.67k ± 0%  -76.92% (p=0.002 n=6)

This should make the /metrics endpoint even more responsive and reduce the
amount of garbage it creates.

Signed-off-by: Timo Beckers <timo@isovalent.com>
github-merge-queue bot pushed a commit to cilium/cilium that referenced this pull request Jun 4, 2025