[RFC] bpf: Introduce BENCH test directive for BPF micro benchmarks #40759
Conversation
Prepare for further refactoring by cleaning up the test structure validation logic.

First, simplify the logic a bit to get rid of the secondary regex match inside checkProgExists and instead just defer the existence check for progs.checkProg until after the first pass; just ensure uniqueness of each setup/pktgen/check program along the way with checkProgUnique. Before, the logic would not catch a test with no defined check until later, when a panic occurred because progs.checkProg was nil; now the existence of a CHECK program is enforced.

Don't rely on the program name at all, only the section names. Before, weird edge cases could occur with strings.Replace() if the test program name itself had the words "setup", "pktgen", or "check" in it.

This change uncovered the case with nat4_port_allocation where the test name in the CHECK for the nat4_port_allocation_tcp and nat4_port_allocation_udp tests was wrong, so fix this.

Signed-off-by: Jordan Rife <jrife@google.com>
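For orientation, the validation pattern described in this commit message could look roughly like the hedged Go sketch below. The struct, the section-prefix layout ("setup/", "pktgen/", "check/"), and the function names are illustrative assumptions rather than the PR's actual code; only the overall shape (classify by section name only, enforce uniqueness during the first pass, check CHECK existence afterwards) is taken from the commit message.

```go
package bpftests

import (
	"fmt"
	"strings"

	"github.com/cilium/ebpf"
)

// testProgs is an illustrative stand-in for the per-test program bundle.
type testProgs struct {
	setupProg, pktgenProg, checkProg *ebpf.ProgramSpec
}

func validateSuite(specs map[string]*ebpf.ProgramSpec) (map[string]*testProgs, error) {
	tests := map[string]*testProgs{}
	for _, spec := range specs {
		// Only the section name decides what kind of program this is;
		// the program name itself is never parsed.
		kind, test, ok := strings.Cut(spec.SectionName, "/")
		if !ok {
			continue
		}
		progs := tests[test]
		if progs == nil {
			progs = &testProgs{}
			tests[test] = progs
		}
		slot, known := map[string]**ebpf.ProgramSpec{
			"setup":  &progs.setupProg,
			"pktgen": &progs.pktgenProg,
			"check":  &progs.checkProg,
		}[kind]
		if !known {
			continue
		}
		// Uniqueness is enforced as we go (the commit's checkProgUnique).
		if *slot != nil {
			return nil, fmt.Errorf("duplicate %s program for test %q", kind, test)
		}
		*slot = spec
	}
	// Existence of CHECK is enforced after the first pass instead of
	// surfacing later as a nil-pointer panic.
	for test, progs := range tests {
		if progs.checkProg == nil {
			return nil, fmt.Errorf("test %q has no CHECK program", test)
		}
	}
	return tests, nil
}
```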
Prepare for the next patch by moving some of the logic from bpf_test.go into helpers.go and exporting it as a set of reusable functions and types:

* ForEachSuite allows you to iterate over all BPF unit test collections one by one, transparently handling file reads and BPF collection preprocessing.
* CollectionToSuite converts a CollectionSpec and Collection for a BPF test into a set of Test structs. It performs all the validation on the test program structure to ensure the existence and uniqueness of setup/pktgen/check programs for each test.
* Test contains all the logic for executing a BPF unit test, including the execution of its setup, pktgen, and check programs.

Signed-off-by: Jordan Rife <jrife@google.com>
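For a sense of how a separate tool such as bpfbench might consume these helpers, here is a hypothetical usage sketch. ForEachSuite, CollectionToSuite, Test, and a Run method are placeholders whose signatures are guesses inferred from the commit message's descriptions, not the PR's actual API.

```go
package bpfbench

import (
	"github.com/cilium/ebpf"
)

// runAll iterates every BPF test collection under testPath and executes each
// test; everything here is an assumption layered on the commit message.
func runAll(testPath string) error {
	return ForEachSuite(testPath, func(spec *ebpf.CollectionSpec, coll *ebpf.Collection) error {
		tests, err := CollectionToSuite(spec, coll)
		if err != nil {
			return err
		}
		for _, t := range tests {
			// Test bundles the setup/pktgen/check execution logic.
			if err := t.Run(); err != nil {
				return err
			}
		}
		return nil
	})
}
```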
Introduce BENCH for implementing BPF micro benchmarks. bpftest (make run_bpf_tests) ignores BENCH programs; they are only executed when running bpfbench (make run_bpf_benchmarks). If a PKTGEN program is defined, it is run before the BENCH program to generate its input, but SETUP and CHECK programs are ignored.

```
jordan@t14:~/code/cilium$ make run_bpf_benchmarks
DOCKER_ARGS=--privileged RUN_AS_ROOT=1 contrib/scripts/builder.sh \
	make -j16 -C bpf/tests/ run_bench \
	"BPF_TEST_FILE="""
make: Entering directory '/go/src/github.com/cilium/cilium/bpf/tests'
go run ./bpfbench \
	-bpf-test-path /go/src/github.com/cilium/cilium/bpf/tests \

tc_egressgw_redirect_bench_example	1000000	1328 ns/op
make: Leaving directory '/go/src/github.com/cilium/cilium/bpf/tests'
```

Signed-off-by: Jordan Rife <jrife@google.com>
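To make the PKTGEN-then-BENCH flow concrete, here is a minimal hedged sketch of what a benchmark runner could do with cilium/ebpf's Program.Test and Program.Benchmark helpers. The function name, buffer sizing, and error handling are illustrative, not the PR's actual bpfbench code.

```go
package bpfbench

import (
	"fmt"
	"time"

	"github.com/cilium/ebpf"
)

// benchOne runs the optional PKTGEN program once to build the input packet,
// then times the BENCH program over `repeat` executions. This is a sketch of
// the flow described in the commit message, not the PR's implementation.
func benchOne(pktgen, bench *ebpf.Program, repeat int) (time.Duration, error) {
	// Same input sizing the existing test runner uses for BPF_PROG_TEST_RUN.
	data := make([]byte, 4096-256-320)

	if pktgen != nil {
		_, out, err := pktgen.Test(data) // single run to generate the packet
		if err != nil {
			return 0, fmt.Errorf("pktgen: %w", err)
		}
		data = out
	}

	// SETUP and CHECK are ignored; only the BENCH program is timed.
	_, perRun, err := bench.Benchmark(data, repeat, func() {})
	if err != nil {
		return 0, fmt.Errorf("bench: %w", err)
	}
	return perRun, nil
}
```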
Thanks for the PR! Overall LGTM, seems like a nice improvement to have better infrastructure for this use case. I had some super minor nits below, nice-to-haves for consistency. The PR will need a rebase anyway.
```
@@ -554,6 +554,11 @@ run_bpf_tests: ## Build and run the BPF unit tests using the cilium-builder cont
		"JUNIT_PATH=$(JUNIT_PATH)" \
		"V=$(BPF_TEST_VERBOSE)"

run_bpf_benchmarks: ## Build and run the BPF benchmarks tests using the cilium-builder container image.
```
nit: `bench-bpf` to follow the existing `bench` prefix for benchmark targets.
Sure. I'll also move the target up there to group all the `bench` stuff together.
```
@@ -50,6 +50,7 @@ clean:
	rm -f $(wildcard *.d)

BPF_TEST_FLAGS:= $(GO_TEST_FLAGS)
BPF_BENCH_FLAGS:=
```
Can we reuse `BENCHFLAGS` instead, or as a basis for this variable? We could move the definition from `Makefile` to `Makefile.defs` for better sharing across different makefiles.
Not sure, but I'll check.
```go
data := make([]byte, 4096-256-320)

// ctx is only used for tc programs
// non-empty ctx passed to non-tc programs will cause error: invalid argument
ctx := make([]byte, 0)
if tt.benchProg.Type() == ebpf.SchedCLS {
	// sizeof(struct __sk_buff) < 256, let's make it 256
	ctx = make([]byte, 256)
```
There are a few magic constants here that seem like they could be better documented by declaring them as vars with comments that reference the corresponding Linux kernel source, to explain why these are the right numbers (or hint to someone who has problems how to adjust them).
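As one possible shape for this suggestion, the constants could be named and pointed at the kernel source they appear to come from. The names and the rationale in the comments below are my assumptions (they seem to match the BPF_PROG_TEST_RUN limits in net/bpf/test_run.c and XDP_PACKET_HEADROOM in the UAPI headers), not something taken from the PR.

```go
package bpftests

const (
	// progRunPageSize: BPF_PROG_TEST_RUN rejects data_in larger than one page
	// minus the headroom and tailroom the kernel reserves (see bpf_test_init()
	// in net/bpf/test_run.c).
	progRunPageSize = 4096
	// progRunHeadroom appears to correspond to XDP_PACKET_HEADROOM (256) from
	// include/uapi/linux/bpf.h.
	progRunHeadroom = 256
	// progRunTailroom appears to correspond to
	// SKB_DATA_ALIGN(sizeof(struct skb_shared_info)), 320 bytes on 64-bit kernels.
	progRunTailroom = 320

	// schedCLSCtxSize must be at least sizeof(struct __sk_buff) (currently
	// below 256 bytes) for the kernel to accept the ctx of SchedCLS programs.
	schedCLSCtxSize = 256
)

// newBenchBuffers builds the data and ctx buffers; hypothetical helper name.
func newBenchBuffers(isSchedCLS bool) (data, ctx []byte) {
	data = make([]byte, progRunPageSize-progRunHeadroom-progRunTailroom)
	ctx = make([]byte, 0) // non-empty ctx for non-tc programs causes EINVAL
	if isSchedCLS {
		ctx = make([]byte, schedCLSCtxSize)
	}
	return data, ctx
}
```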
(realizing upon subsequent review these magic constants were already here >.<)
This second commit seems to be doing a little bit more than a straight refactor, which makes it harder to review. Ideally a commit should either strictly move code around or change the logic of the code, so we can better evaluate the functional changes and not lose them amongst 800 other lines of code. That said, at least so far it seems pretty similar, with some minor improvements.
Yeah, it looks like in the end there was a log statement that didn't get copied over, plus a couple of minor function signature changes that kept the same logic, just expressed in a better way. Overall the commit makes sense.
> This second commit seems to be doing a little bit more than a straight refactor, which makes it harder to review

I'll break this up into some more granular transformations to make it clearer. Probably something like:

* commit n: create the scaffolding to migrate some chunk of code from A to B
* commit n+1: move the chunk of code from A to B
* ... rinse and repeat until the state of the codebase matches what is currently all done in commit 2.
I agree the transformations should be dead simple to understand at a glance; I wanted to gauge general interest first before doing git surgery on this bit.
Makes sense, yeah. At least to me this seems helpful, but I'm not actively pursuing datapath benchmarking. If someone else in @cilium/sig-datapath is, I'd encourage them to try this out and provide feedback.
```go
			continue
		}

		t.Logf("Skipping program '%s' of type '%s': BPF_PROG_RUN not supported", p.Name, p.Type)
```
Should we still log this to provide some breadcrumbs for people running the tests as to why some programs are skipped? Just concerned that people might draw the wrong conclusions from the test runner silently skipping some progs.
Stray thought: a logical continuation of this work would seem to be allowing runtime program stats to be gathered on demand and exported as metrics (if this isn't already done somewhere in Cilium). I can see this being quite valuable for us in doing profiling of live systems.
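For illustration only, reading the kernel's per-program runtime counters might look roughly like the sketch below, assuming stats collection is enabled (e.g. `sysctl kernel.bpf_stats_enabled=1`) and that cilium/ebpf's ProgramInfo exposes the kernel's run_time_ns/run_cnt via Runtime()/RunCount()-style accessors. The accessor names, the pin path, and the overall flow are assumptions to double-check, not a description of an existing Cilium feature.

```go
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/cilium/ebpf"
)

func main() {
	// The pin path is made up for illustration; any pinned program would do.
	prog, err := ebpf.LoadPinnedProgram("/sys/fs/bpf/some_pinned_prog", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer prog.Close()

	info, err := prog.Info()
	if err != nil {
		log.Fatal(err)
	}

	// Both accessors report false until the kernel has stats for the program.
	runtime, ok := info.Runtime()
	count, ok2 := info.RunCount()
	if ok && ok2 && count > 0 {
		fmt.Printf("%s: %d runs, avg %v/run\n", info.Name, count, runtime/time.Duration(count))
	}
}
```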
This pull request has been automatically marked as stale because it has not had recent activity.
Recently, I added the `BENCH` probe type to bpftrace (bpftrace/bpftrace#4343) for implementing BPF micro benchmarks. I thought that something similar could be handy in Cilium as an extension to the BPF unit test framework for benchmarking various programs, gauging the performance effects of BPF changes, etc., and wondered if others would find it useful. Micro benchmarks are never the full picture when it comes to performance, but still, it could be a nice tool.

So, introduce `BENCH` for implementing BPF micro benchmarks. bpftest (`make run_bpf_tests`) ignores `BENCH` programs; they are only executed when running bpfbench (`make run_bpf_benchmarks`). If a `PKTGEN` program is defined, it is run before the `BENCH` program to generate its input, but `SETUP` and `CHECK` programs are ignored.

As part of this change, refactor bpf_test.go a bit to extract the useful bits and make them reusable to bpfbench. I considered just putting everything inside bpf_test.go, maybe piggybacking on top of Go's `Benchmark*`s somehow, but found this approach unworkable. This led me down the path of implementing benchmarking as a separate tool and making some of the functionality shared between bpftest and bpfbench. I would have liked to use `tparse` for output formatting, but at present it doesn't support reporting for benchmarks, so the output is currently quite raw. (Coincidentally, mfridman/tparse#67 was reported by @joestringer a few years ago, but alas, it has not yet been implemented.)