[RFC] bpf: Introduce BENCH test directive for BPF micro benchmarks #40759
Conversation
Prepare for further refactoring by cleaning up the test structure validation logic.

First, simplify the logic a bit to get rid of the secondary regex match inside checkProgExists and instead just defer the existence check for progs.checkProg until after the first pass; just ensure uniqueness of each setup/pktgen/check program along the way with checkProgUnique. Before, the logic would not catch a test with no defined check until later, when a panic occurred because progs.checkProg was nil; now the existence of a CHECK program is enforced.

Don't rely on the program name at all, only the section names. Before, weird edge cases could occur with strings.Replace() if the test program name itself had the words "setup", "pktgen", or "check" in it.

This change uncovered the case with nat4_port_allocation where the test name in the CHECK for the nat4_port_allocation_tcp and nat4_port_allocation_udp tests was wrong, so fix this.

Signed-off-by: Jordan Rife <jrife@google.com>
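For orientation, the validation pattern described in this commit message could look roughly like the hedged Go sketch below. The struct, the section-prefix layout ("setup/", "pktgen/", "check/"), and the function names are illustrative assumptions rather than the PR's actual code; only the overall shape (classify by section name only, enforce uniqueness during the first pass, check CHECK existence afterwards) is taken from the commit message.

```go
package bpftests

import (
	"fmt"
	"strings"

	"github.com/cilium/ebpf"
)

// testProgs is an illustrative stand-in for the per-test program bundle.
type testProgs struct {
	setupProg, pktgenProg, checkProg *ebpf.ProgramSpec
}

func validateSuite(specs map[string]*ebpf.ProgramSpec) (map[string]*testProgs, error) {
	tests := map[string]*testProgs{}
	for _, spec := range specs {
		// Only the section name decides what kind of program this is;
		// the program name itself is never parsed.
		kind, test, ok := strings.Cut(spec.SectionName, "/")
		if !ok {
			continue
		}
		progs := tests[test]
		if progs == nil {
			progs = &testProgs{}
			tests[test] = progs
		}
		slot, known := map[string]**ebpf.ProgramSpec{
			"setup":  &progs.setupProg,
			"pktgen": &progs.pktgenProg,
			"check":  &progs.checkProg,
		}[kind]
		if !known {
			continue
		}
		// Uniqueness is enforced as we go (the commit's checkProgUnique).
		if *slot != nil {
			return nil, fmt.Errorf("duplicate %s program for test %q", kind, test)
		}
		*slot = spec
	}
	// Existence of CHECK is enforced after the first pass instead of
	// surfacing later as a nil-pointer panic.
	for test, progs := range tests {
		if progs.checkProg == nil {
			return nil, fmt.Errorf("test %q has no CHECK program", test)
		}
	}
	return tests, nil
}
```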
Prepare for the next patch by moving some of the logic from bpf_test.go into helpers.go and exporting it as a set of reusable functions and types:

* ForEachSuite allows you to iterate over all BPF unit test collections one by one, transparently handling file reads and BPF collection preprocessing.
* CollectionToSuite converts a CollectionSpec and Collection for a BPF test into a set of Test structs. It performs all the validation on the test program structure to ensure the existence and uniqueness of setup/pktgen/check programs for each test.
* Test contains all the logic for executing a BPF unit test, including the execution of its setup, pktgen, and check programs.

Signed-off-by: Jordan Rife <jrife@google.com>
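For a sense of how a separate tool such as bpfbench might consume these helpers, here is a hypothetical usage sketch. ForEachSuite, CollectionToSuite, Test, and a Run method are placeholders whose signatures are guesses inferred from the commit message's descriptions, not the PR's actual API.

```go
package bpfbench

import (
	"github.com/cilium/ebpf"
)

// runAll iterates every BPF test collection under testPath and executes each
// test; everything here is an assumption layered on the commit message.
func runAll(testPath string) error {
	return ForEachSuite(testPath, func(spec *ebpf.CollectionSpec, coll *ebpf.Collection) error {
		tests, err := CollectionToSuite(spec, coll)
		if err != nil {
			return err
		}
		for _, t := range tests {
			// Test bundles the setup/pktgen/check execution logic.
			if err := t.Run(); err != nil {
				return err
			}
		}
		return nil
	})
}
```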
Introduce BENCH for implementing BPF micro benchmarks. bpftest (make run_bpf_tests) ignores BENCH programs; they are only executed when running bpfbench (make run_bpf_benchmarks). If a PKTGEN program is defined, it is run before the BENCH program to generate its input, but SETUP and CHECK programs are ignored.

```
jordan@t14:~/code/cilium$ make run_bpf_benchmarks
DOCKER_ARGS=--privileged RUN_AS_ROOT=1 contrib/scripts/builder.sh \
	make -j16 -C bpf/tests/ run_bench \
	"BPF_TEST_FILE="""
make: Entering directory '/go/src/github.com/cilium/cilium/bpf/tests'
go run ./bpfbench \
	-bpf-test-path /go/src/github.com/cilium/cilium/bpf/tests \

tc_egressgw_redirect_bench_example	1000000	1328 ns/op
make: Leaving directory '/go/src/github.com/cilium/cilium/bpf/tests'
```

Signed-off-by: Jordan Rife <jrife@google.com>
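To make the PKTGEN-then-BENCH flow concrete, here is a minimal hedged sketch of what a benchmark runner could do with cilium/ebpf's Program.Test and Program.Benchmark helpers. The function name, buffer sizing, and error handling are illustrative, not the PR's actual bpfbench code.

```go
package bpfbench

import (
	"fmt"
	"time"

	"github.com/cilium/ebpf"
)

// benchOne runs the optional PKTGEN program once to build the input packet,
// then times the BENCH program over `repeat` executions. This is a sketch of
// the flow described in the commit message, not the PR's implementation.
func benchOne(pktgen, bench *ebpf.Program, repeat int) (time.Duration, error) {
	// Same input sizing the existing test runner uses for BPF_PROG_TEST_RUN.
	data := make([]byte, 4096-256-320)

	if pktgen != nil {
		_, out, err := pktgen.Test(data) // single run to generate the packet
		if err != nil {
			return 0, fmt.Errorf("pktgen: %w", err)
		}
		data = out
	}

	// SETUP and CHECK are ignored; only the BENCH program is timed.
	_, perRun, err := bench.Benchmark(data, repeat, func() {})
	if err != nil {
		return 0, fmt.Errorf("bench: %w", err)
	}
	return perRun, nil
}
```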
Thanks for the PR! Overall LGTM, seems like a nice improvement to have better infrastructure for this use case. I had some super minor nits below, nice-to-haves for consistency. The PR will need a rebase anyway.
```
@@ -554,6 +554,11 @@ run_bpf_tests: ## Build and run the BPF unit tests using the cilium-builder cont
		"JUNIT_PATH=$(JUNIT_PATH)" \
		"V=$(BPF_TEST_VERBOSE)"

run_bpf_benchmarks: ## Build and run the BPF benchmarks tests using the cilium-builder container image.
```
nit: `bench-bpf` to follow the existing `bench` prefix for benchmark targets.
Sure. I'll also move the target up there to group all the `bench` stuff together.
```
@@ -50,6 +50,7 @@ clean:
	rm -f $(wildcard *.d)

BPF_TEST_FLAGS:= $(GO_TEST_FLAGS)
BPF_BENCH_FLAGS:=
```
Can we reuse `BENCHFLAGS` instead, or as a basis for this variable? We could move the definition from `Makefile` to `Makefile.defs` for better sharing across different makefiles.
Not sure, but I'll check.
```go
data := make([]byte, 4096-256-320)

// ctx is only used for tc programs
// non-empty ctx passed to non-tc programs will cause error: invalid argument
ctx := make([]byte, 0)
if tt.benchProg.Type() == ebpf.SchedCLS {
	// sizeof(struct __sk_buff) < 256, let's make it 256
	ctx = make([]byte, 256)
```
There are a few magic constants here that seem like they could be better documented by declaring them as vars with comments that reference the corresponding Linux kernel source, to explain why these are the right numbers (or hint to someone who has problems how to adjust them).
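As one possible shape for this suggestion, the constants could be named and pointed at the kernel source they appear to come from. The names and the rationale in the comments below are my assumptions (they seem to match the BPF_PROG_TEST_RUN limits in net/bpf/test_run.c and XDP_PACKET_HEADROOM in the UAPI headers), not something taken from the PR.

```go
package bpftests

const (
	// progRunPageSize: BPF_PROG_TEST_RUN rejects data_in larger than one page
	// minus the headroom and tailroom the kernel reserves (see bpf_test_init()
	// in net/bpf/test_run.c).
	progRunPageSize = 4096
	// progRunHeadroom appears to correspond to XDP_PACKET_HEADROOM (256) from
	// include/uapi/linux/bpf.h.
	progRunHeadroom = 256
	// progRunTailroom appears to correspond to
	// SKB_DATA_ALIGN(sizeof(struct skb_shared_info)), 320 bytes on 64-bit kernels.
	progRunTailroom = 320

	// schedCLSCtxSize must be at least sizeof(struct __sk_buff) (currently
	// below 256 bytes) for the kernel to accept the ctx of SchedCLS programs.
	schedCLSCtxSize = 256
)

// newBenchBuffers builds the data and ctx buffers; hypothetical helper name.
func newBenchBuffers(isSchedCLS bool) (data, ctx []byte) {
	data = make([]byte, progRunPageSize-progRunHeadroom-progRunTailroom)
	ctx = make([]byte, 0) // non-empty ctx for non-tc programs causes EINVAL
	if isSchedCLS {
		ctx = make([]byte, schedCLSCtxSize)
	}
	return data, ctx
}
```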
(realizing upon subsequent review these magic constants were already here >.<)
This second commit seems to be doing a little bit more than a straight refactor, which makes it harder to review. Ideally a commit should either strictly move code around or change the logic of the code, so we can better evaluate the functional changes and not lose them amongst 800 other lines of code. That said, at least so far it seems pretty similar, with some minor improvements.
Yeah, it looks like in the end there was a log statement that didn't get copied over, plus a couple of minor function signature changes that kept the same logic, just expressed in a better way. Overall the commit makes sense.
> This second commit seems to be doing a little bit more than a straight refactor, which makes it harder to review

I'll break this up into some more granular transformations to make it clearer. Probably something like:

* commit n: create the scaffolding to migrate some chunk of code from A to B
* commit n+1: move the chunk of code from A to B
* ... rinse and repeat until the state of the codebase matches what is currently all done in commit 2.
I agree the transformations should be dead simple to understand at a glance; I wanted to gauge general interest first before doing git surgery on this bit.
Makes sense, yeah. At least to me this seems helpful, but I'm not actively pursuing datapath benchmarking. If someone else in @cilium/sig-datapath is, I'd encourage them to try this out and provide feedback.
```go
			continue
		}

		t.Logf("Skipping program '%s' of type '%s': BPF_PROG_RUN not supported", p.Name, p.Type)
```
Should we still log this to provide some breadcrumbs for people running the tests as to why some programs are skipped? Just concerned that people might draw the wrong conclusions from the test runner silently skipping some progs.
Stray thought: a logical continuation of this work would seem to be allowing runtime program stats to be gathered on demand and exported as metrics (if this isn't already done somewhere in Cilium). I can see this being quite valuable for us in doing profiling of live systems.
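For illustration only, reading the kernel's per-program runtime counters might look roughly like the sketch below, assuming stats collection is enabled (e.g. `sysctl kernel.bpf_stats_enabled=1`) and that cilium/ebpf's ProgramInfo exposes the kernel's run_time_ns/run_cnt via Runtime()/RunCount()-style accessors. The accessor names, the pin path, and the overall flow are assumptions to double-check, not a description of an existing Cilium feature.

```go
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/cilium/ebpf"
)

func main() {
	// The pin path is made up for illustration; any pinned program would do.
	prog, err := ebpf.LoadPinnedProgram("/sys/fs/bpf/some_pinned_prog", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer prog.Close()

	info, err := prog.Info()
	if err != nil {
		log.Fatal(err)
	}

	// Both accessors report false until the kernel has stats for the program.
	runtime, ok := info.Runtime()
	count, ok2 := info.RunCount()
	if ok && ok2 && count > 0 {
		fmt.Printf("%s: %d runs, avg %v/run\n", info.Name, count, runtime/time.Duration(count))
	}
}
```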
This pull request has been automatically marked as stale because it has not had recent activity.
Recently, I added the `BENCH` probe type to bpftrace (bpftrace/bpftrace#4343) for implementing BPF micro benchmarks. I thought that something similar could be handy in Cilium as an extension to the BPF unit test framework for benchmarking various programs, gauging the performance effects of BPF changes, etc., and wondered if others would find it useful. Micro benchmarks are never the full picture when it comes to performance, but still, it could be a nice tool.

So, introduce `BENCH` for implementing BPF micro benchmarks. bpftest (`make run_bpf_tests`) ignores `BENCH` programs; they are only executed when running bpfbench (`make run_bpf_benchmarks`). If a `PKTGEN` program is defined, it is run before the `BENCH` program to generate its input, but `SETUP` and `CHECK` programs are ignored.

As part of this change, refactor bpf_test.go a bit to extract the useful bits and make them reusable to bpfbench. I considered just putting everything inside bpf_test.go, maybe piggybacking on top of Go's `Benchmark*`s somehow, but found this approach unworkable. This led me down the path of implementing benchmarking as a separate tool and making some of the functionality shared between bpftest and bpfbench. I would have liked to use `tparse` for output formatting, but at present it doesn't support reporting for benchmarks, so the output is currently quite raw. (Coincidentally, mfridman/tparse#67 was reported by @joestringer a few years ago, but alas, it has not yet been implemented.)