-
Notifications
You must be signed in to change notification settings - Fork 37.7k
bench: update nanobench add -min_time
#23025
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice--thank you for working on this! Using the min duration arg does improve stability for me (lower err%) and is convenient as a runtime option.
Tested Concept/Approach ACK with various suggestions below.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
354fbc0
to
b901174
Compare
Thanks for the reviews! I've incorporated all the suggestions into the right commits and rebased everything. |
-min_duration_ms
-min_time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost-ACK b901174
b901174
to
e7e8900
Compare
Rebased with header cleanups, and signed all commits. |
ACK e7e8900 |
Most importantly, this update fixes a bug in nanobench that always disabled performance counters on linux. It also adds another sanitizer suppression that is caught in clang++ 12.
Introduced in bitcoin#19055, MuHashDiv benchmark used to multiply with a loop based on epochIterations. That does not do what it is supposed to do, because epochIterations() is determined automatically from nanobench. Also, multiplication is not needed for the algorithm (as pointed out by a comment in bitcoin#19055), so it's better to remove this loop.
Moves some of the setup into the benchmark loop so it is possible to run the loop for an arbitrary number of times. Due to recent optimizations in bitcoin#22974 the benchmark now runs much faster, so the inner loop now calls Good() 32 times as often to get better numbers. Renamed the benchmark to AddrManAddThenGood because that's now what is actually tested. To get the the number of just Good(), one needs to subtract the benchmark result of AddrManAdd.
Moves copying of the setup into the benchmark loop so it is possible to run the loop for an arbitrary number of times. The overhead due to copying the candidates inside the loop is about 3%.
When it is not easily possible to stabilize benchmark machine and code the argument -min_time can be used to specify a minimum duration that a benchmark should take. E.g. choose -min_time=1000 if you are willing to wait about 1 second for each benchmark result. The default is now set to 10ms instead of 0, which should make runs on fast machines more stable with negligible slowdown.
This adds some usage description with tips to `bench_bitcoin -h`.
Drops unneeded and adds missing includes
- use ALLOW_BOOL for -list arg instead of ALLOW_ANY - touch up `-asymptote=<n1,n2,n3...>` help - pack Args struct a bit more efficiently - handle args in alphabetical order
The benchmarks can now run much longer due to the minimum of 10ms or directly with -min_time. With -min_time=20000 I could trigger two ubsan errors in the benchmarks, which are fixed in this commit by using unsigned type and adding "& 0xFF".
e7e8900
to
e148a52
Compare
re-ACK e148a52 The Win64 CI failure looks unrelated.
If you retouch, maybe order the addrman benchmark functions alphabetically, which would be the same order as the bench result output and place the two similar functions, AddrManAdd and AddrManAddThenGood, next to each other. |
@jonatack Interesting that your benchmark results are so different. Did you compile for release? I can't believe my computer is ~20 times faster than yours... I compile with |
@martinus no, it's a debug build and the computer is not optimized. Re-building a non-debug build would take a long time (I have a slow 2-core laptop) and I'd already tested it properly above. Sorry for the weird example. |
Code review ACK e148a52 |
This PR updates the nanobench with the latest release from upstream, v4.3.6. It fixes the missing performance counters.
Due to discussions on #22999 I have done some work that should make the benchmark results more reliable. It introduces a new flag
-min_time
that allows to run a benchmark for much longer then the default. When results are unreliable, choosing a large timeframe here should usually get repeatable results even when frequency scaling cannot be disabled. The default is now 10ms. For this to work I have changed theAddrManGood
andEvictionProtection
benchmarks so they work with any number of iterations.Also, this adds more usage documentation to
bench_bitcoin -h
and I've cherry-picked two changes from #22999 authored by Jon Atack