-
Notifications
You must be signed in to change notification settings - Fork 37.8k
bench: Add logging benchmark #18815
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bench: Add logging benchmark #18815
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Concept ACK |
Would it make sense to also benchmark the logging setup we use in the fuzz tests to avoid logging to disk? static TestingSetup setup{
CBaseChainParams::REGTEST,
{
"-nodebuglogfile",
},
}; |
To be honest, I think the idea of logging being performance critical is a bit scary. It was never meant like this. We shouldn't log in inner loops and such. Sure, debug flags are enabled it's different but one can expect a performance hit in that case. It's not really something to optimize for IMO. (not opposed to adding a benchmark, but I just think it's not a very high priority thing to optimize and a benchmark encourages this) Edit: I understand it now in the context of #18678 |
Jup, this is not primarily to encourage people to optimize this function, but rather to make finding performance bottlenecks (#18538 (comment)) or regressions (#17218) easier. |
I think a big issue with the logging is that it does a bunch of string concatenation and allocation for formatting, instead of format strings+printf. When you do this in a tight loop/IBD, I'm not surprised that you would see performance issues. My guess for the thread_local issue is just the fact that it has to allocate and concatenate even more when it formats threads names into the log. |
The strings are written to disk unbuffered: Line 56 in 844d207
If I had to guess, even with SSD this might be the dominant factor, and not memory allocation for strings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK fa22a52 (jamesob/ackr/18815.1.MarcoFalke.bench_add_logging_benchm
)
Built and tested with ./src/bench/bench_bitcoin -filter='.*Logging.*'
.
Seems like a no-brainer to have this benchmark data on hand to avoid embarrassing
regressions due to some quirk in the logging system.
Worth noting that some effort was made previously to move logging to a separate
thread (#13200), though I think that implementation did do string interpolation
inline (synchronously). I had tried to move interpolation to the logging
thread, but the difficulty of __VA_ARGS__
and getting the right function
signatures deterred me; may be doable though.
Show platform data
Tested on Linux-5.3.0-51-generic-x86_64-with-Ubuntu-18.04-bionic
Configured with ./configure LDFLAGS=-L/home/james/src/bitcoin/db4/lib/ CPPFLAGS=-I/home/james/src/bitcoin/db4/include/ CXXFLAGS=-std=c++11 -Werror=thread-safety-analysis --enable-wallet --enable-debug --with-daemon CXX=/usr/bin/clang++-4.0 CC=/usr/bin/clang-4.0 PYTHONPATH= --disable-shared --with-pic --enable-benchmark=no --with-bignum=no --enable-module-recovery --disable-jni --disable-shared --with-pic --enable-benchmark=no --with-bignum=no --enable-module-recovery --disable-jni --no-create --no-recursion
Compiled with /usr/bin/ccache /usr/bin/clang++-4.0 -std=c++11 -mavx -mavx2 -std=c++11 -Werror=thread-safety-analysis -O0 -g3 -ftrapv -Wstack-protector -fstack-protector-all -msse4 -msha -msse4.1 -msse4.2 i
Compiler version: clang version 4.0.1-10 (tags/RELEASE_401/final)
Show signature data
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
ACK fa22a52f2aae3d1962a386d973de192e0de8fd65 ([`jamesob/ackr/18815.1.MarcoFalke.bench_add_logging_benchm`](https://github.com/jamesob/bitcoin/tree/ackr/18815.1.MarcoFalke.bench_add_logging_benchm))
Built and tested with `./src/bench/bench_bitcoin -filter='.*Logging.*'`.
Seems like a no-brainer to have this benchmark data on hand to avoid embarrassing
regressions due to some quirk in the logging system.
Worth noting that some effort was made previously to move logging to a separate
thread (#13200), though I think that implementation did do string interpolation
inline (synchronously). I had tried to move interpolation to the logging
thread, but the difficulty of `__VA_ARGS__` and getting the right function
signatures deterred me; may be doable though.
<details><summary>Show platform data</summary>
<p>
Tested on Linux-5.3.0-51-generic-x86_64-with-Ubuntu-18.04-bionic
Configured with ./configure LDFLAGS=-L/home/james/src/bitcoin/db4/lib/ CPPFLAGS=-I/home/james/src/bitcoin/db4/include/ CXXFLAGS=-std=c++11 -Werror=thread-safety-analysis --enable-wallet --enable-debug --with-daemon CXX=/usr/bin/clang++-4.0 CC=/usr/bin/clang-4.0 PYTHONPATH= --disable-shared --with-pic --enable-benchmark=no --with-bignum=no --enable-module-recovery --disable-jni --disable-shared --with-pic --enable-benchmark=no --with-bignum=no --enable-module-recovery --disable-jni --no-create --no-recursion
Compiled with /usr/bin/ccache /usr/bin/clang++-4.0 -std=c++11 -mavx -mavx2 -std=c++11 -Werror=thread-safety-analysis -O0 -g3 -ftrapv -Wstack-protector -fstack-protector-all -msse4 -msha -msse4.1 -msse4.2 i
Compiler version: clang version 4.0.1-10 (tags/RELEASE_401/final)
</p></details>
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCgAdFiEEpm2+LMqPsWj2rAwWyrywS1A+5mgFAl6xh6UACgkQyrywS1A+
5mjoJg//ZgjGIa7dmzkSwvbUHW0sg/u2ShTCsOrWWfx6cdZ7ApQlPOH5nDcjO7yb
Qr61POE/3uA3kUrcUvVvlwqStpbwbjqNNNR5g8OklGE/N2N4Q621ze1vtmWJOuU2
ik7APJT3MlU/fOEEqRGSdS9Jow2P45CqrPTSgCjJA0oPlhTn6lrtaCtDEgT/dFR3
TJlHO/Oe53zK2wpkzUVzKXswvNLsSSPYHIIp/hFYRNanfnnmFg16YTD73sr+CrSV
K6JGkTbIwtXEIHRuIglTHRMW46U3MoAMhg2/e1L4Ssdq1oUI0aCUNC7nww066zI6
1reKeu+fkfOVLAYNb+Ke18GqOrWwBdPUut+o8bTvK43nMYZHH2eRL/4orS/WKSWG
PGV1qSBDY8F7DdtLUeIXmvWxZbCFubFFpx/zl/Ur8S9+Hr96jWilt9l8t2FOhzbF
nzCV+LkfUYhpHYFgCxtXg3px6sKQ02S3A8NyyfWTkE/LIAMKrDcAG8JJxOeuAAMm
iay0gemRuqZKA4zGrhZTc85Js0Qjpu9LAnpWSNKraGEzwbxLCNIFN9J0LvogERhn
+4RbxIyJewk5P4Sb6k+lTuegmBu3CjfyohCndfuPEGdFI/++cd/ynIXlhDIjvDxq
CeeYGoBm8SjxtEMkeBCeuw4G+tZCfB2n8m8VccknjInX7t0lQSM=
=yVlS
-----END PGP SIGNATURE-----
Is it? because |
faf0302
to
fafe06c
Compare
Might make finding performance bottlenecks or regressions (#17218) easier.
For example, fuzzing relies on disabled logging to be as fast as possible.