Skip to content

Conversation

willcl-ark
Copy link
Member

@willcl-ark willcl-ark commented Mar 22, 2023

The --enable-debug flag should default to enabling debug consistency checks. This PR enables addrman and mempool consistency checks at a rate of once per 1000 operations, by default when running in debug mode.

To opt out of the checks when running a debug build specify the runtime -checkmempool=0 and -checkaddrman=0 debug options, as described in developer-notes.md

On my machine, configuring these checks to run every 1000 operations saw them run no more than once per minute (and often much less frequently), which did not introduce any significant overhead.

This PR does not include a -checkblockindex default consistency check for mainnet when running in debug mode.

fixes #24709

@DrahtBot
Copy link
Contributor

DrahtBot commented Mar 22, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK MarcoFalke, mzumsande

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK.

Though, I haven't checked how expensive those checks are. If they make the program unusable slow, I can see that someone would Nack this being on by default.

What about mempool checks?

src/addrman.h Outdated
Comment on lines 30 to 35
#ifdef DEBUG_ADDRMAN
static constexpr int32_t DEFAULT_ADDRMAN_CONSISTENCY_CHECKS{1};
#else
static constexpr int32_t DEFAULT_ADDRMAN_CONSISTENCY_CHECKS{0};
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#ifdef DEBUG_ADDRMAN
static constexpr int32_t DEFAULT_ADDRMAN_CONSISTENCY_CHECKS{1};
#else
static constexpr int32_t DEFAULT_ADDRMAN_CONSISTENCY_CHECKS{0};
#endif
static constexpr int32_t DEFAULT_ADDRMAN_CONSISTENCY_CHECKS{
#ifdef DEBUG_ADDRMAN
1
#else
0
#endif
};

style nit (feel free to ignore, if you don't like it)

Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK - I've run with -checkaddrman=1 before, never found bitcoind to be unusably slow. Not sure about the mempool checks though.

Generally speaking, I think it's a bit strange that --enable-debug mixes two different concepts:
1.) Compiler flags for attaching a debugger
2.) More (but costly) runtime checks (like -DDEBUG_LOCKORDER currently, plus additionally custom checks for addrman / mempool etc.)

I don't think it's very common that people would want both 1 and 2 at the same time:
If I want to attach a debugger for better analysis of a specific part of the code, I'm not interested in checks for unrelated parts - if I want to run with more checks to find possible unknown bugs, I'm not interested in attaching a debugger.
So why do we have one flag for the two?

@maflcko
Copy link
Member

maflcko commented Mar 22, 2023

Maybe there can be a --enable-debug-symbols (implied by --enable-debug) to set -g for the compiler?

@willcl-ark
Copy link
Member Author

@mzumsande FYI this PR is equivalent to running with -checkaddrman=1 which does use quite some resources as it does a full addrman consistency check on pretty much every addrman operation (Add, Good, Attempt, Get, Select, Connected etc.).

For me this manifests as saturating an entire core via a combination of the b-msghand, b-net and b-opencon threads as these all spawn consistency checks. My bitcoind process usually idles at ~5%, so to have one core @ 100% is a pretty big change, but my machine has 8 physical cores so it's also not too terrible...

I have not profiled with perf as it is so obvious in top.

I also agree that there are two debug scenarios (which are already mixed) being further mixed by this PR, i) enabling debugging symbols and ii) running the application in "debug mode".

In a vacuum I would agree with @MarcoFalke in introducing an --enable-debug-symbols flag, however in this project, with --enable-debug as the encumbent and there being a lot of developer expectation around what it does, perhaps the new flag should be --debug-mode which both implies --enable-debug and actually runs with all these debug mode options enabled by default?

src/addrman.h Outdated
@@ -27,7 +27,11 @@ class InvalidAddrManVersionError : public std::ios_base::failure
class AddrManImpl;

/** Default for -checkaddrman */
#ifdef DEBUG_ADDRMAN
static constexpr int32_t DEFAULT_ADDRMAN_CONSISTENCY_CHECKS{1};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these checks are going to be enabled by default on some builds, shouldn't the default be much higher; eg 100 or 1000? Doing it on every call is useful for live debugging or small scale tests, but seems overkill for a node's runtime default?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree, the performance hit is too much for any kind of normal operation.

I updated the branch with @MarcoFalke's suggestion along with a new debug option (which I called --enable-debug-checks, and not --debug-slow).

I'll run a few runtime tests on various values for the checks to see if I can find a sweet spot for the frequency, then update the doc commit and push an update here.

@willcl-ark
Copy link
Member Author

willcl-ark commented May 22, 2023

OK pushed an update based on feedback here, in #24709 and #27586.

  • PR description updated with new details of this pull.
  • My opinion is that running these checks at a low enough frequency on mainnet debug builds is desirable and non-intrusive
  • Selecting values of 1000 for both consistency checks worked with no noticable overhead for me, and appeared to run once every minute or so, although my test node is not listening so others may see higher frequency in "production debug" builds which are listening.
  • I am unsure if mempol_args.h is the right place for DEFAULT_MEMPOOL_DEBUG_CHECKS, and open to suggestions on that.
  • Happy to sqush both doc commits into their respective build commits, or indeed squash all 4 into a single commit, if preferred.

When configured with --enable-debug run addrman consistency checks every
1000 operations by default.

Running every 1000 operations minimises the CPU overhead on the node and
keeps it viable to run a node in debug mode on mainnet.
When configured with --enable-debug run mempool consistency checks every
1000 operations by default.
@willcl-ark willcl-ark force-pushed the enable-addrman-consist branch from 7b32641 to 9c2f4ca Compare May 30, 2023 08:23
@maflcko
Copy link
Member

maflcko commented Jun 27, 2023

Is this still relevant, given that the issue was closed?

@willcl-ark
Copy link
Member Author

Nope, I also agree with the conclusions from AJ in 24709 about just setting this as a runtime option, so will close this.

@willcl-ark willcl-ark closed this Jun 27, 2023
@bitcoin bitcoin locked and limited conversation to collaborators Jun 26, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable consistency checks by default with --enable-debug
5 participants