Skip to content

Conversation

LarryRuane
Copy link
Contributor

@LarryRuane LarryRuane commented Jul 8, 2022

This PR is a second attempt at #19594. This PR has two motivations:

  • Improve code hygiene by eliminating a global variable, mapBlocksUnknownParent
  • Fix fuzz test OOM when running too long (see #19594 comment)

A minor added advantage is to release mapBlocksUnknownParent memory when the reindexing phase is done. The current situation is somewhat similar to a memory leak because this map exists unused for the remaining lifetime of the process. It's true that this map should be empty of data elements after use, but its internal metadata (indexing structures, etc.) can have non-trivial size because there can be many thousands of simultaneous elements in this map.

This PR helps our efforts to reduce the use of global variables. This variable isn't just global, it's hidden inside a function (it looks like a local variable but has the static attribute).

This global variable exists because the -reindex processing code calls LoadExternalBlockFile() multiple times (once for each block file), but that function must preserve some state between calls (the mapBlocksUnknownParent map). This PR fixes this by allocating this map as a local variable in the caller's scope and passing it in on each call. When reindexing completes, the map goes out of scope and is deallocated.

I tested this manually by reindexing on mainnet and signet. Also, the existing feature_reindex.py functional test passes.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 8, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #25499 (Use steady clock for all millis bench logging by MarcoFalke)
  • #16981 (Improve runtime performance of --reindex by LarryRuane)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Member

@sipa sipa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 3bd8416

Copy link
Contributor

@shaavan shaavan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

  • This PR changes the scope of mapBlocksUnknown parent to the local variable of the caller function of LoadExternalBlockFile.

  • I agree with converting the variable from a static global type to a local type.

    The current situation is somewhat similar to a memory leak because this map exists unused for the remaining lifetime of the process. It's true that this map should be empty of data elements after use, but its internal metadata (indexing structures, etc.) can have non-trivial size because there can be many thousands of simultaneous elements in this map.

    I agree with the reasoning, and this seems to cause an OOM error during fuzz testing.

  • I verified that the code refactoring was concise and clear. The code changes are purely refactoring modifications and do not introduce and change in code functionality.

  • The added documentation for the LoadExternalBlockFile, clearly explains the usage of the function and the reasoning for the used type of the arguments.

  • I was able to compile and run this PR on Ubuntu 22.04 successfully. And I was able to reindex on the signet network using the following command successfully

./src/bitcoind -signet -reindex

Since, according to the guidelines, we need to include all the headers directly used in a file, I would suggest two instances of such inclusion.

  1. #include <map> in src/node/blockstorage.cpp and src/test/fuzz/load_external_block_file.cpp file.
  2. #include <cassert> in src/validation.cpp file.

@theStack
Copy link
Contributor

theStack commented Jul 8, 2022

Concept ACK

@LarryRuane LarryRuane force-pushed the 2022-07-mapBlocksUnknownParent branch from 3bd8416 to d6bbb65 Compare July 8, 2022 16:28
@LarryRuane
Copy link
Contributor Author

Force-pushed to address review comments (thanks @sipa and @shaavan!)

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-review ACK d6bbb65 (also tested -reindexing on signet)

Nice improvement! Probably it's worth it to investigate if the code-base contains more instances of static variables inside functions that don't need to exist for the whole lifetime of the process and can thus be eliminiated by using a limited scope instead.

@LarryRuane LarryRuane force-pushed the 2022-07-mapBlocksUnknownParent branch from d6bbb65 to e963615 Compare July 9, 2022 16:31
@LarryRuane
Copy link
Contributor Author

Force-pushed to address review comment (change variable name dps back to the original name dbp), thanks @theStack.

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK e963615

Copy link
Contributor

@shaavan shaavan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reACK e963615

Changes since my last review:

  • Renamed dps -> dbp
  • Added the headers <map> and <cassert> where they were used.
  • Rewritten assert condition to make the code simpler.

Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review ACK e963615

I agree that it is good to remove the global map (I think this could also improve memory usage a little bit if the -reindex didn't finish successfully in case of block file corruption, and we switched to IBD without being able to connect and remove all entries from the map).

I don't get how this could fix a fuzzer OOM error, #19594 (comment) also doesn't elaborate on this.

@@ -834,6 +835,9 @@ void ThreadImport(ChainstateManager& chainman, std::vector<fs::path> vImportFile
// -reindex
if (fReindex) {
int nFile = 0;
// Map of disk positions for blocks with unknown parent (only used for reindex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit (no need to re-push for that): "(only used for reindex)" is obvious and could be dropped now that this got moved into a code block conditional on fReindex.

@LarryRuane
Copy link
Contributor Author

LarryRuane commented Jul 11, 2022

I don't get how this could fix a fuzzer OOM error, #19594 (comment) also doesn't elaborate on this.

The fuzzer calls LoadExternalBlockFile() many times, and without this PR, each call adds entries to the static multimap that are never removed (they would be in real life, almost always, but not with random input).

It's easy to reproduce the OOM problem without causing an actual OOM (which could take a long time depending on your system's memory resources):

$ FUZZ=load_external_block_file src/test/fuzz/fuzz

and while that's running, in another window, monitor its memory usage. I ran ps -C b-test -o rss= every 10 minutes (for some reason, the fuzz executable is called b-test when running). With the master branch (units are kbyte):

104664
112532
121912
156560

(Then I stopped the test, clearly it continuously grows.) With this PR, there's expected startup growth, then it's unchanging:

104688
109440
109440
109440

@LarryRuane
Copy link
Contributor Author

LarryRuane commented Jul 17, 2022

@practicalswift @promag @amitiuttarwar
You had previously ACKed PR #19594 by @hebasto that this PR replaces, maybe you would like to review, thanks.

Co-authored-by: Larry Ruane <larryruane@gmail.com>
@LarryRuane LarryRuane force-pushed the 2022-07-mapBlocksUnknownParent branch from e963615 to dd065da Compare July 18, 2022 18:07
@LarryRuane
Copy link
Contributor Author

Force-pushed rebase for merge conflict.

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK dd065da

Copy link
Contributor

@shaavan shaavan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reACK dd065da

Changes since my last review:

  1. Rebased over the master.

@mzumsande
Copy link
Contributor

re-ACK dd065da

@fanquake fanquake merged commit 5871b5b into bitcoin:master Jul 29, 2022
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jul 29, 2022
@LarryRuane LarryRuane deleted the 2022-07-mapBlocksUnknownParent branch July 29, 2022 20:23
fanquake added a commit that referenced this pull request Dec 8, 2022
07dfbb5 Make static nLastFlush and nLastWrite Chainstate members (Aurèle Oulès)

Pull request description:

  Fixes #22189.

  The `static std::multimap<uint256, FlatFilePos> mapBlocksUnknownParent; ` referenced in the issue was already fixed by #25571. I don't believe Chainstate references any other static variables.

ACKs for top commit:
  jamesob:
    ACK 07dfbb5 ([`jamesob/ackr/26513.1.aureleoules.make_static_nlastflush_a`](https://github.com/jamesob/bitcoin/tree/ackr/26513.1.aureleoules.make_static_nlastflush_a))
  theStack:
    Concept and code-review ACK 07dfbb5

Tree-SHA512: 0f26463c079bbc5e0e62707d4ca4c8c9bbb99edfa3391d48d4915d24e2a1190873ecd4f9f11da25b44527671cdc82c41fd8234d56a4592a246989448d34406b0
@bitcoin bitcoin locked and limited conversation to collaborators Jul 29, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants