-
Notifications
You must be signed in to change notification settings - Fork 37.7k
refactor: Make mapBlocksUnknownParent local, and rename it #25571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: Make mapBlocksUnknownParent local, and rename it #25571
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK 3bd8416
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
-
This PR changes the scope of mapBlocksUnknown parent to the local variable of the caller function of LoadExternalBlockFile.
-
I agree with converting the variable from a
static global
type to alocal
type.The current situation is somewhat similar to a memory leak because this map exists unused for the remaining lifetime of the process. It's true that this map should be empty of data elements after use, but its internal metadata (indexing structures, etc.) can have non-trivial size because there can be many thousands of simultaneous elements in this map.
I agree with the reasoning, and this seems to cause an OOM error during fuzz testing.
-
I verified that the code refactoring was concise and clear. The code changes are purely refactoring modifications and do not introduce and change in code functionality.
-
The added documentation for the LoadExternalBlockFile, clearly explains the usage of the function and the reasoning for the used type of the arguments.
-
I was able to compile and run this PR on Ubuntu 22.04 successfully. And I was able to reindex on the signet network using the following command successfully
./src/bitcoind -signet -reindex
Since, according to the guidelines, we need to include all the headers directly used in a file, I would suggest two instances of such inclusion.
#include <map>
in src/node/blockstorage.cpp and src/test/fuzz/load_external_block_file.cpp file.#include <cassert>
in src/validation.cpp file.
Concept ACK |
3bd8416
to
d6bbb65
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code-review ACK d6bbb65 (also tested -reindex
ing on signet)
Nice improvement! Probably it's worth it to investigate if the code-base contains more instances of static
variables inside functions that don't need to exist for the whole lifetime of the process and can thus be eliminiated by using a limited scope instead.
d6bbb65
to
e963615
Compare
Force-pushed to address review comment (change variable name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK e963615
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reACK e963615
Changes since my last review:
- Renamed
dps
->dbp
- Added the headers
<map>
and<cassert>
where they were used. - Rewritten assert condition to make the code simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review ACK e963615
I agree that it is good to remove the global map (I think this could also improve memory usage a little bit if the -reindex
didn't finish successfully in case of block file corruption, and we switched to IBD without being able to connect and remove all entries from the map).
I don't get how this could fix a fuzzer OOM error, #19594 (comment) also doesn't elaborate on this.
@@ -834,6 +835,9 @@ void ThreadImport(ChainstateManager& chainman, std::vector<fs::path> vImportFile | |||
// -reindex | |||
if (fReindex) { | |||
int nFile = 0; | |||
// Map of disk positions for blocks with unknown parent (only used for reindex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit (no need to re-push for that): "(only used for reindex)" is obvious and could be dropped now that this got moved into a code block conditional on fReindex
.
The fuzzer calls It's easy to reproduce the OOM problem without causing an actual OOM (which could take a long time depending on your system's memory resources):
and while that's running, in another window, monitor its memory usage. I ran
(Then I stopped the test, clearly it continuously grows.) With this PR, there's expected startup growth, then it's unchanging:
|
@practicalswift @promag @amitiuttarwar |
Co-authored-by: Larry Ruane <larryruane@gmail.com>
e963615
to
dd065da
Compare
Force-pushed rebase for merge conflict. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK dd065da
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK dd065da |
07dfbb5 Make static nLastFlush and nLastWrite Chainstate members (Aurèle Oulès) Pull request description: Fixes #22189. The `static std::multimap<uint256, FlatFilePos> mapBlocksUnknownParent; ` referenced in the issue was already fixed by #25571. I don't believe Chainstate references any other static variables. ACKs for top commit: jamesob: ACK 07dfbb5 ([`jamesob/ackr/26513.1.aureleoules.make_static_nlastflush_a`](https://github.com/jamesob/bitcoin/tree/ackr/26513.1.aureleoules.make_static_nlastflush_a)) theStack: Concept and code-review ACK 07dfbb5 Tree-SHA512: 0f26463c079bbc5e0e62707d4ca4c8c9bbb99edfa3391d48d4915d24e2a1190873ecd4f9f11da25b44527671cdc82c41fd8234d56a4592a246989448d34406b0
This PR is a second attempt at #19594. This PR has two motivations:
mapBlocksUnknownParent
A minor added advantage is to release
mapBlocksUnknownParent
memory when the reindexing phase is done. The current situation is somewhat similar to a memory leak because this map exists unused for the remaining lifetime of the process. It's true that this map should be empty of data elements after use, but its internal metadata (indexing structures, etc.) can have non-trivial size because there can be many thousands of simultaneous elements in this map.This PR helps our efforts to reduce the use of global variables. This variable isn't just global, it's hidden inside a function (it looks like a local variable but has the
static
attribute).This global variable exists because the
-reindex
processing code callsLoadExternalBlockFile()
multiple times (once for each block file), but that function must preserve some state between calls (themapBlocksUnknownParent
map). This PR fixes this by allocating this map as a local variable in the caller's scope and passing it in on each call. When reindexing completes, the map goes out of scope and is deallocated.I tested this manually by reindexing on mainnet and signet. Also, the existing
feature_reindex.py
functional test passes.