-
Notifications
You must be signed in to change notification settings - Fork 37.7k
prune: scan and unlink already pruned block files on startup #26533
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
prune: scan and unlink already pruned block files on startup #26533
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
58e5723
to
3b0427f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Concept ACK
3b0427f
to
9cac322
Compare
@MarcoFalke done. |
9cac322
to
cd136f6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to ignore the nit
e2a1e14
to
5b3a583
Compare
0f6044a
to
4fb38cd
Compare
dd061ae
to
886ef38
Compare
886ef38
to
3141eab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK with added functional test 3141eab.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 3141eab
Left a non-blocking nit. No need to do it.
const bool removed_blockfile{fs::remove(BlockFileSeq().FileName(pos), ec)}; | ||
const bool removed_undofile{fs::remove(UndoFileSeq().FileName(pos), ec)}; | ||
if (removed_blockfile || removed_undofile) { | ||
LogPrint(BCLog::BLOCKSTORE, "Prune: %s deleted blk/rev (%05u)\n", __func__, *it); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could make use of the std::error_code:
e.g.
else {
LogPrint(BCLog::BLOCKSTORE, "Prune: failed to removed block/undo file, error %s", ec.message());
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
Looks good to me, planning to do a more thorough review tomorrow. A (non-blocking) suggestion on how to write the functional test shorter. In particular, the sync_blocks
call is basically a no-op if we only have one node:
diff --git a/test/functional/feature_remove_pruned_files_on_startup.py b/test/functional/feature_remove_pruned_files_on_startup.py
index ca0e5ace9..5df101863 100755
--- a/test/functional/feature_remove_pruned_files_on_startup.py
+++ b/test/functional/feature_remove_pruned_files_on_startup.py
@@ -13,11 +13,9 @@ class FeatureRemovePrunedFilesOnStartupTest(BitcoinTestFramework):
self.extra_args = [["-fastprune", "-prune=1"]]
def mine_batches(self, blocks):
- n = blocks // 250
- for _ in range(n):
+ for _ in range(blocks // 250):
self.generate(self.nodes[0], 250)
self.generate(self.nodes[0], blocks % 250)
- self.sync_blocks()
def run_test(self):
blk0 = os.path.join(self.nodes[0].datadir, self.nodes[0].chain, 'blocks', 'blk00000.dat')
@@ -25,10 +23,8 @@ class FeatureRemovePrunedFilesOnStartupTest(BitcoinTestFramework):
blk1 = os.path.join(self.nodes[0].datadir, self.nodes[0].chain, 'blocks', 'blk00001.dat')
rev1 = os.path.join(self.nodes[0].datadir, self.nodes[0].chain, 'blocks', 'rev00001.dat')
self.mine_batches(800)
- fo1 = os.open(blk0, os.O_RDONLY)
- fo2 = os.open(rev1, os.O_RDONLY)
- fd1 = os.fdopen(fo1)
- fd2 = os.fdopen(fo2)
+ fd1 = os.fdopen(os.open(blk0, os.O_RDONLY))
+ fd2 = os.fdopen(os.open(rev1, os.O_RDONLY))
self.nodes[0].pruneblockchain(600)
# Windows systems will not remove files with an open fd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code-review ACK 3141eab
FWIW, was curious how calling fs::remove
on thousands of non-existing files would affect the performance (on mainnet, there is currently >3400 .blk/.rev files). But as expected, it's not a big deal:
$ cat remove_test.cpp
#include <filesystem>
#include <string>
int main() {
for (int i = 0; i < 10000; ++i) {
std::string filename = "/home/thestack/" + std::to_string(i) + ".dat";
std::filesystem::remove(filename);
}
}
$ c++ -std=c++17 -o remove_test remove_test.cpp
$ time ./remove_test
0m00.08s real 0m00.01s user 0m00.07s system
ACK 3141eab |
Summary: There are a few cases where we can mark a block and undo file as pruned in our block index, but not actually remove the files from disk. - If we call FindFilesToPrune or FindFilesToPruneManual and crash before UnlinkPrunedFiles. - If on Windows there is an open file handle to the file somewhere else when calling fs::remove in UnlinkPrunedFiles. This could be from another process, or if we are calling ReadBlockFromDisk/ReadRawBlockFromDisk without having a lock on cs_main (allowed since D2979). This PR mitigates this by scanning all pruned block files on startup after LoadBlockIndexDB and unlinking them again. This is a backport of [[bitcoin/bitcoin#26533 | core#26533]] Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D16102
There are a few cases where we can mark a block and undo file as pruned in our block index, but not actually remove the files from disk.
FindFilesToPrune
orFindFilesToPruneManual
and crash beforeUnlinkPrunedFiles
.fs::remove
inUnlinkPrunedFiles
(https://en.cppreference.com/w/cpp/filesystem/remove, https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-deletefilew#remarks). This could be from another process, or if we are callingReadBlockFromDisk
/ReadRawBlockFromDisk
without having a lock oncs_main
(which has been allowed since ccd8ef6).This PR mitigates this by scanning all pruned block files on startup after
LoadBlockIndexDB
and unlinking them again.