Skip to content

prune: scan and unlink already pruned block files on startup #26533

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 28, 2023

Conversation

andrewtoth
Copy link
Contributor

@andrewtoth andrewtoth commented Nov 18, 2022

There are a few cases where we can mark a block and undo file as pruned in our block index, but not actually remove the files from disk.

  1. If we call FindFilesToPrune or FindFilesToPruneManual and crash before UnlinkPrunedFiles.
  2. If on Windows there is an open file handle to the file somewhere else when calling fs::remove in UnlinkPrunedFiles (https://en.cppreference.com/w/cpp/filesystem/remove, https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-deletefilew#remarks). This could be from another process, or if we are calling ReadBlockFromDisk/ReadRawBlockFromDisk without having a lock on cs_main (which has been allowed since ccd8ef6).

This PR mitigates this by scanning all pruned block files on startup after LoadBlockIndexDB and unlinking them again.

@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 18, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK pablomartin4btc, furszy, theStack, achow101
Concept ACK hernanmarino, MarcoFalke

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #27125 (refactor, kernel: Decouple ArgsManager from blockstorage by TheCharlatan)
  • #27039 (blockstorage: do not flush block to disk if it is already there by pinheadmz)
  • #26664 (refactor: make some BlockManager members const by promag)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Contributor

@hernanmarino hernanmarino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

Copy link
Member

@pablomartin4btc pablomartin4btc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cr & tested ACK. I've also checked a bit of the history if your proposal in all these PRs #26308, #26316 (closed in favour of this current one), #17161 and #25232.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Concept ACK

@andrewtoth andrewtoth force-pushed the scan-and-unlink-pruned-files branch from 3b0427f to 9cac322 Compare December 6, 2022 17:31
@andrewtoth
Copy link
Contributor Author

@MarcoFalke done.

@andrewtoth andrewtoth force-pushed the scan-and-unlink-pruned-files branch from 9cac322 to cd136f6 Compare December 6, 2022 18:16
Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to ignore the nit

@andrewtoth andrewtoth force-pushed the scan-and-unlink-pruned-files branch from e2a1e14 to 5b3a583 Compare December 20, 2022 14:07
@andrewtoth andrewtoth marked this pull request as draft December 20, 2022 15:13
@andrewtoth andrewtoth force-pushed the scan-and-unlink-pruned-files branch 3 times, most recently from 0f6044a to 4fb38cd Compare December 20, 2022 16:01
@andrewtoth andrewtoth force-pushed the scan-and-unlink-pruned-files branch from dd061ae to 886ef38 Compare December 20, 2022 17:26
@andrewtoth andrewtoth marked this pull request as ready for review December 20, 2022 18:47
@andrewtoth andrewtoth force-pushed the scan-and-unlink-pruned-files branch from 886ef38 to 3141eab Compare January 5, 2023 22:04
Copy link
Member

@pablomartin4btc pablomartin4btc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK with added functional test 3141eab.

Copy link
Member

@furszy furszy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 3141eab
Left a non-blocking nit. No need to do it.

Comment on lines +579 to +583
const bool removed_blockfile{fs::remove(BlockFileSeq().FileName(pos), ec)};
const bool removed_undofile{fs::remove(UndoFileSeq().FileName(pos), ec)};
if (removed_blockfile || removed_undofile) {
LogPrint(BCLog::BLOCKSTORE, "Prune: %s deleted blk/rev (%05u)\n", __func__, *it);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could make use of the std::error_code:
e.g.

else {
   LogPrint(BCLog::BLOCKSTORE, "Prune: failed to removed block/undo file, error %s", ec.message());
}

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

Looks good to me, planning to do a more thorough review tomorrow. A (non-blocking) suggestion on how to write the functional test shorter. In particular, the sync_blocks call is basically a no-op if we only have one node:

diff --git a/test/functional/feature_remove_pruned_files_on_startup.py b/test/functional/feature_remove_pruned_files_on_startup.py
index ca0e5ace9..5df101863 100755
--- a/test/functional/feature_remove_pruned_files_on_startup.py
+++ b/test/functional/feature_remove_pruned_files_on_startup.py
@@ -13,11 +13,9 @@ class FeatureRemovePrunedFilesOnStartupTest(BitcoinTestFramework):
         self.extra_args = [["-fastprune", "-prune=1"]]
 
     def mine_batches(self, blocks):
-        n = blocks // 250
-        for _ in range(n):
+        for _ in range(blocks // 250):
             self.generate(self.nodes[0], 250)
         self.generate(self.nodes[0], blocks % 250)
-        self.sync_blocks()
 
     def run_test(self):
         blk0 = os.path.join(self.nodes[0].datadir, self.nodes[0].chain, 'blocks', 'blk00000.dat')
@@ -25,10 +23,8 @@ class FeatureRemovePrunedFilesOnStartupTest(BitcoinTestFramework):
         blk1 = os.path.join(self.nodes[0].datadir, self.nodes[0].chain, 'blocks', 'blk00001.dat')
         rev1 = os.path.join(self.nodes[0].datadir, self.nodes[0].chain, 'blocks', 'rev00001.dat')
         self.mine_batches(800)
-        fo1 = os.open(blk0, os.O_RDONLY)
-        fo2 = os.open(rev1, os.O_RDONLY)
-        fd1 = os.fdopen(fo1)
-        fd2 = os.fdopen(fo2)
+        fd1 = os.fdopen(os.open(blk0, os.O_RDONLY))
+        fd2 = os.fdopen(os.open(rev1, os.O_RDONLY))
         self.nodes[0].pruneblockchain(600)
 
         # Windows systems will not remove files with an open fd

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-review ACK 3141eab

FWIW, was curious how calling fs::remove on thousands of non-existing files would affect the performance (on mainnet, there is currently >3400 .blk/.rev files). But as expected, it's not a big deal:

$ cat remove_test.cpp                            
#include <filesystem>
#include <string>
int main() {
    for (int i = 0; i < 10000; ++i) {
        std::string filename = "/home/thestack/" + std::to_string(i) + ".dat";
        std::filesystem::remove(filename);
    }
}
$ c++ -std=c++17 -o remove_test remove_test.cpp  
$ time ./remove_test                                                                                                        
    0m00.08s real     0m00.01s user     0m00.07s system

@achow101
Copy link
Member

ACK 3141eab

@achow101 achow101 merged commit bb136aa into bitcoin:master Feb 28, 2023
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Feb 28, 2023
@andrewtoth andrewtoth deleted the scan-and-unlink-pruned-files branch August 17, 2023 20:41
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request May 6, 2024
Summary:
There are a few cases where we can mark a block and undo file as pruned in our block index, but not actually remove the files from disk.
- If we call FindFilesToPrune or FindFilesToPruneManual and crash before UnlinkPrunedFiles.
- If on Windows there is an open file handle to the file somewhere else when calling fs::remove in UnlinkPrunedFiles.  This could be from another process, or if we are calling ReadBlockFromDisk/ReadRawBlockFromDisk without having a lock on cs_main (allowed since D2979).

This PR mitigates this by scanning all pruned block files on startup after LoadBlockIndexDB and unlinking them again.

This is a backport of [[bitcoin/bitcoin#26533 | core#26533]]

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Subscribers: Fabien

Differential Revision: https://reviews.bitcoinabc.org/D16102
@bitcoin bitcoin locked and limited conversation to collaborators Aug 16, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants