-
Notifications
You must be signed in to change notification settings - Fork 37.8k
Optimize reindex #7917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize reindex #7917
Conversation
Good to see this!
Could this (theoretically) be avoided by just reading the header during the initial scan? after all, the extra header that we tack on includes the size of the block, so we could skip the rest the first time. |
@laanwj Yes, I've considered that. That could lead to an inconsistent block index though (as in: I'm not sure whether there are any internal assumptions that all entries in the block index are valid). |
Nice. Concept ACK. |
@laanwj Another problem with that approach: say you get a block corrupted near the chain tip on disk, you find it and do a reindex. You expect it to reindex up to that corrupted block and then start back from network. Instead, if you don't do transaction validation during the rebuild loop, it may happily add the block to the index, and have activation fail (making it mark the chain as invalid). |
Yes, it would be quite more complex to handle those issues well. |
@jonasschnelli If you feel like benchmarking, an interesting thing to test is (reindex on master) vs (deleting chainstate on this PR), both with signature checks hackishly disabled. Deleting of chainstate caused a very slow, foreground, and (before #7821) uninterruptible process earlier. With this PR is should be (nearly) the same speed as an old reindex. |
Would it be useful to have a -weakreindex or -rebuild now, which does not throw out the block index, and only the chainstate? That would be faster, as it wouldn't need the rebuild phase. |
Would need better reporting about what database is corrupted for that to be useful to end-users. But even without that I can see that being very useful for benchmarking the UTXO database. I like the idea of being able to rebuild databases separately. What about the other way around? If you have a UTXO set, but want to reindex the block chain? (e.g. when switching a pruned node to a non-pruned one by copying block data from another node) If it was up to me I wouldn't use the name 'weakindex' or 'rebuid' (that's another one for the rescan, reindex, re.... category of vaguenes) but call it what it is: |
You can just overwrite your blocks/ database with another version (including its index), as long as it has your currently active chain in it. |
|
Added a commit to implement -reindex-chainstate. I guess the semantics for -reindex-blockindex could be:
|
That hypothetical option would rebuild the blockindex database, say after some new-fangled corruption check detects that that database was corrupted.
But don't touch the chain state in any way, it is assumed to be still good. If the best chain is not included after reindexing the blocks, throw another error that a
Yes, something like that, sounds good to me. |
2bd05b3
to
691cbc9
Compare
concept ACK |
I tried implementing -reindex-blockindex, but it is significantly harder: it would mean first initializing with a dummy chainstate, then reindexing the blockindex while leaving the original chainstate untouched, then stopping the node, loading the real chainstate, marking the blocks along the best chain as valid (or fail if not found anymore), and starting the node again. |
Going to test this |
Not sure if this is expected behavior, it's probably harmless; but during the "Reindexing block file" initial scan, rev*.dat files are created with size 0:
Edit: the block counts are off? possibly due to out-of-order blocks not being counted. It looks a bit strange but probably harmless:
|
Did a reindex of my test chain using this patch:
Times:
For comparison, a sync from a local node on the same machine, under the same conditions, takes 05:32:29 (19949s). That's very good. I have not tried doing a 'traditional' reindex, Edit: also verified that this fixes #7038. Also tested that interrupting the reindex then restarting works, both in the first phase and second phase. Edit.2: also did a "traditional" reindex - it took 04:45:49 (17149s). Only a bit longer. Note that this is a fast CPU with 8 cores, so possibly the difference with not validating is less than it would otherwise be. Edit.3: verified that utxo set hash is the same after old reindex versus new reindex (just to be sure). |
// scan for better chains in the block chain database, that are not yet connected in the active best chain | ||
CValidationState state; | ||
if (!ActivateBestChain(state, chainparams)) | ||
LogPrintf("Failed to connect best block"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this error be fatal?
Tested on a mainnet node: jonasschnelli@bitcoinsrv:~/.bitcoin$ cat debug.log | grep "Bitcoin version\|Reindexing finished\|height=409149"
2016-04-27 14:11:51 Bitcoin version v0.12.99.0-fb682d7 (2016-04-27 15:34:26 +0200)
2016-04-27 14:34:17 Reindexing finished
2016-04-27 18:52:50 UpdateTip: new best=000000000000000000c2963b062bebef4beff8ef38ae5d81e7ccfa8e05abcb1a height=409149 version=0x00000004 log2_work=84.559763 tx=125119626 date='2016-04-27 14:05:06' progress=0.999875 cache=4.0MiB(0tx) = Times:
Full debug log with |
Added GUI progress reporting |
utACK fb8fad1 |
utACK d3d7547 |
very untested ACK b4d24e1 (sanity not guaranteed) |
On a related note access to chainActive immediately after threadGroup.create_thread(boost::bind(&ThreadImport, vImportFiles)); is not protected by cs_main |
b4d24e1 Report reindexing progress in GUI (Pieter Wuille) d3d7547 Add -reindex-chainstate that does not rebuild block index (Pieter Wuille) fb8fad1 Optimize ActivateBestChain for long chains (Pieter Wuille) 316623f Switch reindexing to AcceptBlock in-loop and ActivateBestChain afterwards (Pieter Wuille) d253ec4 Make ProcessNewBlock dbp const and update comment (Pieter Wuille)
Revert this when bitcoin/bitcoin#7917 is merged.
b4d24e1 Report reindexing progress in GUI (Pieter Wuille) d3d7547 Add -reindex-chainstate that does not rebuild block index (Pieter Wuille) fb8fad1 Optimize ActivateBestChain for long chains (Pieter Wuille) 316623f Switch reindexing to AcceptBlock in-loop and ActivateBestChain afterwards (Pieter Wuille) d253ec4 Make ProcessNewBlock dbp const and update comment (Pieter Wuille)
…tChain + optimization 50dbec5 Add unit tests for signals generated by ProcessNewBlock() (furszy) f68251d Validation: rename one of the two instances using "bad-prevblk" to its correct description of "prevblk-not-found" (furszy) a51a755 Fix concurrency-related bugs in ActivateBestChain (Jesse Cohen) 198f435 Do not unlock cs_main in ABC unless we've actually made progress. (Matt Corallo) 8d15cf5 Optimize ActivateBestChain for long chains (Pieter Wuille) 8640be1 Fix fast-shutdown crash if genesis block was not loaded (Matt Corallo) ef24337 Hold cs_main while calling UpdatedBlockTip() and ui.NotifyBlockTip (Jesse Cohen) f9d2ab3 Update ValidationInterface() documentation to explicitly specify threading and memory model (Jesse Cohen) cb9bb25 Update documentation for SingleThreadedSchedulerClient() to specify the memory model (Jesse Cohen) 4ea2048 Add Unit Test for SingleThreadedSchedulerClient (Jesse Cohen) Pull request description: Made some back ports and adaptations to validate further the work introduced in #2203 and #2209. Now that the scheduler thread is processing most of the chain events, the callbacks execution order must be verified to dispatch the events serially, ensuring a consistent memory state in each event processing invocation. There are some concurrency issues solved, as well a an optimization for larger chains for the ABC workflow. Each commit details them well. List: bitcoin#7917 (only fb8fad1) bitcoin#12988 bitcoin#13023 bitcoin#13247 bitcoin#13835 This is built on top of #2284 (and the reason that made me notice that problem). So, 2284 comes first, then this one. ACKs for top commit: Fuzzbawls: ACK 50dbec5 random-zebra: ACK 50dbec5 and merging... Tree-SHA512: 573144cc96d2caa49ed2f0887dc8c53b5aca0efd54b6a25626063ccb780da426f56b3c6b7491648e7abefc1960c82b1354a4ff2c51425bfe99adaa4394562608
zcash: cherry picked from commit fb8fad1 zcash: bitcoin/bitcoin#7917
Optimise reindex Backports bitcoin/bitcoin#7917: > Several changes: > > * Make reindex/import use `AcceptBlock` rather than `ProcessNewBlock`, > so activation of the resulting best chain is delayed. > > * That delayed activation is handled by the existing "Activating best > chain" step in the startup process, which is moved to `ThreadImport`. > > * Optimize `ActivateBestChain` to not walk the entire chain to find the > most work chain to connect for each block (O(n^2) -> O(n)). > > This has several advantages: > > * As the actual activation is done after all blocks have been added > back to the index, it gets to use the checkpoints information (better > progress indicator, skipping of signature checks, ...). > > * Having a block index with many unactivated blocks (for example, after > deleting the chainstate directory) becomes _much_ faster > > * That case also runs in the background now (as it's simply treated as > part of the importing process). > > Downsides: > > * All blocks are read twice from disk during reindex (once to rebuild > the index, once to activate).
@@ -24,6 +24,7 @@ | |||
class CBlockIndex; | |||
|
|||
static const int64_t nClientStartupTime = GetTime(); | |||
static int64_t nLastHeaderTipUpdateNotification = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variable seems to be initialized here, but there's nowhere in the code where it's ever updated, and yet it is checked as if the value might change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use more than 5 year old pull requests to ask beginner programming questions.
You can open a new issue, if you are absolutely unable to find the answer yourself.
In this case it is updated via a reference (a reference is like a pointer).
Several changes:
AcceptBlock
rather thanProcessNewBlock
, so activation of the resulting best chain is delayed.ThreadImport
.ActivateBestChain
to not walk the entire chain to find the most work chain to connect for each block (O(n^2) -> O(n)).This has several advantages:
Downsides:
Todo: