Optimize reindex #7917

sipa · 2016-04-20T14:13:21Z

Several changes:

Make reindex/import use AcceptBlock rather than ProcessNewBlock, so activation of the resulting best chain is delayed.
That delayed activation is handled by the existing "Activating best chain" step in the startup process, which is moved to ThreadImport.
Optimize ActivateBestChain to not walk the entire chain to find the most work chain to connect for each block (O(n^2) -> O(n)).

This has several advantages:

As the actual activation is done after all blocks have been added back to the index, it gets to use the checkpoints information (better progress indicator, skipping of signature checks, ...).
Having a block index with many unactivated blocks (for example, after deleting the chainstate directory) becomes much faster
That case also runs in the background now (as it's simply treated as part of the importing process).

Downsides:

All blocks are read twice from disk during reindex (once to rebuild the index, once to activate).

Todo:

Better progress indicator during the index rebuild phase.

laanwj · 2016-04-20T14:20:57Z

Good to see this!

All blocks are read twice from disk during reindex (once to rebuild the index, once to activate)

Could this (theoretically) be avoided by just reading the header during the initial scan? after all, the extra header that we tack on includes the size of the block, so we could skip the rest the first time.

sipa · 2016-04-20T14:22:33Z

@laanwj Yes, I've considered that. That could lead to an inconsistent block index though (as in: I'm not sure whether there are any internal assumptions that all entries in the block index are valid).

jonasschnelli · 2016-04-20T14:23:17Z

Nice. Concept ACK.
Testing this now on a mainnet node -reindex.

sipa · 2016-04-20T14:25:20Z

@laanwj Another problem with that approach: say you get a block corrupted near the chain tip on disk, you find it and do a reindex. You expect it to reindex up to that corrupted block and then start back from network. Instead, if you don't do transaction validation during the rebuild loop, it may happily add the block to the index, and have activation fail (making it mark the chain as invalid).

laanwj · 2016-04-20T14:27:15Z

Yes, it would be quite more complex to handle those issues well.
In any case if the speed-up is more than the extra disk i/o that this generates takes, it doesn't matter, no need to worry about it in this pull.

sipa · 2016-04-20T14:30:54Z

@jonasschnelli If you feel like benchmarking, an interesting thing to test is (reindex on master) vs (deleting chainstate on this PR), both with signature checks hackishly disabled. Deleting of chainstate caused a very slow, foreground, and (before #7821) uninterruptible process earlier. With this PR is should be (nearly) the same speed as an old reindex.

sipa · 2016-04-20T14:39:19Z

Would it be useful to have a -weakreindex or -rebuild now, which does not throw out the block index, and only the chainstate? That would be faster, as it wouldn't need the rebuild phase.

laanwj · 2016-04-20T15:37:03Z

Would it be useful to have a -weakreindex or -rebuild now, which does not throw out the block index, and only the chainstate? That would be faster, as it wouldn't need the rebuild phase.

Would need better reporting about what database is corrupted for that to be useful to end-users. But even without that I can see that being very useful for benchmarking the UTXO database.

I like the idea of being able to rebuild databases separately. What about the other way around? If you have a UTXO set, but want to reindex the block chain? (e.g. when switching a pruned node to a non-pruned one by copying block data from another node)

If it was up to me I wouldn't use the name 'weakindex' or 'rebuid' (that's another one for the rescan, reindex, re.... category of vaguenes) but call it what it is: -rebuild-blockindex -rebuild-chainstate etc (using the same name as the db directory names).

sipa · 2016-04-20T15:41:23Z

I like the idea of being able to rebuild databases separately. What about the other way around? If you have a UTXO set, but want to reindex the block chain? (e.g. when switching a pruned node to a non-pruned one by copying block data from another node)

You can just overwrite your blocks/ database with another version (including its index), as long as it has your currently active chain in it.

sipa · 2016-04-20T16:02:50Z

If it was up to me I wouldn't use the name 'weakindex' or 'rebuid' (that's another one for the rescan, reindex, re.... category of vaguenes) but call it what it is: -rebuild-blockindex -rebuild-chainstate etc (using the same name as the db directory names).

-rebuild-chainstate sounds like an excellent idea. I don't know what -rebuild-blockindex would mean though...

sipa · 2016-04-21T12:17:47Z

Added a commit to implement -reindex-chainstate.

I guess the semantics for -reindex-blockindex could be:

Rebuild the block index without rebuilding the chainstate
If afterwards the chainstate's tip is in the resulting block index, mark every block on the best chain as validated and pretend nothing happened
If not, wipe the chainstate and rebuild that as well.

laanwj · 2016-04-21T12:23:21Z

I don't know what -rebuild-blockindex would mean though...

That hypothetical option would rebuild the blockindex database, say after some new-fangled corruption check detects that that database was corrupted.

Throw away the block index (so the database in blocks/index)
Reindex blocks

But don't touch the chain state in any way, it is assumed to be still good. If the best chain is not included after reindexing the blocks, throw another error that a ~~full reindex~~ -reindex-chainstate is necessary.

I guess the semantics for -reindex-blockindex could be:

Yes, something like that, sounds good to me.

dcousens · 2016-04-22T00:03:29Z

concept ACK -(rebuild|reindex)-*

sipa · 2016-04-22T10:24:45Z

I tried implementing -reindex-blockindex, but it is significantly harder: it would mean first initializing with a dummy chainstate, then reindexing the blockindex while leaving the original chainstate untouched, then stopping the node, loading the real chainstate, marking the blocks along the best chain as valid (or fail if not found anymore), and starting the node again.

laanwj · 2016-04-26T13:09:34Z

Going to test this

laanwj · 2016-04-26T13:29:12Z

Not sure if this is expected behavior, it's probably harmless; but during the "Reindexing block file" initial scan, rev*.dat files are created with size 0:

-rw------- 1 user user         0 Apr 26 15:26 rev00415.dat
-rw------- 1 user user         0 Apr 26 15:26 rev00416.dat
-rw------- 1 user user         0 Apr 26 15:26 rev00417.dat
-rw------- 1 user user         0 Apr 26 15:26 rev00418.dat

Edit: the block counts are off? possibly due to out-of-order blocks not being counted. It looks a bit strange but probably harmless:

2016-04-26 13:30:48 Reindexing block file blk00471.dat...
2016-04-26 13:30:51 Loaded 3 blocks from external file in 2585ms

laanwj · 2016-04-26T17:58:28Z

Did a reindex of my test chain using this patch:

2016-04-26 13:07:36 Reindexing block file blk00000.dat...
2016-04-26 13:32:16 UpdateTip: new best=00000000839a8e6886ab5951d76f411475428afc90947ee320161bbf18eb6048 height=1 version=0x00000001 log2_work=33.000022 tx=2 date='2009-01-09 02:54:25' progress=0.000000 cache=0.0MiB(1tx)
2016-04-26 17:44:18 UpdateTip: new best=00000000000000000360e6d03330560950285d1062c6f6e6a4558ba1720d6e73 height=407838 version=0x00000004 log2_work=84.507935 tx=123156514 date='2016-04-18 11:34:09' progress=0.994753 cache=43.8MiB(23470tx)

Times:

phase1   00:24:40 (1480s)
phase2   04:12:02 (15122s)
         -----------------
total    04:36:42 (16602s)

For comparison, a sync from a local node on the same machine, under the same conditions, takes 05:32:29 (19949s). That's very good. I have not tried doing a 'traditional' reindex,

Edit: also verified that this fixes #7038. Also tested that interrupting the reindex then restarting works, both in the first phase and second phase.

Edit.2: also did a "traditional" reindex - it took 04:45:49 (17149s). Only a bit longer. Note that this is a fast CPU with 8 cores, so possibly the difference with not validating is less than it would otherwise be.

Edit.3: verified that utxo set hash is the same after old reindex versus new reindex (just to be sure).

laanwj · 2016-04-27T09:22:57Z

src/init.cpp

+    // scan for better chains in the block chain database, that are not yet connected in the active best chain
+    CValidationState state;
+    if (!ActivateBestChain(state, chainparams))
+        LogPrintf("Failed to connect best block");


Shouldn't this error be fatal?

jonasschnelli · 2016-04-28T09:05:03Z

Tested on a mainnet node:

jonasschnelli@bitcoinsrv:~/.bitcoin$ cat debug.log | grep "Bitcoin version\|Reindexing finished\|height=409149"
2016-04-27 14:11:51 Bitcoin version v0.12.99.0-fb682d7 (2016-04-27 15:34:26 +0200)
2016-04-27 14:34:17 Reindexing finished
2016-04-27 18:52:50 UpdateTip: new best=000000000000000000c2963b062bebef4beff8ef38ae5d81e7ccfa8e05abcb1a height=409149 version=0x00000004 log2_work=84.559763 tx=125119626 date='2016-04-27 14:05:06' progress=0.999875 cache=4.0MiB(0tx)

= Times:

phase1    1'346s (00:22:26)
phase2   15'513s (04:18:33)
===========================
total    16'859s (04:40:59)

Full debug log with -debug=reindex http://bitcoinsrv.jonasschnelli.ch/debug.reindex.log (>240MB!)

sipa · 2016-04-28T14:19:41Z

Added GUI progress reporting

sipa · 2016-05-01T21:22:18Z

The two phases in the GUI:

If the chainstate is ahead of the block index at startup, you get the same "Processing blocks on disk" now, while the main GUI is running (and not during loading).

pstratem · 2016-05-17T01:58:58Z

utACK fb8fad1

pstratem · 2016-05-17T03:00:35Z

utACK d3d7547

pstratem · 2016-05-17T03:08:36Z

very untested ACK b4d24e1 (sanity not guaranteed)

pstratem · 2016-05-17T03:29:25Z

On a related note access to chainActive immediately after threadGroup.create_thread(boost::bind(&ThreadImport, vImportFiles)); is not protected by cs_main

b4d24e1 Report reindexing progress in GUI (Pieter Wuille) d3d7547 Add -reindex-chainstate that does not rebuild block index (Pieter Wuille) fb8fad1 Optimize ActivateBestChain for long chains (Pieter Wuille) 316623f Switch reindexing to AcceptBlock in-loop and ActivateBestChain afterwards (Pieter Wuille) d253ec4 Make ProcessNewBlock dbp const and update comment (Pieter Wuille)

Revert this when bitcoin/bitcoin#7917 is merged.

b4d24e1 Report reindexing progress in GUI (Pieter Wuille) d3d7547 Add -reindex-chainstate that does not rebuild block index (Pieter Wuille) fb8fad1 Optimize ActivateBestChain for long chains (Pieter Wuille) 316623f Switch reindexing to AcceptBlock in-loop and ActivateBestChain afterwards (Pieter Wuille) d253ec4 Make ProcessNewBlock dbp const and update comment (Pieter Wuille)

…tChain + optimization 50dbec5 Add unit tests for signals generated by ProcessNewBlock() (furszy) f68251d Validation: rename one of the two instances using "bad-prevblk" to its correct description of "prevblk-not-found" (furszy) a51a755 Fix concurrency-related bugs in ActivateBestChain (Jesse Cohen) 198f435 Do not unlock cs_main in ABC unless we've actually made progress. (Matt Corallo) 8d15cf5 Optimize ActivateBestChain for long chains (Pieter Wuille) 8640be1 Fix fast-shutdown crash if genesis block was not loaded (Matt Corallo) ef24337 Hold cs_main while calling UpdatedBlockTip() and ui.NotifyBlockTip (Jesse Cohen) f9d2ab3 Update ValidationInterface() documentation to explicitly specify threading and memory model (Jesse Cohen) cb9bb25 Update documentation for SingleThreadedSchedulerClient() to specify the memory model (Jesse Cohen) 4ea2048 Add Unit Test for SingleThreadedSchedulerClient (Jesse Cohen) Pull request description: Made some back ports and adaptations to validate further the work introduced in #2203 and #2209. Now that the scheduler thread is processing most of the chain events, the callbacks execution order must be verified to dispatch the events serially, ensuring a consistent memory state in each event processing invocation. There are some concurrency issues solved, as well a an optimization for larger chains for the ABC workflow. Each commit details them well. List: bitcoin#7917 (only fb8fad1) bitcoin#12988 bitcoin#13023 bitcoin#13247 bitcoin#13835 This is built on top of #2284 (and the reason that made me notice that problem). So, 2284 comes first, then this one. ACKs for top commit: Fuzzbawls: ACK 50dbec5 random-zebra: ACK 50dbec5 and merging... Tree-SHA512: 573144cc96d2caa49ed2f0887dc8c53b5aca0efd54b6a25626063ccb780da426f56b3c6b7491648e7abefc1960c82b1354a4ff2c51425bfe99adaa4394562608

zcash: cherry picked from commit fb8fad1 zcash: bitcoin/bitcoin#7917

Optimise reindex Backports bitcoin/bitcoin#7917: > Several changes: > > * Make reindex/import use `AcceptBlock` rather than `ProcessNewBlock`, > so activation of the resulting best chain is delayed. > > * That delayed activation is handled by the existing "Activating best > chain" step in the startup process, which is moved to `ThreadImport`. > > * Optimize `ActivateBestChain` to not walk the entire chain to find the > most work chain to connect for each block (O(n^2) -> O(n)). > > This has several advantages: > > * As the actual activation is done after all blocks have been added > back to the index, it gets to use the checkpoints information (better > progress indicator, skipping of signature checks, ...). > > * Having a block index with many unactivated blocks (for example, after > deleting the chainstate directory) becomes _much_ faster > > * That case also runs in the background now (as it's simply treated as > part of the importing process). > > Downsides: > > * All blocks are read twice from disk during reindex (once to rebuild > the index, once to activate).

rebroad · 2021-09-07T11:48:23Z

src/qt/clientmodel.cpp

@@ -24,6 +24,7 @@
 class CBlockIndex;

 static const int64_t nClientStartupTime = GetTime();
+static int64_t nLastHeaderTipUpdateNotification = 0;


This variable seems to be initialized here, but there's nowhere in the code where it's ever updated, and yet it is checked as if the value might change.

Please don't use more than 5 year old pull requests to ask beginner programming questions.

You can open a new issue, if you are absolutely unable to find the answer yourself.

In this case it is updated via a reference (a reference is like a pointer).

laanwj added the Validation label Apr 20, 2016

This was referenced Apr 20, 2016

Reindex is slow due to signature checks. #7875

Closed

"Activating best chain" can take very long #7038

Closed

sipa force-pushed the betterreindex branch from 0bc37fd to 2a5b94e Compare April 20, 2016 14:36

sipa force-pushed the betterreindex branch from 2a5b94e to 6fb178d Compare April 21, 2016 12:14

sipa force-pushed the betterreindex branch 2 times, most recently from 2bd05b3 to 691cbc9 Compare April 21, 2016 12:30

laanwj reviewed Apr 27, 2016
View reviewed changes

sipa force-pushed the betterreindex branch from c88270e to b8aeee2 Compare April 28, 2016 17:47

sipa force-pushed the betterreindex branch from b8aeee2 to a0c0058 Compare May 1, 2016 21:52

pstratem mentioned this pull request May 17, 2016

Acquire lock to check for genesis block. #8063

Merged

laanwj merged commit b4d24e1 into bitcoin:master May 18, 2016

maflcko mentioned this pull request May 22, 2016

TODO for release notes 0.13.0 #7678

Closed

16 tasks

laanwj mentioned this pull request Jun 6, 2016

Initialization reports "Loading banlist..." while doing something else #8117

Closed

sipa mentioned this pull request Jun 22, 2016

Sync does not continue after reindex until a block is inved #8239

Closed

laanwj mentioned this pull request Jun 23, 2016

Reindex seems really slow (at least with default dbcache) #8245

Closed

maflcko mentioned this pull request Jul 2, 2016

"Activating best chain..." issue is back (implied fixed in #7038) #8299

Closed

sipa mentioned this pull request Dec 6, 2016

Remove unused CDiskBlockPos* argument from ProcessNewBlock #9273

Merged

dgenr8 mentioned this pull request Jan 15, 2017

Optimize reindex bitcoinxt/bitcoinxt#186

Merged

str4d added a commit to str4d/zcash that referenced this pull request Feb 23, 2017

Remove reference to -reindex-chainstate

f42cde3

Revert this when bitcoin/bitcoin#7917 is merged.

OlegGirko mentioned this pull request Jul 9, 2017

Backport Bitcoin PR#7917: Optimize reindex dashpay/dash#1515

Merged

1 task

furszy mentioned this pull request Feb 14, 2020

[WIP] Optimize reindex process PIVX-Project/PIVX#1334

Closed

random-zebra mentioned this pull request Sep 23, 2020

[Core] Optimize reindex PIVX-Project/PIVX#1864

Closed

furszy mentioned this pull request Apr 2, 2021

[Validation][Test] Fix concurrency issues in ActivateBestChain + optimization PIVX-Project/PIVX#2290

Merged

LarryRuane pushed a commit to LarryRuane/zcash that referenced this pull request May 27, 2021

Optimize ActivateBestChain for long chains

d68a558

zcash: cherry picked from commit fb8fad1 zcash: bitcoin/bitcoin#7917

str4d mentioned this pull request Aug 9, 2021

Optimise reindex zcash/zcash#5268

Merged

rebroad reviewed Sep 7, 2021

View reviewed changes

bitcoin locked as resolved and limited conversation to collaborators Sep 7, 2021

Optimize reindex #7917

Optimize reindex #7917

Uh oh!

Conversation

sipa commented Apr 20, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

laanwj commented Apr 20, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sipa commented Apr 20, 2016

Uh oh!

jonasschnelli commented Apr 20, 2016

Uh oh!

sipa commented Apr 20, 2016

Uh oh!

laanwj commented Apr 20, 2016

Uh oh!

sipa commented Apr 20, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sipa commented Apr 20, 2016

Uh oh!

laanwj commented Apr 20, 2016

Uh oh!

sipa commented Apr 20, 2016

Uh oh!

sipa commented Apr 20, 2016

Uh oh!

sipa commented Apr 21, 2016

Uh oh!

laanwj commented Apr 21, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dcousens commented Apr 22, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sipa commented Apr 22, 2016

Uh oh!

laanwj commented Apr 26, 2016

Uh oh!

laanwj commented Apr 26, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

laanwj commented Apr 26, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

laanwj Apr 27, 2016

Choose a reason for hiding this comment

Uh oh!

jonasschnelli commented Apr 28, 2016

Uh oh!

sipa commented Apr 28, 2016

Uh oh!

sipa commented May 1, 2016

Uh oh!

pstratem commented May 17, 2016

Uh oh!

pstratem commented May 17, 2016

Uh oh!

pstratem commented May 17, 2016

Uh oh!

pstratem commented May 17, 2016

Uh oh!

rebroad Sep 7, 2021

Choose a reason for hiding this comment

Uh oh!

maflcko Sep 7, 2021

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sipa commented Apr 20, 2016 •

edited

Loading

laanwj commented Apr 20, 2016 •

edited

Loading

sipa commented Apr 20, 2016 •

edited

Loading

laanwj commented Apr 21, 2016 •

edited

Loading

dcousens commented Apr 22, 2016 •

edited

Loading

laanwj commented Apr 26, 2016 •

edited

Loading

laanwj commented Apr 26, 2016 •

edited

Loading