-
Notifications
You must be signed in to change notification settings - Fork 37.7k
Fix more init bugs. #10919
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix more init bugs. #10919
Conversation
479efd7
to
818bb29
Compare
utACK 818bb29 Similar to #10758, I think its hard to reason about all the possible edge cases here, but this is a step in the right direction and I think on balance better code than not having it. |
This resolves a possible-assert-on-shutdown race introduced in 1f668b6 when early shutdown occurs. Previously this was not done to avoid any cases where the threadGroup might not exit due to a blocking thread, but at this point the threadGroup isn't used all that much, plus Qt already does this, and its good to keep their init/shutdown consistent. For those curious, the threadGroup is only used in a few places: * Its used to run the CCheckQueues in script validation, but these use the boost mutex/condition variable primitives, so they respect the interrupt pretty trivially. * Its used for the import thread, which should exit rather quickly as mostly it just calls LoadExternalBlockFile, which has an interruption_point right before each block loaded. * Its used in the scheduler thread, which is only used for: * validationinterface has an effectively-dummy reference to it. * wallet compaction, which should not last long * addr/banlist dumping from CConnman, which should also be fast
This more clearly uses fReindex vs fReset to make sure we're not clearing our coinsdb needlessly when restarting after a reindex. It also makes it so that restarting after shutting down mid-reindex isn't treates specially at all during txdb loading code, as it shouldn't be.
This fixes a few cases where we should be treating a restart-after- coinsviewdb-reset identically to a just-reset-coinsviewdb. Thanks to @morcos for identifying the bug.
818bb29
to
e7539f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Low-confidence utACK e7539f8. Changes make sense, though I'm not extremely familiar with this code, so I don't have a lot of confidence that they couldn't potentially cause problems.
// threadGroup.join_all(); was left out intentionally here, because we didn't re-test all of | ||
// the startup-failure cases to make sure they don't result in a hang due to some | ||
// thread-blocking-waiting-for-another-thread-during-startup case | ||
threadGroup.join_all(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Always wait for threadGroup to exit in bitcoind shutdown"
This is a nice change and seems safe given what you describe in your commit message. I guess we should look out for shutdown issues with #10286, though, since after #10286, the part about "validationinterface has an effectively-dummy reference to [CScheduler]" will no longer be true
@@ -1414,6 +1414,8 @@ bool AppInitMain(boost::thread_group& threadGroup, CScheduler& scheduler) | |||
// LoadBlockIndex will load fTxIndex from the db, or set it if | |||
// we're reindexing. It will also load fHavePruned if we've | |||
// ever removed a block file from disk. | |||
// Note that it also sets fReindex based on the disk flag! | |||
// From here on out fReindex and fReset mean something different! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Fix resume-of-reindex-after-restart"
Maybe make this more specific. You could say "From here on out fReindex may be true even if fReset is false (indicating the node was shut down mid-reindex)."
I'm not sure of the first commit: but utACK for the rest. |
utACK e7539f8 |
e7539f8 Fix some broken init-time prints/constants (Matt Corallo) 13ab353 Check for empty coinsview instead of just-reset coinsview in init (Matt Corallo) fce3f4f Fix resume-of-reindex-after-restart (Matt Corallo) efac91e Always wait for threadGroup to exit in bitcoind shutdown (Matt Corallo) Pull request description: This is a follow-on to #10758 to help move 10758 along. The first fixes a regression in master that was partially fixed in 10758, the second I'm not sure if its a regression or not, but its clearly a bug that should be fixed. Tree-SHA512: aca7b97a97dca66e1a218a33cc6f4aa002292ff1bb0af64e35b81fbaa91b9504f2605375808b43e93a63fc73634ad079b30ef6c9f4ba338d3b5f72d816dfeaff
861f9a2 Skip remainder of init if upgrade is cancelled (Matt Corallo) Pull request description: Based on #10919. Without this, if you cancel upgrade, you get a needless error: ERROR: VerifyDB(): *** irrecoverable inconsistency in block data at Tree-SHA512: aa47665682c6605ada376f1c100ce17cf8c4312427929eb2e75306f2199b47cbcdb4e0d98d5efcfefff03947b2c0fcbd3aab487a4ed14d50607df685c91a03d0
861f9a2 Skip remainder of init if upgrade is cancelled (Matt Corallo) Pull request description: Based on bitcoin#10919. Without this, if you cancel upgrade, you get a needless error: ERROR: VerifyDB(): *** irrecoverable inconsistency in block data at Tree-SHA512: aa47665682c6605ada376f1c100ce17cf8c4312427929eb2e75306f2199b47cbcdb4e0d98d5efcfefff03947b2c0fcbd3aab487a4ed14d50607df685c91a03d0
861f9a2 Skip remainder of init if upgrade is cancelled (Matt Corallo) Pull request description: Based on bitcoin#10919. Without this, if you cancel upgrade, you get a needless error: ERROR: VerifyDB(): *** irrecoverable inconsistency in block data at Tree-SHA512: aa47665682c6605ada376f1c100ce17cf8c4312427929eb2e75306f2199b47cbcdb4e0d98d5efcfefff03947b2c0fcbd3aab487a4ed14d50607df685c91a03d0
e7539f8 Fix some broken init-time prints/constants (Matt Corallo) 13ab353 Check for empty coinsview instead of just-reset coinsview in init (Matt Corallo) fce3f4f Fix resume-of-reindex-after-restart (Matt Corallo) efac91e Always wait for threadGroup to exit in bitcoind shutdown (Matt Corallo) Pull request description: This is a follow-on to bitcoin#10758 to help move 10758 along. The first fixes a regression in master that was partially fixed in 10758, the second I'm not sure if its a regression or not, but its clearly a bug that should be fixed. Tree-SHA512: aca7b97a97dca66e1a218a33cc6f4aa002292ff1bb0af64e35b81fbaa91b9504f2605375808b43e93a63fc73634ad079b30ef6c9f4ba338d3b5f72d816dfeaff
e7539f8 Fix some broken init-time prints/constants (Matt Corallo) 13ab353 Check for empty coinsview instead of just-reset coinsview in init (Matt Corallo) fce3f4f Fix resume-of-reindex-after-restart (Matt Corallo) efac91e Always wait for threadGroup to exit in bitcoind shutdown (Matt Corallo) Pull request description: This is a follow-on to bitcoin#10758 to help move 10758 along. The first fixes a regression in master that was partially fixed in 10758, the second I'm not sure if its a regression or not, but its clearly a bug that should be fixed. Tree-SHA512: aca7b97a97dca66e1a218a33cc6f4aa002292ff1bb0af64e35b81fbaa91b9504f2605375808b43e93a63fc73634ad079b30ef6c9f4ba338d3b5f72d816dfeaff
e7539f8 Fix some broken init-time prints/constants (Matt Corallo) 13ab353 Check for empty coinsview instead of just-reset coinsview in init (Matt Corallo) fce3f4f Fix resume-of-reindex-after-restart (Matt Corallo) efac91e Always wait for threadGroup to exit in bitcoind shutdown (Matt Corallo) Pull request description: This is a follow-on to bitcoin#10758 to help move 10758 along. The first fixes a regression in master that was partially fixed in 10758, the second I'm not sure if its a regression or not, but its clearly a bug that should be fixed. Tree-SHA512: aca7b97a97dca66e1a218a33cc6f4aa002292ff1bb0af64e35b81fbaa91b9504f2605375808b43e93a63fc73634ad079b30ef6c9f4ba338d3b5f72d816dfeaff
e7539f8 Fix some broken init-time prints/constants (Matt Corallo) 13ab353 Check for empty coinsview instead of just-reset coinsview in init (Matt Corallo) fce3f4f Fix resume-of-reindex-after-restart (Matt Corallo) efac91e Always wait for threadGroup to exit in bitcoind shutdown (Matt Corallo) Pull request description: This is a follow-on to bitcoin#10758 to help move 10758 along. The first fixes a regression in master that was partially fixed in 10758, the second I'm not sure if its a regression or not, but its clearly a bug that should be fixed. Tree-SHA512: aca7b97a97dca66e1a218a33cc6f4aa002292ff1bb0af64e35b81fbaa91b9504f2605375808b43e93a63fc73634ad079b30ef6c9f4ba338d3b5f72d816dfeaff
73d7288 [Doc] Mention reindex-chainstate in the release notes (random-zebra) c45c0ec Skip MoneySupply update during reindex/reindex-chainstate (random-zebra) efe193f Scan for better chains in ThreadImport (random-zebra) 0aceea3 Make ProcessNewBlock dbp const and update comment (Pieter Wuille) 8255e82 [Cleanup] Remove fVerifyingBlocks global flag (random-zebra) 30c7aed Add -reindex-chainstate that does not rebuild block index (random-zebra) 79d5cb2 Verify DB with original default check-level 3 (random-zebra) ed96859 Reduce default number of blocks to check at startup (Pieter Wuille) d4516fa Log/report in 10% steps during VerifyDB (Jonas Schnelli) cce2c9e doc: Mention dbcache increase in release notes (random-zebra) 11dc022 [GUI] Update default db cache size in the options model (random-zebra) 9b7f764 Bump `-dbcache` default to 300MiB (Wladimir J. van der Laan) Pull request description: This was attempted already in #1864 and #1907. Third time's a charm. Now the speed is acceptable and (unlike what was happening in #1907) the process can be interrupted. Further, as it is performed in ThreadImport, the GUI is open and functional during the chainstate reindex, instead of being seemingly stuck at the splash screen. This PR also includes a few improvements coming from bitcoin#10919, bitcoin#8273, bitcoin#8136 and bitcoin#8611. Here's a quick comparison of the data I had testing here (empty wallet with GUI, empty pivx.conf / default values). The reindexes have been performed with 0 network connections. <table> <tr> <th>Mainnet (height=2847011)</th><th>Master</th><th>this PR</th> </tr><tr> <td>online sync</td><td>8 hr 13 min</td><td>6 hr 42 min</td> </tr><tr> <td>-reindex (offline)</td><td>7 hr 14 min</td><td>6 hr 21 min</td> </tr><tr> <td>-reindex-chainstate (offline)</td><td>N/A</td><td>6 hr 13 min</td> </tr><tr> <td>startup (synced node)</td><td>100 sec</td><td>83 sec</td> </tr><tr> <td>RAM (synced node)</td><td>1.65 GiB</td><td>1.5 GiB</td> </tr><tr> <td>utxo_cache (synced node)</td><td>28.9MiB</td><td>39.5MiB</td> </tr> </table> ACKs for top commit: furszy: re-ACK 73d7288, tested again. Tree-SHA512: fb828d89692ccb6105eae5411b70a3c7a129d98d3101121cf604171de1b818a8c68b3771cc69fcf0a04642fe1327ee7ce360a72ecd7ef019b1614bd83c081de5
This is a follow-on to #10758 to help move 10758 along. The first fixes a regression in master that was partially fixed in 10758, the second I'm not sure if its a regression or not, but its clearly a bug that should be fixed.