-
Notifications
You must be signed in to change notification settings - Fork 37.8k
index: reset indexes when running reindex-chainstate #24630
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
69251bf
to
35e669b
Compare
This resets the index db for -reindex-chainstate as it is done for -reindex. Before, a reindex-chainstate would have led to a index corruption of coinstatsindex, and to a duplication of the flatfile data in blockfilterindex.
35e669b
to
cf531ba
Compare
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Unsure of this solution. Might be better to fix the indexes to tolerate it? (Or perhaps the calling code to pay more attention to where the indexes are at.) (On another note, it might be nice to have a -reset-index=XYZ or such) |
I agree that this seems like a more robust solution (as long as it doesn't result in lots of code to cover edge-cases in the indexes). I could imagine |
I can see the point that a corruption in the chainstate db shouldn't make it necessary to rebuild a block-based index. Do you think that this should be specific to |
I think |
Does it really? I always thought it just rebuilds the blocks/index database plus the *rev files, but does not alter the *blk files. |
You wouldn't run |
Reviewing the base index code, I observe:
So IMO, we should move forward with this PR, but also figure out why the existing rewind logic is malfunctioning... |
I think it's because of the order in init:
|
It's good to do something about this, but it seems like an extreme workaround to wipe index data when the "-reindex-chainstate" option is used. Nothing about "-reindex-chainstate" name or documentation suggests that it will wipe other indexes. Other more mild workarounds might be:
Probably an ideal fix could be entirely contained in the BaseIndex class. It could just avoid doing any work and not consider any any index to be synced until the first BlockConnected notification arrives which is not an ancestor of the index's known best block. This behavior could be conditional on "-reindex-chainstate" or it could just be generic behavior. Indexes already do have some current behavior that is a little lazy like this. For example, indexes currently don't handle BlockDisconnected events, but instead wait for the next BlockConnected event before they rewind. |
Good point, actually it could also be mentioned in the "-reindex" help that the indexes data is wiped: ("Rebuild chain state and block index from the blk*.dat files on disk"). So "-reindex" is currently "-reindex-everything" (including the Block Index database). Option 1) should be trivial to implement, but the warning could be missed by the user who then wonders why the index isn't running.
The challenge I see here, as well as for option 2), is that the reindexing happens in the loadblk thread which can run well until after the node has started. So we don't only need to ignore I'll try to make a milder option work, putting the PR into draft until then. |
I think we just need bitcoin/src/node/chainstate.cpp Line 41 in ecf692b
which is different than the bitcoin/src/node/chainstate.cpp Lines 84 to 87 in ecf692b
to determine whether blockConnected block is ancestor of the index locator or previous coins tip. But these things are more for long term fixes. For a short term fix I think just disabling other indexes when -reindex-chainstate is used or just showing a startup error that says indexes can't be enabled while -reindex-chainstate is enabled would already be improvements over status quo if status quo is to corrupt the indexes. (Would suggest opening a new PR if you implement a new approach, too, so older comments here can be easily understood, and so different approaches can be compared) |
See #24789 for an alternative that just disallows the interaction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
utACK cf531ba531ceb78a94b911fdb8f8d5bfea455380
-----BEGIN PGP SIGNATURE-----
iQQzBAEBCAAdFiEE5GOpP18xF+7ebHMWvQKUJCH0iJ8FAmJXLXAACgkQvQKUJCH0
iJ+cMSAAvRiY+px/GUzY65GoVHoJFjKdNRl4xI1t1DR/SxlnddA5ql66XjtNERya
NbQSVn59yG2mUZl2CF1sFh5sbU1fZiBoHFDy00TpmMLz8xHNGi51ZE0SiXj/0x3j
KsKhpmqMgqtARocbVMVdkj1QxUOA7KDqcTjtlzLzLtZJzLq5Dbp65bqLfJliE3R+
gKgItkyllOr97XCdwpELMzVCBxb8B/PMea4nxNCHaLB26Pq590FVMRveR+s0JGZ8
JfcPr2kTIp7LpVSHHS6nU1oHVVBmIZGrBnZYogm2Z0hPcLiHIKDiYHbNAqq/REiF
KGlglsDUctvIddewWmm0nag8NytrLjaAdPk0wU1oFrc1qt4ymt/r1X0LAQaf8Vgo
DxKV8Ao7E7JOp0fgioErlcBkISaZt5u3Dtn6Fs523yH8eq2byDAsCNqU9O8ptPg1
gzY0YWV/mGLe4xe7b/Qd0k2aV4c422DXLdq2pY+Aqonfhq9CtDIuUDX78+tHeB0m
gdSQZCmbETaS1csDvpCdq0s1R4IYi23g0UF+7AeZmNEQnNhOCXPCM3+mJ84dXYti
wn4LBEn1AoMnPGiBDVLU7djoKqfv1DkuCpHXrR0usVFls3JkTQOWbzyVq4B1yFX8
qWA/krZYd9smmGXMDWPQ5LEA0DWvpa+pQ0p1Qm9IwvDJr+NHr5sXLQWYh801xoSF
/2yoHV58JS2ge5uAWY9jtGXfitGzD7/PHOicce0JY+sLI1B5lMLYCK5ydZCUpatV
wEZ0sNk6NQs4RvUKfNR09AQ/NMaN7Vy4frQebMs44FZN4lYHxZc+WVjhrfhiah3j
CVcQhyX18/c4GnTrgAZuhko8TDtpp+Ji3i2Uko29z3PZcRKpX0zwP8iNY0e/pfWv
BaITWz/qUJQ2Q5SSWO2wNDjOQpDB5XTVX0lfirmH0Z0iJBpKCEcFh2i9TBDypuQZ
SFkCkIGwcwRxDvnpp7Si9Lt+QqGOxR7AajfyjSmG98x30/xHPuUW604k4PZ/QBsE
YYU+FM0Dut+Sp85kbgiMltstdatq+DzvIlc2zPHeBRVR3+f+EZdb1jIzbeB4CccI
p321OcKbVNPR5UktLVmUiVupc1/qlIzi09Xkz9cES46sTIe6OUQzwylb8MoiEt5/
wKU2VEz/OmdU9OY5cZkAOv8KDAx8xjhi94ptWfeD8NyzD3Rw6fPclW2Uvq+S+G0Y
SCPRWrlhYVm05+E6ZoG8h+FuBwDtiwVkq+bFY6dwPzTm5u+Ofwj0gL6vFk3glErI
xvDDiWWSt4OBadTCx7IOFegJ8SoxMqS1pHcrTdDc1D9ty/Ew/41crCLNsagEuKBY
RrdoRwbywN4/RVToVPfhSkDnu9AFfQ==
=/qjs
-----END PGP SIGNATURE-----
I do think #24789 is a better alternative to this PR in it's current form (cf531ba), because it currently it clears and rebuilds all the indexes if |
Closed in favor of #24789 |
…nstate dac44fc init: disallow reindex-chainstate with optional indexes (Martin Zumsande) 62e1428 doc: Add note that -reindex will rebuild optional indexes (Martin Zumsande) Pull request description: When started together with `-reindex-chainstate`, currently coinstatsindex gets corrupted and the blockfilterindex flatfiles duplicated. See the OP of #24630 for more a more detailed explanation on why this happens. This is an alternative to #24630 which does not wipe and rebuild the indexes but returns an `InitError` when they are activated, thus requiring the user to deactivate them temporarily until the `-reindex-chainstate` run is finished. This also disallows `-reindex-chainstate` in combination with `-txindex`, which is not leading to corruption, but currently still rebuilds the index unnecessarily and unexpectedly. As a long-term goal, it would be desirable to have the indexes tolerate `reindex-chainstate` by ignoring their `BlockConnected` notifications (there is discussion in #24630 about this) or possibly move `reindex-chainstate` option into a `bitcoin-chainstate` executable, which could also solve the problem. But these would be larger projects - until then, it might be better to disallow the interaction than having corrupted indexes. The first commit adjusts the `-reindex` doc to mention that this option does rebuild all active indexes. ACKs for top commit: ryanofsky: Code review ACK dac44fc. Just fixed IsArgSet call and edited error messages since last review Tree-SHA512: c1abf7d350648ae227c3fd6c95d9a54c3bac9de70915275dea1c87cca6d9a76a056c0e306d95ef8cfe4df1f8525b418e0e7a4f52ded3be464041c0dc297f8930
This resets the index db for -reindex-chainstate as it is done for -reindex. Before, a reindex-chainstate would have led to a index corruption of coinstatsindex, and to a duplication of the flatfile data in blockfilterindex. Github-Pull: bitcoin#24630 Rebased-From: cf531ba
This resets the index dbs when running
-reindex-chainstate
as was previously done only for-reindex
.It fixes two bugs with
-reindex-chainstate
in conjunction with the indices:coinstatsindex
gets corrupted:reindex-chainstate
leads toBlockConnected()
signals to the index for each block that is reprocessed:If the index db is not reset, the running hash
DB_MUHASH
is not reset either and it will include data from each block twice after the reindexing, so the MuHash will be incorrect. I added a test for this that fails on master.blockfilterindex
duplicates the flatfile data:It has a variable
DB_FILTER_POS
for the current position which would not get reset. As a result, the filterdata would be written twice to disk in succession.txindex
has no running variable like the other two indexes, but I think that conceptually it makes sense to reset the index db here as well.