Skip to content

Conversation

ryanofsky
Copy link
Contributor

@ryanofsky ryanofsky commented Feb 1, 2022

This PR lets indexing code mostly run outside of the node process. It also improves indexing sync logic, which is moved out of indexing code to a new node::SyncChain() function.

Almost all the commits in this PR are small refactoring changes that move code from src/index/ to src/node/, or replace references to node types like CBlockIndex, CChain, CChainState in index code. There are only two commits affecting indexing sync logic which make complicated or substantive changes:

The commit messages have more details about these and other changes. Followups to this PR will reuse indexing sync code for wallets (#15719, #11756) and let indexes run in separate processes (#10102)

@ryanofsky ryanofsky changed the title indexes: Stop using node API and locking cs_main, improve sync logic indexes: Stop using node internal types and locking cs_main, improve sync logic Feb 1, 2022
@ryanofsky
Copy link
Contributor Author

Rebased 8d8cdcb -> 1fcfc73 (pr/indexy.1 -> pr/indexy.2, compare) due to conflict with #22932. Also fixed some bugs in the last commit 🙄

@DrahtBot
Copy link
Contributor

DrahtBot commented Feb 2, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/24230.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK fjahr, mzumsande, jamesob, aureleoules, josibake
Approach ACK TheCharlatan

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #30750 (scripted-diff: LogPrint -> LogDebug by maflcko)
  • #30664 (build: Remove Autotools-based build system by hebasto)
  • #30635 (rpc: add optional blockhash to waitfornewblock by Sjors)
  • #30546 (util: Use consteval checked format string in FatalErrorf by maflcko)
  • #30469 (index: Fix coinstats overflow and introduce index versioning by fjahr)
  • #30409 (Introduce waitTipChanged() mining interface, replace RPCNotifyBlockChange, drop CRPCSignals & g_best_block by Sjors)
  • #29770 (index: Check all necessary block data is available before starting to sync by fjahr)
  • #29652 (wallet: Avoid potentially writing incorrect best block locator by ryanofsky)
  • #29641 (scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint by maflcko)
  • #29432 (Stratum v2 Template Provider (take 3) by Sjors)
  • #26966 (index: initial sync speedup, parallelize process by furszy)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@ryanofsky
Copy link
Contributor Author

Updated 1fcfc73 -> c5be433 (pr/indexy.2 -> pr/indexy.3, compare) fixing asan error https://cirrus-ci.com/task/5137882733084672 and lint error https://cirrus-ci.com/task/5208251477262336. Did not look into tsan failure https://cirrus-ci.com/task/4574932779663360 and multiprocess blockfilter_index_tests segfault https://cirrus-ci.com/task/4856407756374016, but I think they may be addressed by the asan fix

@DrahtBot DrahtBot mentioned this pull request Feb 2, 2022
18 tasks
@ryanofsky
Copy link
Contributor Author

Updated c5be433 -> 3ab1ebe (pr/indexy.3 -> pr/indexy.4, compare) fixing blockfilter_index_initial_sync crash that I could not produce locally but I believe was causing CI failures https://cirrus-ci.com/task/6670649857933312, https://cirrus-ci.com/task/5755856183623680, and https://cirrus-ci.com/task/4981799997669376

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions and comments while reviewing the first 8 commits.

My brain got friend while trying to understand 15398b1, even with --color-moved and probably because the original logic is also a bit confusing. So I'll try that again later.

Copy link
Contributor Author

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated 3ab1ebe -> d27072d (pr/indexy.4 -> pr/indexy.5, compare) with various suggested changes and a few other tweaks, all minor

My brain got friend while trying to understand 15398b1, even with --color-moved and probably because the original logic is also a bit confusing. So I'll try that again later.

This is good feedback. I split the commit up into two parts, so the boilerplate code moving part is separate from the other changes which are a little less obvious.

@ryanofsky
Copy link
Contributor Author

Thanks for the review!

In attachChain, avoid locking cs_main while calling prepare_sync method to run
index CustomInit code.

This is implemented by unconditionally spawning index sync threads, instead of
only spawning them conditionally when the index best block is not the chain
tip. An alternative implementation could avoid the cost of spawning the sync
thread by calling prepare_sync from the validationinterface notification
thread. But this would be slightly more complex, and also potentially slower if
any current or future index takes longer to initialize, since it would
initialize serially instead of in parallel with other index and node init code.
This commit does not change behavior in any way.
…ng code

This commit does not change behavior in any way.
Replace last m_blockman.UpdatePruneLock call with
interfaces::Chain::updatePruneLock call.
@josibake
Copy link
Member

josibake commented Aug 7, 2024

Coming back to this and added it to my list of things to review. Is there an upstream PR this one depends on that I should be reviewing first, or is there another reason the PR is in draft?

@ryanofsky
Copy link
Contributor Author

Coming back to this and added it to my list of things to review. Is there an upstream PR this one depends on that I should be reviewing first, or is there another reason the PR is in draft?

Thanks for being interested in this! The main reason it is in draft i just that it is too big and and I need to work on a way to split it up. For example the first commits and documentation and removing racy -reindex-chainstate behavior should be a separate PR. And I think I want to make a separate PR squashing down all the refactoring commits here into a single commit that is can be easier to review for correctness, even if it harder to see equivalence.

Another more urgent reason this is in draft is that tests broke after a recent very conflicted rebase, and I need to debug and go through the major fixes in that rebase: #29671 #29767 #29867, and make sure they are intact and working.

Not sure when I will have a chance to do these things, but I can definitely prioritize if there is interest in seeing this go forward. It would also be useful to know if particular parts of this PR are interesting or more appealing so I can split those off first.

@josibake
Copy link
Member

josibake commented Aug 7, 2024

Not sure when I will have a chance to do these things, but I can definitely prioritize if there is interest in seeing this go forward

Definitely interest from my side, but not super urgent. For context, I'm going to be focusing on reviewing kernel and multiprocess PRs going forward, so I'm generally interested in anything related to those topics. I had noticed some discussion on other indexing related PRs regarding the approach in this PR vs other approaches, which is what led me here.

It would also be useful to know if particular parts of this PR are interesting or more appealing so I can split those off first

What you described regarding the race condition and refactor splits makes sense to me. If you find time to fix the CI failures, I'll take a closer look then and likely have some better feedback about which parts could be potentially split out into smaller chunks.

@@ -79,7 +79,7 @@ bool TxIndex::FindTx(const uint256& tx_hash, uint256& block_hash, CTransactionRe
return false;
}

AutoFile file{m_chainstate->m_blockman.OpenBlockFile(postx, true)};
AutoFile file{m_blockman.OpenBlockFile(postx, true)};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the entire FindTx function could just be moved into the block manager. Then there is no need to pass it around here and it can instead be called through the interface just like elsewhere.

@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 2, 2024

🐙 This pull request conflicts with the target branch and needs rebase.

PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Oct 24, 2024
…e reversed

c0bf667 index: add [nodiscard] attribute to functions writing to the db (furszy)
eef5955 index: coinstats reorg, fail when block cannot be reversed (furszy)

Pull request description:

  Found it while reviewing bitcoin#24230 (comment).

  During a reorg, continuing execution when a block cannot be reversed leaves the
  coinstats index in an inconsistent state.
  This was surely overlooked when 'CustomRewind' was implemented.

ACKs for top commit:
  ryanofsky:
    Code review ACK c0bf667. Only change since last review is new commit adding [[nodiscard]]

Tree-SHA512: f4fc8522508d23e4fff09a29c935971819b1bd3b2a260e08e2e2b72f9340980d74fbec742a58fe216baf61d27de057c7c8300e8fa075f8507cd1227f128af909
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Oct 24, 2024
…e reversed

c0bf667 index: add [nodiscard] attribute to functions writing to the db (furszy)
eef5955 index: coinstats reorg, fail when block cannot be reversed (furszy)

Pull request description:

  Found it while reviewing bitcoin#24230 (comment).

  During a reorg, continuing execution when a block cannot be reversed leaves the
  coinstats index in an inconsistent state.
  This was surely overlooked when 'CustomRewind' was implemented.

ACKs for top commit:
  ryanofsky:
    Code review ACK c0bf667. Only change since last review is new commit adding [[nodiscard]]

Tree-SHA512: f4fc8522508d23e4fff09a29c935971819b1bd3b2a260e08e2e2b72f9340980d74fbec742a58fe216baf61d27de057c7c8300e8fa075f8507cd1227f128af909
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Oct 24, 2024
…e reversed

c0bf667 index: add [nodiscard] attribute to functions writing to the db (furszy)
eef5955 index: coinstats reorg, fail when block cannot be reversed (furszy)

Pull request description:

  Found it while reviewing bitcoin#24230 (comment).

  During a reorg, continuing execution when a block cannot be reversed leaves the
  coinstats index in an inconsistent state.
  This was surely overlooked when 'CustomRewind' was implemented.

ACKs for top commit:
  ryanofsky:
    Code review ACK c0bf667. Only change since last review is new commit adding [[nodiscard]]

Tree-SHA512: f4fc8522508d23e4fff09a29c935971819b1bd3b2a260e08e2e2b72f9340980d74fbec742a58fe216baf61d27de057c7c8300e8fa075f8507cd1227f128af909
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Oct 24, 2024
…e reversed

c0bf667 index: add [nodiscard] attribute to functions writing to the db (furszy)
eef5955 index: coinstats reorg, fail when block cannot be reversed (furszy)

Pull request description:

  Found it while reviewing bitcoin#24230 (comment).

  During a reorg, continuing execution when a block cannot be reversed leaves the
  coinstats index in an inconsistent state.
  This was surely overlooked when 'CustomRewind' was implemented.

ACKs for top commit:
  ryanofsky:
    Code review ACK c0bf667. Only change since last review is new commit adding [[nodiscard]]

Tree-SHA512: f4fc8522508d23e4fff09a29c935971819b1bd3b2a260e08e2e2b72f9340980d74fbec742a58fe216baf61d27de057c7c8300e8fa075f8507cd1227f128af909
@DrahtBot
Copy link
Contributor

⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?

  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

2 similar comments
@DrahtBot
Copy link
Contributor

⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?

  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

@DrahtBot
Copy link
Contributor

⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?

  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

Comment on lines 338 to +339
// Update cached header
m_last_header = *Assert(ReadFilterHeader(new_tip.height, new_tip.hash));
m_last_header = *Assert(ReadFilterHeader(block.height, block.hash));
Copy link
Member

@furszy furszy Jun 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In b3e2e19:

This doesn't seem to be correct. The last cached header is being set to the block we just disconnected. Also, this change should be in a previous commit.

No need to fix it here, I'm cherry-picking part of this work into another branch to push it as a separate PR anyway. But should be something like:

m_last_header = *Assert(ReadFilterHeader(block.height - 1, *Assert(block.prev_hash)));

achow101 added a commit that referenced this pull request Jun 12, 2025
029ba1a index: remove CBlockIndex access from CustomAppend() (furszy)
91b7ab6 refactor: index, simplify CopyHeightIndexToHashIndex to process single block (furszy)
6f1392c indexes, refactor: Remove remaining CBlockIndex* uses in index Rewind methods (Ryan Ofsky)
0a24870 indexes, refactor: Stop requiring CBlockIndex type to call IsBIP30Unspendable (Ryan Ofsky)
331a25c test: indexes, avoid creating threads when sync runs synchronously (furszy)

Pull request description:

  Combining common refactors from #24230 and #26966, aiming to move both efforts forward while reducing their size and review burden.

  Broadly, #24230 focuses on enabling indexes to run in a separate process, and #26966 aims to parallelize the indexes initial synchronization process. A shared prerequisite for both is ensuring that only the base index class interacts with the node’s chain internals - child index classes should instead operate solely through chain events.

  This PR moves disk read lookups from child index classes to the base index class. It also includes a few documentation improvements and a test-only code cleanup.

ACKs for top commit:
  maflcko:
    review ACK 029ba1a 👡
  achow101:
    ACK 029ba1a
  TheCharlatan:
    Re-ACK 029ba1a
  davidgumberg:
    ACK 029ba1a
  mzumsande:
    Code Review ACK 029ba1a

Tree-SHA512: f073af407fc86f228cb47a32c7bcf2241551cc89ff32059317eb81d5b86fd5fda35f228d2567e0aedbc9fd6826291f5fee05619db35ba44108421ae04d11e6fb
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.