Skip to content

Conversation

jamesob
Copy link
Contributor

@jamesob jamesob commented Jan 7, 2022

This is part of the assumeutxo project (parent PR: #27596)


This PR includes the changes necessary to perform network functionality with multiple chainstates in use. Various pieces of net_processing logic have to be modified in order to support block download that is simultaneous across numerous chainstates.

Changes include

  • Modify FindNextBlocksToDownload() to parameterize the chainstate
    being worked on.

  • Change GetNodeStateStats to take the max nCommonHeight per peer across
    all chainstates.

  • Add CNodeState::chainstate_to_last_common_block

    • we need this to allow handling for a single peer to distinguish
      between separate chainstates we're simultaneously downloading blocks for
  • Share requests_available across chainstates when finding the next blocks
    to download (during calls to FindNextBlocksToDownload()).


This PR shares commit jamesob@17906dd with #24006, and is included here so that the two changes can be reviewed in parallel.

This PR excludes a small net_processing commit, jamesob@3e6164d, which will be proposed for merge after #24006 since it relies on the introduction of the BackgroundBlockConnected() validationinterface event that the indexing changes introduce.


Some commits here are best reviewed with --ignore-space-change.

Unit-testing net_processing is notoriously difficult and with that in mind I haven't included any unittests here, but in parallel with review of these changes I will attempt to write some tests. Note that this behavior is covered in the functional tests included in the parent PR: jamesob@c4949f2

@DrahtBot
Copy link
Contributor

DrahtBot commented Jan 7, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK fjahr
Concept ACK naumenkogs, ariard, mzumsande
Approach ACK jonatack
Stale ACK ryanofsky

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #27626 (Parallel compact block downloads, take 3 by instagibbs)
  • #27596 (assumeutxo (2) by jamesob)
  • #26762 (refactor: Make CCheckQueue RAII-styled by hebasto)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checkpoint review feedback for the first two commits.

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach ACK 0ef0057 the logic seems good AFAICS

@@ -3184,7 +3234,7 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
}

LOCK(cs_main);
if (m_chainman.ActiveChainstate().IsInitialBlockDownload() && !pfrom.HasPermission(NetPermissionFlags::Download)) {
if (m_chainman.IsAnyChainInIBD() && !pfrom.HasPermission(NetPermissionFlags::Download)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c4d9016 see #24178 that changes this logic

@naumenkogs
Copy link
Member

ACK modulo nits by me and jonatack

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Code review ACK c4d9016

@@ -4921,6 +4930,21 @@ bool ChainstateManager::PopulateAndValidateSnapshot(
return true;
}

CChainState& ChainstateManager::GetChainstateForNewBlock(const uint256& blockhash)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "validation: introduce ChainstateManager::GetChainstateForNewBlock" (75912c5)

Note: In my own words I'd describe the logic as: Always prefer to add blocks to the newer snapshot chainstate, not the older background chainstate, but add blocks to the background chainstate if they're already in in the snapshot chainstate.

Copy link
Contributor

@fjahr fjahr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach ACK

Only did a light code review pass for now because there are already a lot of open comments to be addressed, I will come back when that is done.

@jamesob
Copy link
Contributor Author

jamesob commented Apr 28, 2022

Thanks for all the feedback so far; I'll get around to fixing this up in the next day or so after I get #24232 in order.

@jamesob jamesob force-pushed the 2022-01-au-netprocessing branch from c4d9016 to 2c7595a Compare April 29, 2022 21:58
Copy link
Contributor Author

@jamesob jamesob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased, thanks for the feedback all.

@sdaftuar
Copy link
Member

sdaftuar commented May 9, 2023

@sdaftuar did you have Current Thoughts about this approach? Trying to bridge IRC discussions to the blocking PR

I was trying to come up with my own proof-of-concept to demonstrate an alternate approach before commenting, but just to get everyone up to speed on what I'm thinking:

  • The first commit "validation: introduce ChainstateManager::GetChainstateForNewBlock" seems like something we don't conceptually need. I think the underlying issue is that AcceptBlock should be chainstate-agnostic -- the decision to store a block on disk isn't fundamentally based on a chain tip or the utxos we have. (The reason it comes into play in the current logic is as a heuristic for anti-DoS checks.) I have a branch that refactors AcceptBlock and some related code into ChainstateManager to show what I mean. master...sdaftuar:bitcoin:2023-04-blockstorage
  • I think we can avoid parametrizing data in net_processing by chainstate, and thus avoid storing chainstate-pointers or making the existing download logic more complex by having it support two different download strategies at the same time. My thought is that a lot of the complexity in FindNextBlocksToDownload comes from being able to download blocks towards any tip that our peer has which has more work than our own. However, in the background sync case, we are only interested in download blocks towards the snapshot base block hash. So all we need to know is whether our peer's tip contains the base block hash, and if it does, we just walk from our background chainstate tip towards that base block and look for blocks that we haven't requested yet.

I don't have a branch to share yet that implements that second bullet, but I'm hoping to get to it soon.

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than my remarks I'm happy with c7cdc7c.


if (state->pindexBestKnownBlock == nullptr
|| state->pindexBestKnownBlock->nChainWork < our_tip->nChainWork
|| state->pindexBestKnownBlock->nChainWork < m_chainman.MinimumChainWork()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add a TODO here to relax this for background validation. And/or explain that we expect a node with less than the minimum chainwork to still be in IBD and not willing to serve us blocks? Or that it's too much hassle to figure out which blocks they have in common with us (e.g. it's a fork-coin)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, would we relax this for background validation? I don't really see why. Recall that "our_tip" is referring to the tip of the chainstate we're working on (could be background) and not our network tip.

Copy link
Member

@Sjors Sjors May 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only meant the comparison to MinimumChainWork(). Though in practice the assume UTXO height and minimum chainwork are probably not that far apart.

jamesob and others added 5 commits May 9, 2023 15:08
and use it in ProcessNewBlock().

This is needed for an upcoming change to `net_processing`.
Unrelated to assumeutxo changes, but while I was in the
neighborhood...
Most easily reviewed with `--ignore-space-change`

- Modify FindNextBlocksToDownload() to parameterize the chainstate
  being worked on.

- Change GetNodeStateStats to take the max nCommonHeight per peer across
  all chainstates.

- Add CNodeState::chainstate_to_last_common_block
  * we need this to allow handling for a single peer to distinguish
    between separate chainstates we're simultaneously downloading blocks for

- Share `requests_available` across chainstates when finding the next blocks
  to download (call to `FindNextBlocksToDownload()`).
Until we have completed IBD on all chainstates, don't answer getheaders
requests from our peers or advertise ourselves. This preserves the
existing behavior of not performing these actions until IBD of the
whole chain completes.
For all remaining usages of IsInitialBlockDownload() where
the status of any background validation chainstate that might
be in use isn't relevant.
@jamesob jamesob force-pushed the 2022-01-au-netprocessing branch from c7cdc7c to ac9adf0 Compare May 9, 2023 19:20
Copy link
Contributor Author

@jamesob jamesob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've udpated this PR to remove the new ChainstateManager::GetAllForBlockDownload(), instead adding a parameter assumed_first to the existing GetAll(). I think this is clearer at the net_processing call site and avoids another unnecessary method.


if (state->pindexBestKnownBlock == nullptr
|| state->pindexBestKnownBlock->nChainWork < our_tip->nChainWork
|| state->pindexBestKnownBlock->nChainWork < m_chainman.MinimumChainWork()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, would we relax this for background validation? I don't really see why. Recall that "our_tip" is referring to the tip of the chainstate we're working on (could be background) and not our network tip.

//
// Note that searching the snapshot chain (as below) implicitly searches the ibd
// chain, since the former includes the latter.
if (m_snapshot_chainstate &&
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor Author

@jamesob jamesob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this PR to remove the new ChainstateManager::GetAllForBlockDownload(), instead adding a parameter assumed_first to the existing GetAll(). I think this is clearer at the net_processing call site and avoids another unnecessary method.

@jamesob
Copy link
Contributor Author

jamesob commented May 9, 2023

Also, as far as I understand it, AssumeUTXO will mess up the order of the blocks stored on disk - the first blockfiles would have the blocks between the AssumeUTXO block and the tip, and the subsequent blockfiles would have the blocks between Genesis and the AssumeUTXO block. I wonder if that could break or slow down something, have you tried a reindex for example?

It's worth noting that there's a proposed remedy for this in #27596 - see b29f39e. cc @mzumsande

Copy link
Contributor

@fjahr fjahr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK ac9adf0

This is alright, I am just not loving the new assumed_first parameter to be honest. But curious to hear what you say @jamesob . If you change it, I promise to re-review it swiftly.

AssertLockHeld(::cs_main);
// Early return to avoid unnecessary blockindex lookup when assumeutxo is not in use.
if (!m_snapshot_chainstate) {
return *Assert(m_ibd_chainstate);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

somewhat nit: Why not use ActiveChainstate() here? From my understanding, it doesn't functionally make a difference but it makes this a cleaner review since the result of this is replacing ActiveChainstate().

{
LOCK(::cs_main);
std::vector<Chainstate*> out;
const auto chainstates = assumed_first ?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parameter/api seems kind of unnecessary. Do we actually need/want assumed_first=false anywhere explicitly? Otherwise, I would suggest always returning assumed first and adding a comment that this specific ordering is important.

Alternatively, if it is better to have a default behavior of assumed_first=false then I would still find it simpler to reverse the order in the one place where it's needed rather than adding the param.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fairly easy to refactor later.

@@ -5224,9 +5224,10 @@ void PeerManagerImpl::MaybeSendAddr(CNode& node, Peer& peer, std::chrono::micros
if (!peer.m_addr_relay_enabled) return;

LOCK(peer.m_addr_send_times_mutex);
const bool is_ibd{WITH_LOCK(::cs_main, return m_chainman.IsAnyChainInIBD())};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would prefer in_ibd here

@jamesob
Copy link
Contributor Author

jamesob commented May 16, 2023

Thanks for the review! I'm done with nits on this one, and assumeutxo in general.

// It is more important to get to the network's tip
// quickly than do the background validation on the snapshot.
for (const auto* const chainstate : m_chainman.GetAll(/*assumed_first=*/true)) {
if (can_serve &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

058fecd: could still move if (can_serve out of the loop to reduce brackets.

{
LOCK(::cs_main);
std::vector<Chainstate*> out;
const auto chainstates = assumed_first ?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fairly easy to refactor later.

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with the improvements in ac9adf0.

I wrote:

The first commit tends to confuses me

Could use some thoughts on that comment (from anyone really).

@sdaftuar wrote:

The first commit "validation: introduce ChainstateManager::GetChainstateForNewBlock" seems like something we don't conceptually need.

I'm pretty close to wrapping my head around it, so would be OK with merging it once I do and then refactor it. Avoiding the complexity in the first place sounds nice too, though someone else may need to take over the rebase baton from @jamesob in that case.

Also, as far as I understand it, AssumeUTXO will mess up the order of the blocks stored on disk

I'm fine dealing with that in later PRs.

@fjahr
Copy link
Contributor

fjahr commented May 16, 2023

I wrote:

The first commit tends to confuses me

Could use some thoughts on that comment (from anyone really).

@Sjors I found it hard to follow that message, to be honest :D Could you maybe try to formulate more clearly what is bothering you in particular and what questions you need to be answered? It might help if you reference the specific phases you are thinking of based on doc/design/assumeutxo.md. I think this is the most confusing part when discussing this.

@Sjors
Copy link
Member

Sjors commented May 17, 2023

I guess a simpler way to phrase it is: what can go wrong if c14ae13 picks the incorrect chainstate for a block? What are all the different ways in which we can receive a block? What do we expect to happen in each of these cases? Tests would be a really nice way to illustrate that, but are apparently a pain to write for net_processing.

@fjahr
Copy link
Contributor

fjahr commented May 17, 2023

I guess a simpler way to phrase it is: what can go wrong if c14ae13 picks the incorrect chainstate for a block? What are all the different ways in which we can receive a block? What do we expect to happen in each of these cases? Tests would be a really nice way to illustrate that, but are apparently a pain to write for net_processing.

@Sjors I have tried to create an overview of the different cases and added my understanding of what is happening, I hope this helps move the discussion along and clear up the confusion. Let me know if it helps or if I missed the point. Rather than copying it in here, it's probably better to keep it in the gist, so it can be updated and referenced easier and GitHub doesn't swallow it like the other comments. And I have thrown this together rather quickly so it's probably more likely that I have made a mistake in the overview rather than that the code is wrong if something seems weird :)

Here it is: https://gist.github.com/fjahr/4b90adf2c24b39b84d5e6f4822f6a14f

So let’s sink some ships… *insert boating accident joke*

The one thing that did not realize before is that syncing the ibd chain seems to work rather inefficiently unless I am missing something. From my understanding now (see A2, A3, A4) it seems that blocks would need to be received a second time in order to be included in ibd because they are first included in snapshot? I guess the naive way to fix this would be to look at the snapshot height and decide based on that which chainstate to use. But putting even more logic like this into net_processing seems wrong and makes me agree more with @sdaftuar that this may be better handled differently instead.

Scattered thoughts that are probably ok but might be considered for cleanup later:

  • What are the implications of the snapshot being "inactive" while loading per doc/design/assumeutxo.md (Row 2)? There isn't really a status like m_disabled for that. So I guess it's is just theoretically inactive but it can already be used. Maybe note that in assumeutxo.md.
  • Still hitting the disabled chain also seems kind of wasteful but it’s probably not a big deal and shouldn’t happen too often in normal operation (Row 5).
  • We do potentially save many blocks twice if we receive them multiple times. I don’t think this is a strong attack vector, but in the extreme case a node could have the full chain on disk twice, so use 2x the storage until it is restarted.

{
AssertLockHeld(::cs_main);
// Early return to avoid unnecessary blockindex lookup when assumeutxo is not in use.
if (!m_snapshot_chainstate) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In c14ae13, is there a reason for not using IsSnapshotActive() here?

@@ -1301,7 +1316,12 @@ void PeerManagerImpl::UpdateBlockAvailability(NodeId nodeid, const uint256 &hash
}
}

void PeerManagerImpl::FindNextBlocksToDownload(const Peer& peer, unsigned int count, std::vector<const CBlockIndex*>& vBlocks, NodeId& nodeStaller)
void PeerManagerImpl::FindNextBlocksToDownload(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 058fecd: just a suggestion but I'd change FindNextBlocksToDownload to return std::vector<const CBlockIndex*> instead of passing it in the parameters.

@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

@jamesob
Copy link
Contributor Author

jamesob commented May 26, 2023

Closing in lieu of some simpler changes that @sdaftuar has written, currently integrated in #27596.

@jamesob jamesob closed this May 26, 2023
@Sjors
Copy link
Member

Sjors commented May 27, 2023

@fjahr thanks for the write up! It's on my reading list. Will also look at @sdaftuar's PR.

@bitcoin bitcoin locked and limited conversation to collaborators Jun 24, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.