
Conversation

mzumsande
Contributor

@mzumsande mzumsande commented Mar 31, 2025

As described in #32179, pindexLastCommonBlock is updated later than necessary
in master.
In the case of a linear chain with no forks, it can be moved forward at the beginning of
FindNextBlocksToDownload, so that the updated value can be used to better estimate nWindowEnd.
This helps the node request all blocks from its peers within the correct 1024-block window and avoids peers being incorrectly marked as stallers.
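
For context, the end of the download window is derived directly from pindexLastCommonBlock. A minimal sketch, simplified from FindNextBlocks in net_processing.cpp (where BLOCK_DOWNLOAD_WINDOW is 1024):

// Simplified from FindNextBlocks in net_processing.cpp: the window of blocks
// that may be requested from a peer ends BLOCK_DOWNLOAD_WINDOW (1024) blocks
// past pindexLastCommonBlock. A stale pindexLastCommonBlock therefore shrinks
// the effective window and can make a well-behaved peer look like a staller.
int nWindowEnd = state->pindexLastCommonBlock->nHeight + BLOCK_DOWNLOAD_WINDOW;
int nMaxHeight = std::min<int>(state->pindexBestKnownBlock->nHeight, nWindowEnd + 1);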

The second commit removes a special case introduced in 49d569c that is no longer necessary, because the more general logic also covers it.

I also changed p2p_ibd_stalling.py to cover the situation after a stall has been resolved, making sure that no additional peers are then incorrectly marked as stallers.

Fixes #32179

@DrahtBot
Contributor

DrahtBot commented Mar 31, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32180.

Reviews

See the guideline for information on the review process.

Type         Reviewers
ACK          furszy, stringintech
Concept ACK  yuvicc
Stale ACK    vasild

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

@mzumsande mzumsande marked this pull request as draft March 31, 2025 22:40
@DrahtBot DrahtBot added the P2P label Mar 31, 2025
@mzumsande mzumsande marked this pull request as ready for review April 2, 2025 21:15
@mzumsande mzumsande force-pushed the 202403_ibd_lastcommonblock branch from bb5aa0a to 205e651 Compare April 7, 2025 21:56
Contributor

@stringintech stringintech left a comment


Concept ACK.

Comment on lines 1374 to 1408
// If our tip has advanced beyond pindexLastCommonBlock, move it ahead to the tip. We don't need to download any blocks in between, and skipping ahead here
// allows us to determine nWindowEnd better.
if (m_chainman.ActiveHeight() > state->pindexLastCommonBlock->nHeight && state->pindexBestKnownBlock->GetAncestor(m_chainman.ActiveHeight()) == m_chainman.ActiveTip()) {
state->pindexLastCommonBlock = m_chainman.ActiveTip();
}

// If the peer reorganized, our previous pindexLastCommonBlock may not be an ancestor
// of its current tip anymore. Go back enough to fix that.
state->pindexLastCommonBlock = LastCommonAncestor(state->pindexLastCommonBlock, state->pindexBestKnownBlock);
Contributor

@stringintech stringintech Apr 10, 2025


Could there be an issue if both nodes reorganize to the same chain, but the last common block is still on another fork?

If our new tip height is less than the outdated pindexLastCommonBlock height (like when reorging to a shorter chain with more work), we'd skip the optimization when we shouldn't.

Since LastCommonAncestor() brings back the pindexLastCommonBlock on the peer's best chain, maybe the new code should come after this call?

Not sure if this edge case matters in practice, but thought I'd mention it.

Contributor Author


Good observation!
I also think that this edge case shouldn't matter much.
It's absolutely essential that pindexLastCommonBlock is an ancestor of pindexBestKnownBlock when calling FindNextBlocks, which is why I originally left that check last. But the added check can't violate that invariant, since it only sets pindexLastCommonBlock to the tip if that tip is an ancestor of pindexBestKnownBlock.
So I've changed the order.
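
For illustration, the reordered logic would look roughly like this (a sketch assuming the final ordering described above; the identifiers are the ones from the diff under review):

// First walk pindexLastCommonBlock back onto the peer's chain (handles reorgs).
state->pindexLastCommonBlock = LastCommonAncestor(state->pindexLastCommonBlock, state->pindexBestKnownBlock);

// Advancing to our tip afterwards preserves the ancestor invariant, because it
// only happens when the tip is an ancestor of pindexBestKnownBlock.
if (m_chainman.ActiveHeight() > state->pindexLastCommonBlock->nHeight &&
    state->pindexBestKnownBlock->GetAncestor(m_chainman.ActiveHeight()) == m_chainman.ActiveTip()) {
    state->pindexLastCommonBlock = m_chainman.ActiveTip();
}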

@yuvicc
Contributor

yuvicc commented Apr 11, 2025

Concept ACK

Thanks for this PR.

I have made some notes regarding this PR and will add a guide for testing it very soon.

@mzumsande mzumsande force-pushed the 202403_ibd_lastcommonblock branch from 205e651 to c69ee2d Compare April 11, 2025 19:46
Contributor

@vasild vasild left a comment


ACK c69ee2d

The code changes look reasonable. I tested that p2p_ibd_stalling.py fails without commit 1d8504c (p2p: During block download, adjust pindexLastCommonBlock right away). It failed 7 times and passed 1 time. Is it expected to pass sometimes?
I also checked that the code inside the modified ifs is executed during p2p_ibd_stalling.py.

Comment on lines +1380 to +1416
// If our tip has advanced beyond pindexLastCommonBlock, move it ahead to the tip. We don't need to download any blocks in between, and skipping ahead here
// allows us to determine nWindowEnd better.
if (m_chainman.ActiveHeight() > state->pindexLastCommonBlock->nHeight && state->pindexBestKnownBlock->GetAncestor(m_chainman.ActiveHeight()) == m_chainman.ActiveTip()) {
state->pindexLastCommonBlock = m_chainman.ActiveTip();
}
Contributor


This change assumes their tip is higher than our tip. What if their tip is lower? Maybe worth advancing pindexLastCommonBlock in that case as well. Something like this:

// If our tip has advanced beyond pindexLastCommonBlock, move it ahead to the tip. We don't need to
// download any blocks in between, and skipping ahead here allows us to determine nWindowEnd better.
const auto our_tip{m_chainman.ActiveTip()};
const auto peer_best_known{state->pindexBestKnownBlock};
if (our_tip->nHeight > state->pindexLastCommonBlock->nHeight) {
    if (peer_best_known->nHeight >= our_tip->nHeight) {
        if (peer_best_known->GetAncestor(our_tip->nHeight) == our_tip) {
            state->pindexLastCommonBlock = our_tip;
        }
    } else {
        if (our_tip->GetAncestor(peer_best_known->nHeight) == peer_best_known) {
            state->pindexLastCommonBlock = peer_best_known;
        }
        // FindNextBlocks() would be a noop below, so return here.
        return;
    }
}

Contributor Author


Then we would have returned earlier because there are no blocks from this peer that would help us with our IBD:

bitcoin/src/net_processing.cpp, lines 1384 to 1387 in 80e6ad9:

if (state->pindexBestKnownBlock == nullptr ||
    state->pindexBestKnownBlock->nChainWork < m_chainman.ActiveChain().Tip()->nChainWork ||
    state->pindexBestKnownBlock->nChainWork < m_chainman.MinimumChainWork()) {
    // This peer has nothing interesting.
    return;
}

(second condition)

Contributor


Does it make sense to advance pindexLastCommonBlock in that case? I am just curious, it is definitely not the aim of this PR.

Contributor Author

@mzumsande mzumsande Apr 28, 2025


I'd say probably not, because pindexLastCommonBlock isn't really used for anything but assigning blocks during IBD.
If we will never need any blocks from this peer (until it sends us more headers, at which point pindexLastCommonBlock gets updated anyway; I'll ignore weird invalidateblock scenarios here), I don't see what benefit advancing it would have.

Comment on lines 69 to 70
stall_block = blocks[0].sha256
second_stall = blocks[500].sha256 # another block we don't provide immediately
Contributor


nit: could use an array here:

stall_blocks = [blocks[0].sha256, blocks[500].sha256]

which would make the following change possible:

- peers.append(node.add_outbound_p2p_connection(P2PStaller([stall_block, second_stall]), p2p_idx=id, connection_type="outbound-full-relay"))
+ peers.append(node.add_outbound_p2p_connection(P2PStaller(stall_blocks), p2p_idx=id, connection_type="outbound-full-relay"))

- self.wait_until(lambda: sum(len(peer['inflight']) for peer in node.getpeerinfo()) == 2)
+ self.wait_until(lambda: sum(len(peer['inflight']) for peer in node.getpeerinfo()) == len(stall_blocks))

Contributor Author


done

self.log.info("Check that all outstanding blocks get connected")
self.wait_until(lambda: node.getblockcount() == NUM_BLOCKS)
self.log.info("Check that all outstanding blocks up to the second stall block get connected")
self.wait_until(lambda: node.getblockcount() == 500)
Contributor


nit: could use a variable second_stall_block_index = 500 or something like that to avoid repeating the number 500.

Contributor Author


changed to use one var for each index, plus one array.

@DrahtBot DrahtBot requested a review from stringintech April 25, 2025 12:24
@mzumsande mzumsande force-pushed the 202403_ibd_lastcommonblock branch from c69ee2d to b7ff6a6 Compare April 28, 2025 19:56
@mzumsande
Contributor Author

mzumsande commented Apr 28, 2025

c69ee2d to b7ff6a6: addressed feedback

Is it expected to pass sometimes?

It wasn't supposed to, but I have tracked it down now. It's a bit involved; I hope the following explanation makes sense:

The initial situation is that there are 2 remaining connected peers, the node has just synced blocks up to height 500, but pindexLastCommonBlock is still at 0. The second stall block at height 501 is already requested from one peer, let's say peer0.

If SendMessages is invoked for peer0 first, it will request no blocks, update pindexLastCommonBlock to 500, and won't mark anyone as a staller, because the missing block is already being requested from that same peer.
If peer1 were processed after that, peer0 would be marked as a staller.
But if peer0 is processed twice in a row (which can happen due to the randomization in ThreadMessageHandler), we could request block 1025 from it.
If peer1 is processed after that (its pindexLastCommonBlock is still at 0), all 1024+1 blocks in its window will already have been requested, so we never get to this point because we always continue here.
I think this could be fixed by exchanging the order of this and this code block. I'm not sure this needs fixing though, because the current patch should avoid the problem as well...
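
For reference, the staller-marking condition at play here is roughly the following (simplified from FindNextBlocks; vBlocks collects the blocks to request and waitingfor is the peer the first in-flight block is awaited from):

// Simplified from FindNextBlocks: a staller is only flagged when the window is
// exhausted, nothing could be fetched, and the first in-flight block is awaited
// from a different peer - which is why the processing order of peer0 and peer1
// above decides whether anyone gets marked.
if (pindex->nHeight > nWindowEnd) {
    // We reached the end of the allowed window.
    if (vBlocks.size() == 0 && waitingfor != peer.m_id) {
        // We would be able to fetch if the window were one block larger.
        if (nodeStaller) *nodeStaller = waitingfor;
    }
    break;
}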

Contributor

@vasild vasild left a comment


ACK b7ff6a6

But if peer0 is processed twice in a row (which can happen due to the randomization in ThreadMessageHandler), we could request block 1025 from it

I confirm this; I made it reproducible:

  • git show 855b1fc2b9 |git apply -R
  • observe that p2p_ibd_stalling.py fails (mostly).
  • apply the patch below to mess with the order in which nodes are processed
  • observe that p2p_ibd_stalling.py succeeds
[patch] change order in which nodes are processed
--- i/src/net.cpp
+++ w/src/net.cpp
@@ -3029,12 +3029,14 @@ void CConnman::OpenNetworkConnection(const CAddress& addrConnect, bool fCountFai
         pnode->ConnectedThroughNetwork(),
         GetNodeCount(ConnectionDirection::Out));
 }
 
 Mutex NetEventsInterface::g_msgproc_mutex;
 
+static size_t seq{0};
+
 void CConnman::ThreadMessageHandler()
 {
     LOCK(NetEventsInterface::g_msgproc_mutex);
 
     while (!flagInterruptMsgProc)
     {
@@ -3043,13 +3045,22 @@ void CConnman::ThreadMessageHandler()
         {
             // Randomize the order in which we process messages from/to our peers.
             // This prevents attacks in which an attacker exploits having multiple
             // consecutive connections in the m_nodes list.
             const NodesSnapshot snap{*this, /*shuffle=*/true};
 
-            for (CNode* pnode : snap.Nodes()) {
+            auto nodes = snap.Nodes();
+            std::ranges::sort(nodes, [](CNode* a, CNode* b) {
+                return seq % 3 == 0 ? a->GetId() > b->GetId() : a->GetId() < b->GetId();
+            });
+
+            ++seq;
+
+            LogInfo("===== begin");
+            for (CNode* pnode : nodes) {
+                LogInfo("===== node id: %d, disconnect: %d", pnode->GetId(), pnode->fDisconnect);
                 if (pnode->fDisconnect)
                     continue;
 
                 // Receive messages
                 bool fMoreNodeWork = m_msgproc->ProcessMessages(pnode, flagInterruptMsgProc);
                 fMoreWork |= (fMoreNodeWork && !pnode->fPauseSend);
@@ -3058,12 +3069,13 @@ void CConnman::ThreadMessageHandler()
                 // Send messages
                 m_msgproc->SendMessages(pnode);
 
                 if (flagInterruptMsgProc)
                     return;
             }
+            LogInfo("===== end");
         }
 
         WAIT_LOCK(mutexMsgProc, lock);
         if (!fMoreWork) {
             condMsgProc.wait_until(lock, std::chrono::steady_clock::now() + std::chrono::milliseconds(100), [this]() EXCLUSIVE_LOCKS_REQUIRED(mutexMsgProc) { return fMsgProcWake; });
         }

This allows us to calculate nWindowEnd better.
Before, it would be calculated based on an old pindexLastCommonBlock,
which could result in blocks not being requested for download that could have been
requested, and peers being wrongly marked as stallers.

This reverts commit 49d569c,
which introduced extra logic for when a snapshot was loaded.
With the previous commit, this is no longer necessary
because the more general logic of advancing pindexLastCommonBlock
also covers this case.

getpeerinfo provides a list of blocks that are inflight, which can be used
instead.

This test (which would fail without the previous commit) checks
that after the stalling block was received, we don't incorrectly
mark another peer as a staller immediately.
@mzumsande mzumsande force-pushed the 202403_ibd_lastcommonblock branch from b7ff6a6 to 3bdff9a Compare July 21, 2025 15:26
@mzumsande
Contributor Author

b7ff6a6 to 3bdff9a: rebased due to conflict with #32868

Member

@furszy furszy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 3bdff9a

@@ -1411,6 +1411,12 @@ void PeerManagerImpl::FindNextBlocksToDownload(const Peer& peer, unsigned int co
     if (state->pindexLastCommonBlock == state->pindexBestKnownBlock)
         return;
 
+    // If our tip has advanced beyond pindexLastCommonBlock, move it ahead to the tip. We don't need to download any blocks in between, and skipping ahead here
+    // allows us to determine nWindowEnd better.
+    if (m_chainman.ActiveHeight() > state->pindexLastCommonBlock->nHeight && state->pindexBestKnownBlock->GetAncestor(m_chainman.ActiveHeight()) == m_chainman.ActiveTip()) {
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nano non-blocking nits:
Could extract tip = ActiveTip() and use it everywhere here to make it slightly more readable.

Also, this isn’t introduced by this PR, but the second check has always puzzled me a bit. It would be nice to have some util function like IsBlockInChain(const CBlockIndex* block_to_test, const CBlockIndex* chain_tip).
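
Such a helper might look like this (a sketch only; IsBlockInChain is the hypothetical name suggested above, not an existing utility in the codebase):

// Hypothetical helper (name suggested above): returns true if block_to_test
// lies on the chain ending in chain_tip, i.e. it is chain_tip itself or one
// of its ancestors.
bool IsBlockInChain(const CBlockIndex* block_to_test, const CBlockIndex* chain_tip)
{
    if (block_to_test == nullptr || chain_tip == nullptr) return false;
    if (block_to_test->nHeight > chain_tip->nHeight) return false;
    return chain_tip->GetAncestor(block_to_test->nHeight) == block_to_test;
}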

@DrahtBot DrahtBot requested a review from vasild July 23, 2025 20:07
@stringintech
Contributor

stringintech commented Jul 30, 2025

ACK 3bdff9a

I think it would be nice for the commit description to mention how the newly added logic also covers the assumeutxo case; if my understanding is correct, adding something like:

After snapshot loading, our tip becomes the snapshot block. Since peers are verified to have the snapshot in their chain, our tip is an ancestor of the peer's best block, hence the general advancement logic will move pindexLastCommonBlock from any pre-snapshot position to the snapshot height automatically.

@mzumsande
Contributor Author

Thanks! I'll address the nits if / when I have to push again.


Successfully merging this pull request may close these issues.

p2p: Inefficiency in block download / stalling algorithm