Parallel compact block downloads, take 3 #27626

instagibbs · 2023-05-11T14:05:20Z

This is an attempt at mitigating #25258 , which is a revival of #10984, which is a revival of #9447.

This PR attempts to mitigate a single case, where high bandwidth peers can bail us out of a flakey
peer not completing blocks for us. We allow up to 2 additional getblocktxns requests per unique block.
This would hopefully allow the chance for an honest high bandwidth peer to hand us the transactions
even if the first in flight peer stalls out.

In contrast to previous effort:

it will not help if subsequent peers send block headers only, so only high-bandwidth peers this time. See: https://github.com/bitcoin/bitcoin/pull/10984/files#diff-6875de769e90cec84d2e8a9c1b962cdbcda44d870d42e4215827e599e11e90e3R1411
MAX_GETBLOCKTXN_TXN_AFTER_FIRST_IN_FLIGHT is removed, in favor of aiding recovery during turbulent mempools
We require one of the 3 block fetching slots to be an outbound peer. This can be the original offering peer, or subsequent compact blocks given by high bandwidth peers.

DrahtBot · 2023-05-11T14:05:23Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	sdaftuar, mzumsande
Concept ACK	jamesob, theuni
Stale ACK	dergoegge, ajtowns

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#27596 (assumeutxo (2) by jamesob)
#27052 (test: rpc: add last block announcement time to getpeerinfo result by LarryRuane)
#26969 (net, refactor: net_processing, add ProcessCompactBlockTxns by brunoerg)
#26812 (test: add end-to-end tests for CConnman and PeerManager by vasild)
#26621 (refactor: Continue moving application data from CNode to Peer by dergoegge)
#24008 (assumeutxo: net_processing changes by jamesob)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

ajtowns

Seems okay at first glance. It would be nice to have a "blockrequest.cpp" module that collects all this logic together. Perhaps we want to keep this change simple so it's a smaller backport, and refactor later, though.

src/net_processing.cpp

ajtowns · 2023-05-11T15:30:25Z

src/net_processing.h

+/** Greg: Setting to 3 to match number of high bandwidth peers... ~ensures one is outbound */
+static const unsigned int MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK = 3;
+/** Maximum number of missing transactions for a non-first getblocktxn request */
+static const unsigned int MAX_GETBLOCKTXN_TXN_AFTER_FIRST_IN_FLIGHT = 10;


Out of the last 612 compact blocks my node's reconstructed that required requesting transactions, 277 needed 10 or fewer (45%), 441 needed 50 or fewer (72%), 488 needed 100 or fewer (80%), 548 needed 500 or fewer (90%). I'd probably prefer 50 or 100 than 10, I guess?

(In simpler times, 1086/1162 (93%) need 10 or fewer, 1140/1162 (98%) needed 50 or fewer, 1144/1162 (98.5%) needed 100 or fewer. Again, I'm not including blocks that requested 0 txs in those denominators)

what if we just removed the limit, since it's not much of a protection, especially if the missing 10 txns are *nscriptions

I think the limit's fine; when you start up with an empty mempool, you'll be requesting most of the transactions in the block because your mempool's empty, and there's no need to do that three times in parallel. Could perhaps have the limit work the other way: if you've already got N txs and need K txs, and 0 < K <= N/4 + 10; then if the block just has ten 100kvB txs, you won't request them multiple times.

Imo, it doesn't seem worthwhile to have the limit:

the good case (small K) should be far more common than the bad one (large K)

the limit makes relay of blocks with large K less reliable w.r.t to stalling (reduces to what we have now)

the limit adds code complexity and a maintenance burden. 10 is not a good value according to your data and if Allow 2 simultaneous (compact-)block downloads #10984 had been merged then we would need to re-evaluate now. Who's to say this won't change again.

if the issue is during warmup, that's a pretty small window, and typically won't actually do parallel fetching, 3 seems rare even when warmed up with limit of 10. I'll gather some more stats on that front, seeing how many parallel blocks are initialized.

This "warmup" should really only be an issue after we've seen new blocks from 3+ peers since we don't persist hb peers across restarts and we have to wait for MaybeSetPeerAsAnnouncingHeaderAndIDs to be called on multiple unique peers. I now suspect the warmup period handles itself? My logs show 4+ blocks per node with only one partial block.

src/net_processing.cpp

instagibbs · 2023-05-12T13:38:16Z

I'll clean up history later when we're closer to ACKs

Sjors · 2023-05-12T17:47:10Z

In contrast to previous effort, it will not help if subsequent peers send block headers only

Can you elaborate on this? Are you saying that falling back to full block download is pointless?

instagibbs · 2023-05-12T17:56:51Z

@Sjors Basically previous efforts would allow the peer to send a header, not a compact block, and we would fire off an additional getdata for the compact block. See this block of code: https://github.com/bitcoin/bitcoin/pull/10984/files#diff-6875de769e90cec84d2e8a9c1b962cdbcda44d870d42e4215827e599e11e90e3R1411

IMO it's unlikely to make an impact, as high-bandwidth peers are already demonstrably fast, and allows us to control better who takes up the additional slots.

Sjors · 2023-05-12T19:13:58Z

So that earlier attempt would solicit a compact block from a (BIP 152) low-bandwidth mode peer, in response to it announcing a header. That's similar to what we already do in HeadersDirectFetchBlocks, but allowing more than 1 in transit?

Whereas in this PR we focus on the high-bandwidth mode peers, which are allowed to send us a cmpctblock message straight away (rather than waiting for us to ask). Right now we can only have one cmpctblock -> getblocktxn -> blocktxn "session" in progress. This PR makes that 3.

This approach seems fine, but the other attempts were never merged, which is why I am confused by the "it will not help" phrase. Are you saying these earlier attempts failed in some experiment, are they useless in general or is this new approach just better (and less complex?)?

instagibbs · 2023-05-12T19:17:34Z

This PR makes that 3.

Makes it 3 parallel block downloads, first can be of any kind, the 2nd and 3rd must be compact block announcements from high badwidth peers.

I am confused by the "it will not help" phrase

This PR does not take advantage of low-bandwidth compact block peers message patterns, in other words.

instagibbs · 2023-05-15T15:13:36Z

I was getting some sybil-stalls above the missing txn <=10 limit, so for now uncapped it.

I also added in logic to ensure that at least one of the 3 slots is taken by an outbound peer. This should be more robust against sybils that are able to take up all inbound hb slots. e.g.,:

uses fresh connection to announce, then stall
uses both inbound hb peers to take up 2nd and 3rd slots, but never gives txns
fresh peer times out, gets disconnected
repeat 1-3

Added more extensive tests.

Github-Pull: bitcoin#27626 Rebased-From: 86cff8b

Github-Pull: bitcoin#27626 Rebased-From: a905954

Github-Pull: bitcoin#27626 Rebased-From: cce9618

…ight This is a change in behavior so that if for some reason we request a block from a peer, we don't allow an unsolicited CMPCT_BLOCK announcement for that same block to cause a request for a full block from the uninvited peer (as some type of request is already outstanding from the original peer) Github-Pull: bitcoin#27626 Rebased-From: 13f9b20

A single outbound slot is required, so if the first two slots are taken by inbound in-flights, the node will reject additional unless they are coming from outbound. This means in the case where a fast sybil peer is attempting to stall out a node, a single high bandwidth outbound peer can mitigate the attack. Github-Pull: bitcoin#27626 Rebased-From: 03423f8

Github-Pull: bitcoin#27626 Rebased-From: d7f359b

Github-Pull: bitcoin#27626 Rebased-From: 86cff8b

Github-Pull: bitcoin#27626 Rebased-From: a905954

Github-Pull: bitcoin#27626 Rebased-From: cce9618

…ight This is a change in behavior so that if for some reason we request a block from a peer, we don't allow an unsolicited CMPCT_BLOCK announcement for that same block to cause a request for a full block from the uninvited peer (as some type of request is already outstanding from the original peer) Github-Pull: bitcoin#27626 Rebased-From: 13f9b20

A single outbound slot is required, so if the first two slots are taken by inbound in-flights, the node will reject additional unless they are coming from outbound. This means in the case where a fast sybil peer is attempting to stall out a node, a single high bandwidth outbound peer can mitigate the attack. Github-Pull: bitcoin#27626 Rebased-From: 03423f8

Github-Pull: bitcoin#27626 Rebased-From: d7f359b

pinheadmz · 2023-06-02T15:00:46Z

@instagibbs does this close #21803 ?

Github-Pull: bitcoin#27626 Rebased-From: 86cff8b

Github-Pull: bitcoin#27626 Rebased-From: a905954

Github-Pull: bitcoin#27626 Rebased-From: cce9618

…ight This is a change in behavior so that if for some reason we request a block from a peer, we don't allow an unsolicited CMPCT_BLOCK announcement for that same block to cause a request for a full block from the uninvited peer (as some type of request is already outstanding from the original peer) Github-Pull: bitcoin#27626 Rebased-From: 13f9b20

A single outbound slot is required, so if the first two slots are taken by inbound in-flights, the node will reject additional unless they are coming from outbound. This means in the case where a fast sybil peer is attempting to stall out a node, a single high bandwidth outbound peer can mitigate the attack. Github-Pull: bitcoin#27626 Rebased-From: 03423f8

Github-Pull: bitcoin#27626 Rebased-From: d7f359b

b8ad322 Unconditionally return when compact block status == READ_STATUS_FAILED (Greg Sanders) cdd3de0 Add tests for parallel compact block downloads (Greg Sanders) e66a5cb Support up to 3 parallel compact block txn fetchings (Greg Sanders) d1a93f5 Only request full blocks from the peer we thought had the block in-flight (Greg Sanders) 38e3af9 Convert mapBlocksInFlight to a multimap (Greg Sanders) a45159b Remove nBlocksInFlight (Greg Sanders) 722361e alias BlockDownloadMap for mapBlocksInFlight (Greg Sanders) Pull request description: Backports: * #27626 * #27743 ACKs for top commit: instagibbs: utACK b8ad322 ajtowns: ACK b8ad322 ; confirmed patches are clean cherry-picks from master, and already tested patches prior to 25.0 release Tree-SHA512: 438901496a5ed927662e62f936e3d1e7ffb727cb235869854983e8e29a68e144eb3bff307d9fc3ae785fb276b67a216b1cce397689252ca49c5d761efc1380ac

instagibbs force-pushed the 2023-05-parallel-block-downloads branch from 1071c38 to 6736955 Compare May 11, 2023 14:10

DrahtBot added the CI failed label May 11, 2023

glozow added the P2P label May 11, 2023

glozow requested a review from sdaftuar May 11, 2023 14:23

jamesob added a commit to chaincodelabs/bmon that referenced this pull request May 11, 2023

Deploy instagibbs' bitcoin/bitcoin#27626

74c4a66

DrahtBot removed the CI failed label May 11, 2023

ajtowns reviewed May 11, 2023

View reviewed changes

dergoegge reviewed May 11, 2023

View reviewed changes

src/net_processing.cpp Outdated Show resolved Hide resolved

instagibbs force-pushed the 2023-05-parallel-block-downloads branch 2 times, most recently from 9400e69 to 1dc948e Compare May 11, 2023 16:57

DrahtBot added CI failed and removed CI failed labels May 11, 2023

instagibbs force-pushed the 2023-05-parallel-block-downloads branch 2 times, most recently from 724ff9e to ceded96 Compare May 11, 2023 18:55

dergoegge reviewed May 12, 2023

View reviewed changes

src/net_processing.cpp Outdated Show resolved Hide resolved

DrahtBot added the CI failed label May 12, 2023

DrahtBot mentioned this pull request May 12, 2023

test: add end-to-end tests for CConnman and PeerManager #26812

Open

instagibbs force-pushed the 2023-05-parallel-block-downloads branch from bbe260f to a9f9145 Compare May 15, 2023 15:18

Sjors pushed a commit to Sjors/bitcoin that referenced this pull request May 27, 2023

alias BlockDownloadMap for mapBlocksInFlight

f5bbc60

Github-Pull: bitcoin#27626 Rebased-From: 86cff8b

Sjors pushed a commit to Sjors/bitcoin that referenced this pull request May 27, 2023

Remove nBlocksInFlight

0a4da49

Github-Pull: bitcoin#27626 Rebased-From: a905954

Sjors pushed a commit to Sjors/bitcoin that referenced this pull request May 27, 2023

Convert mapBlocksInFlight to a multimap

5d1a6b2

Github-Pull: bitcoin#27626 Rebased-From: cce9618

Sjors pushed a commit to Sjors/bitcoin that referenced this pull request May 27, 2023

Add tests for parallel compact block downloads

55e9aa6

Github-Pull: bitcoin#27626 Rebased-From: d7f359b

fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Jun 2, 2023

alias BlockDownloadMap for mapBlocksInFlight

da52709

Github-Pull: bitcoin#27626 Rebased-From: 86cff8b

fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Jun 2, 2023

Remove nBlocksInFlight

ff01fc0

Github-Pull: bitcoin#27626 Rebased-From: a905954

fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Jun 2, 2023

Convert mapBlocksInFlight to a multimap

f1e276c

Github-Pull: bitcoin#27626 Rebased-From: cce9618

fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Jun 2, 2023

Add tests for parallel compact block downloads

ce969f4

Github-Pull: bitcoin#27626 Rebased-From: d7f359b

pinheadmz mentioned this pull request Jun 2, 2023

Processing of new blocks slower than necessary due to overly selective peer selection #21803

Closed

Sjors mentioned this pull request Jun 5, 2023

validation: log which peer sent us a header #27826

Merged

brunoerg mentioned this pull request Jun 5, 2023

ignored getblocktxn takes longer to detect than is possible #22295

Closed

fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Jun 16, 2023

alias BlockDownloadMap for mapBlocksInFlight

722361e

Github-Pull: bitcoin#27626 Rebased-From: 86cff8b

fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Jun 16, 2023

Remove nBlocksInFlight

a45159b

Github-Pull: bitcoin#27626 Rebased-From: a905954

fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Jun 16, 2023

Convert mapBlocksInFlight to a multimap

38e3af9

Github-Pull: bitcoin#27626 Rebased-From: cce9618

fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Jun 16, 2023

Add tests for parallel compact block downloads

cdd3de0

Github-Pull: bitcoin#27626 Rebased-From: d7f359b

Sjors mentioned this pull request Jul 2, 2023

Assertion failed: (data.size() > node.nSendOffset), function SocketSendData, file net.cpp, line 837 #27963

Closed

1 task

luke-jr mentioned this pull request Jul 9, 2023

Bugfix: net_processing: Restore "Already requested" error for FetchBlock #28055

Closed

0xB10C mentioned this pull request Jan 18, 2024

Timeout downloading block #12291

Open

mzumsande mentioned this pull request Jan 19, 2024

Post startup stalling #29281

Open

the-real-vortex-v mentioned this pull request Jan 27, 2024

0.22 One cpu core is at 100% and node shows not responding in the GUI and the RPC can take ages to respond pocketnetteam/pocketnet.core#658

Closed

mzumsande mentioned this pull request Mar 15, 2024

p2p: When close to the tip, download blocks in parallel from additional peers to prevent stalling #29664

Draft

bitcoin locked and limited conversation to collaborators Jun 1, 2024

Parallel compact block downloads, take 3 #27626

Parallel compact block downloads, take 3 #27626

Uh oh!

Conversation

instagibbs commented May 11, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DrahtBot commented May 11, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews

Conflicts

Uh oh!

ajtowns left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ajtowns May 11, 2023

Choose a reason for hiding this comment

Uh oh!

instagibbs May 11, 2023

Choose a reason for hiding this comment

Uh oh!

ajtowns May 11, 2023

Choose a reason for hiding this comment

Uh oh!

dergoegge May 11, 2023

Choose a reason for hiding this comment

Uh oh!

instagibbs May 11, 2023

Choose a reason for hiding this comment

Uh oh!

instagibbs May 12, 2023

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

instagibbs commented May 12, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Sjors commented May 12, 2023

Uh oh!

instagibbs commented May 12, 2023

Uh oh!

Sjors commented May 12, 2023

Uh oh!

instagibbs commented May 12, 2023

Uh oh!

instagibbs commented May 15, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pinheadmz commented Jun 2, 2023

Uh oh!

Uh oh!

instagibbs commented May 11, 2023 •

edited

Loading

DrahtBot commented May 11, 2023 •

edited

Loading

instagibbs commented May 12, 2023 •

edited

Loading

instagibbs commented May 15, 2023 •

edited

Loading