Parallel compact block downloads, take 3 #27626
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Reviews: See the guideline for information on the review process. If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts: Reviewers, this pull request conflicts with the following ones. If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
Force-pushed from 1071c38 to 6736955
Seems okay at first glance. It would be nice to have a "blockrequest.cpp" module that collects all this logic together. Perhaps we want to keep this change simple so it's a smaller backport, and refactor later, though.
src/net_processing.h (outdated):
/** Greg: Setting to 3 to match number of high bandwidth peers... ~ensures one is outbound */
static const unsigned int MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK = 3;
/** Maximum number of missing transactions for a non-first getblocktxn request */
static const unsigned int MAX_GETBLOCKTXN_TXN_AFTER_FIRST_IN_FLIGHT = 10;
Out of the last 612 compact blocks my node reconstructed that required requesting transactions, 277 needed 10 or fewer (45%), 441 needed 50 or fewer (72%), 488 needed 100 or fewer (80%), and 548 needed 500 or fewer (90%). I'd probably prefer 50 or 100 rather than 10, I guess?
(In simpler times, 1086/1162 (93%) needed 10 or fewer, 1140/1162 (98%) needed 50 or fewer, and 1144/1162 (98.5%) needed 100 or fewer. Again, I'm not including blocks that requested 0 txs in those denominators.)
What if we just removed the limit, since it's not much of a protection, especially if the missing 10 txns are *nscriptions?
I think the limit's fine; when you start up with an empty mempool, you'll be requesting most of the transactions in the block anyway, and there's no need to do that three times in parallel. Could perhaps have the limit work the other way: if you've already got N txs and need K txs, and 0 < K <= N/4 + 10; then if the block just has ten 100kvB txs, you won't request them multiple times.
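A minimal sketch of that relative limit as a standalone helper (an illustration of the suggestion, not code from the PR; the function name and parameters are hypothetical, with N = transactions already available and K = transactions still missing):

```cpp
#include <cstddef>

// Hypothetical helper (not from the PR): allow an additional, parallel
// getblocktxn only when the number of missing transactions K is small
// relative to the number N we already have: 0 < K <= N/4 + 10.
bool AllowExtraGetBlockTxn(std::size_t n_have, std::size_t n_missing)
{
    return n_missing > 0 && n_missing <= n_have / 4 + 10;
}
```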
IMO, it doesn't seem worthwhile to have the limit:
- the good case (small K) should be far more common than the bad one (large K)
- the limit makes relay of blocks with large K less reliable w.r.t. stalling (reduces to what we have now)
- the limit adds code complexity and a maintenance burden. 10 is not a good value according to your data, and if #10984 (Allow 2 simultaneous (compact-)block downloads) had been merged, we would need to re-evaluate now. Who's to say this won't change again?
If the issue is during warmup, that's a pretty small window, and we typically won't actually do parallel fetching; reaching 3 parallel requests seems rare even when warmed up with a limit of 10. I'll gather some more stats on that front, to see how many parallel block downloads are initiated.
This "warmup" should really only be an issue after we've seen new blocks from 3+ peers since we don't persist hb peers across restarts and we have to wait for MaybeSetPeerAsAnnouncingHeaderAndIDs
to be called on multiple unique peers. I now suspect the warmup period handles itself? My logs show 4+ blocks per node with only one partial block.
Force-pushed from 9400e69 to 1dc948e
Force-pushed from 724ff9e to ceded96
I'll clean up history later when we're closer to ACKs.
Can you elaborate on this? Are you saying that falling back to full block download is pointless?
@Sjors Basically previous efforts would allow the peer to send a header, not a compact block, and we would fire off an additional request. IMO it's unlikely to make an impact, as high-bandwidth peers are already demonstrably fast, and this approach allows us to better control who takes up the additional slots.
So that earlier attempt would solicit a compact block from a (BIP 152) low-bandwidth mode peer, in response to it announcing a header. That's similar to what we already do in [...].
Whereas in this PR we focus on the high-bandwidth mode peers, which are allowed to send us a compact block unsolicited.
This approach seems fine, but the other attempts were never merged, which is why I am confused by the "it will not help" phrase. Are you saying these earlier attempts failed in some experiment, are they useless in general, or is this new approach just better (and less complex?)?
Makes it 3 parallel block downloads: the first can be of any kind, while the 2nd and 3rd must be compact block announcements from high-bandwidth peers.
This PR does not take advantage of low-bandwidth compact block peers' message patterns, in other words.
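A rough sketch of that slot rule; only the constant MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK appears in the PR, the helper and its parameters here are illustrative:

```cpp
#include <cstddef>

// Cap from the PR: up to 3 parallel fetches per block.
constexpr std::size_t MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK{3};

// Hypothetical helper: the first in-flight request for a block may be of any
// kind; additional slots only go to compact block announcements from
// BIP 152 high-bandwidth peers, up to the cap.
bool MayStartAnotherFetch(std::size_t already_in_flight,
                          bool is_cmpctblock_announcement,
                          bool peer_is_high_bandwidth)
{
    if (already_in_flight == 0) return true;  // first slot: any request type
    if (already_in_flight >= MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK) return false;
    // additional slots: compact block announcements from high-bandwidth peers only
    return is_cmpctblock_announcement && peer_is_high_bandwidth;
}
```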
I was getting some sybil-stalls above the missing-txn <= 10 limit, so for now I've uncapped it. I also added logic to ensure that at least one of the 3 slots is taken by an outbound peer. This should be more robust against sybils that are able to take up all inbound hb slots, e.g.:
Added more extensive tests.
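For the outbound-slot logic mentioned above, a minimal sketch under the assumption that one of the 3 slots is held back for outbound peers (the helper and its parameters are hypothetical, not the PR's actual implementation):

```cpp
#include <cstddef>

// Hypothetical check: inbound peers may only fill a slot while at least one
// of the 3 slots remains available for an outbound peer.
bool MayTakeSlot(bool peer_is_outbound,
                 std::size_t inbound_in_flight,
                 std::size_t total_in_flight)
{
    constexpr std::size_t MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK{3};
    if (total_in_flight >= MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK) return false;
    if (peer_is_outbound) return true;  // outbound peers can use any free slot
    // keep one slot in reserve for an outbound peer
    return inbound_in_flight + 1 < MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK;
}
```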
Force-pushed from bbe260f to a9f9145
Github-Pull: bitcoin#27626 Rebased-From: 86cff8b
Github-Pull: bitcoin#27626 Rebased-From: a905954
Github-Pull: bitcoin#27626 Rebased-From: cce9618
Only request full blocks from the peer we thought had the block in-flight. This is a change in behavior so that if for some reason we request a block from a peer, we don't allow an unsolicited CMPCT_BLOCK announcement for that same block to cause a request for a full block from the uninvited peer (as some type of request is already outstanding from the original peer). Github-Pull: bitcoin#27626 Rebased-From: 13f9b20
A single outbound slot is required, so if the first two slots are taken by inbound in-flights, the node will reject additional ones unless they are coming from outbound peers. This means in the case where a fast sybil peer is attempting to stall out a node, a single high bandwidth outbound peer can mitigate the attack. Github-Pull: bitcoin#27626 Rebased-From: 03423f8
Github-Pull: bitcoin#27626 Rebased-From: d7f359b
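To illustrate the "uninvited peer" behavior from the commit message above, here is a simplified sketch; the container and type names are stand-ins for net_processing's in-flight bookkeeping (which, per the commit list, uses a multimap keyed by block hash), not the actual code:

```cpp
#include <cstdint>
#include <map>

using NodeId = int64_t;
using BlockHashT = uint64_t;  // simplified stand-in for uint256

// Only fall back to a full-block request if the announcing peer is one we
// already have this block in flight from; otherwise some other peer already
// has an outstanding request and the announcement is ignored for this purpose.
bool MayFallBackToFullBlock(const std::multimap<BlockHashT, NodeId>& blocks_in_flight,
                            const BlockHashT& hash, NodeId announcer)
{
    const auto range = blocks_in_flight.equal_range(hash);
    if (range.first == range.second) return true;  // nothing outstanding for this block
    for (auto it = range.first; it != range.second; ++it) {
        if (it->second == announcer) return true;  // we already requested from this peer
    }
    return false;  // another peer has a request outstanding; don't add a full-block one
}
```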
@instagibbs does this close #21803?
b8ad322 Unconditionally return when compact block status == READ_STATUS_FAILED (Greg Sanders)
cdd3de0 Add tests for parallel compact block downloads (Greg Sanders)
e66a5cb Support up to 3 parallel compact block txn fetchings (Greg Sanders)
d1a93f5 Only request full blocks from the peer we thought had the block in-flight (Greg Sanders)
38e3af9 Convert mapBlocksInFlight to a multimap (Greg Sanders)
a45159b Remove nBlocksInFlight (Greg Sanders)
722361e alias BlockDownloadMap for mapBlocksInFlight (Greg Sanders)

Pull request description:

Backports:
* #27626
* #27743

ACKs for top commit:
instagibbs: utACK b8ad322
ajtowns: ACK b8ad322; confirmed patches are clean cherry-picks from master, and already tested patches prior to 25.0 release

Tree-SHA512: 438901496a5ed927662e62f936e3d1e7ffb727cb235869854983e8e29a68e144eb3bff307d9fc3ae785fb276b67a216b1cce397689252ca49c5d761efc1380ac
This is an attempt at mitigating #25258, which is a revival of #10984, which is a revival of #9447.

This PR attempts to mitigate a single case, where high-bandwidth peers can bail us out of a flaky peer not completing blocks for us. We allow up to 2 additional getblocktxn requests per unique block. This would hopefully allow the chance for an honest high-bandwidth peer to hand us the transactions even if the first in-flight peer stalls out.

In contrast to the previous effort, MAX_GETBLOCKTXN_TXN_AFTER_FIRST_IN_FLIGHT is removed, in favor of aiding recovery during turbulent mempools.
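As a rough illustration of the per-block cap (the original request plus up to 2 additional parallel ones), here is a minimal sketch assuming a simplified multimap of in-flight requests; only the constant name comes from the PR, the key/value types and helper are hypothetical:

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

using NodeId = int64_t;
using BlockHashT = uint64_t;  // simplified stand-in for uint256

// Cap from net_processing.h shown earlier: one original request plus up to
// 2 additional parallel compact block fetches per unique block.
constexpr std::size_t MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK{3};

// Count how many peers already have this block in flight before firing off
// another getblocktxn; the multimap mirrors the "Convert mapBlocksInFlight
// to a multimap" commit, with simplified types.
bool MayAddParallelFetch(const std::multimap<BlockHashT, NodeId>& blocks_in_flight,
                         const BlockHashT& hash)
{
    return blocks_in_flight.count(hash) < MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK;
}
```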