-
Notifications
You must be signed in to change notification settings - Fork 37.7k
Allow 2 simultaneous block downloads #9447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
33095f6
to
02da045
Compare
Seems to fail |
src/net_processing.cpp
Outdated
|
||
if (pindex->nStatus & BLOCK_HAVE_DATA) // Nothing to do here | ||
return true; | ||
|
||
LogPrintf("ChainWork check at height %d new: %s tip: %s\n",pindex->nHeight,pindex->nChainWork.GetHex(),chainActive.Tip()->nChainWork.GetHex()); | ||
if (pindex->nChainWork <= chainActive.Tip()->nChainWork || // We know something better |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not make this < instead of <= ? why do we avoid requesting compact blocks for competing best blocks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@morcos I couldn't see this added line in any of the 3 PRs you mentioned.
src/net_processing.cpp
Outdated
@@ -1893,24 +2022,41 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv, | |||
fBlockReconstructed = true; | |||
} | |||
} | |||
if (pindex->nHeight == 165) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the deal with block height 165?
src/net_processing.cpp
Outdated
map<uint256, pair<NodeId, list<QueuedBlock>::iterator> >::iterator it = mapBlocksInFlight.find(resp.blockhash); | ||
if (it == mapBlocksInFlight.end() || !it->second.second->partialBlock || | ||
it->second.first != pfrom->GetId()) { | ||
bool fExpectedBLOCKTXN = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very much approve of the variable namings chosen in various places.
src/net_processing.cpp
Outdated
// We seem to be rather well-synced, so it appears pfrom was the first to provide us | ||
// with this block! Let's get them to announce using compact blocks in the future. | ||
MaybeSetPeerAsAnnouncingHeaderAndIDs(nodestate, pfrom, connman); | ||
if (nodestate->fSupportsDesiredCmpctVersion && vGetData.size() == 1 && mmapBlocksInFlight.size() == mmapBlocksInFlight.count(vGetData[0].hash) && pindexLast->pprev->IsValid(BLOCK_VALID_CHAIN)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not make this <=2 rather than ==1 (for mapBlocksInFlight.size())? this way if two blocks get announced in close proximity we can request compact blocks for both of them (rather than compact block for the oldest, and full block for the most recent).
I like the coding style and elegance of this. Will help with testing. |
@rebroad As mentioned in the PR comment, this is built off 3 other PR's. All of your comments either belong on those PR's or are related to the last commit which was just for debugging the travis failure and will be removed. I have just left it there for now in case anyone else wants to see the error details. |
@morcos claimed on IRC the test failures might be (in part) due to the issue mentioned at #9375 (comment) |
@morcos although most of the lines I am commenting on have not been introduced by you, given you are changing the code near to them, I tought it was a good opportunity to suggest these changes as I believe they ought to be changed at some point, so perhaps could be with this PR. |
Rebased with the newest commit from #9375 which fixes the failure. For clarity only the 5 commits on which I'm the author are meant for review here. The others are contained in the linked PR's... |
OT: Imo it makes sense to octomerge all pulls which this pull depends on and rebase the commits of this pull on top of the merge commit. Thus, the original commit hashes are preserved and it is clear what was old and should be reviewed in other pulls. Also, it is easier to see the fresh commits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pre-rebase utACK. I still think it might make sense to only allow a second if a HB peer is responsible for one of them, but I think that kind of change is desired is complementary on top of this.
src/net_processing.cpp
Outdated
if (fPeerWasFirstRequest) { | ||
// We requested this block, but its far into the future, so our | ||
// mempool will probably be useless - request the block normally | ||
// Only allow a the first peer to request a full block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Only ask for the full block from the first peer we requested from"?
state->nBlocksInFlightValidHeaders -= itInFlight->second.second->fValidatedHeaders; | ||
if (state->nBlocksInFlightValidHeaders == 0 && itInFlight->second.second->fValidatedHeaders) { | ||
// Last validated block on the queue was received. | ||
nPeersWithValidatedDownloads--; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could just do the -= itInFlight->second.second->fValidatedHeaders
as above to match.
src/net_processing.cpp
Outdated
BlockDownloadMap::iterator itInFlight = range.first; | ||
ClearDownloadState(itInFlight); | ||
range.first++; | ||
mmapBlocksInFlight.erase(itInFlight); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since C++11 can also just capture the return value and set as range.first
instead of worrying about iterator invalidation.
src/net_processing.cpp
Outdated
@@ -105,7 +105,8 @@ namespace { | |||
bool fValidatedHeaders; //!< Whether this block has validated headers at the time of request. | |||
std::unique_ptr<PartiallyDownloadedBlock> partialBlock; //!< Optional, used for CMPCTBLOCK downloads | |||
}; | |||
map<uint256, pair<NodeId, list<QueuedBlock>::iterator> > mapBlocksInFlight; | |||
typedef std::multimap<uint256, pair<NodeId, list<QueuedBlock>::iterator>> BlockDownloadMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think indexing might by NodeId
would be better than using a multimap, because it would ensure that there couldn't be multiple entries for a block from the same node. I'd suggest:
typedef map<pair<uint256, NodeId>, list<QueuedBlock>::iterator> BlockDownloadMap;
This also would let you replace some of the while loops added here with direct lookups. I thought some of these (especially the while loop with the break statement setting fExpectedBLOCKTXN
) were kind of confusing.
The downside of using a map is that you wouldn't have an equal_range
method to call in the places that do require a loop. But you could replace those equal_range calls with calls to an equivalent helper function:
pair<BlockDownloadMap::iterator, BlockDownloadMap::iterator>
GetBlockDownloadRange(BlockDownloadMap& blocks, const uint256& hash) {
return {blocks.lower_bound({hash, numeric_limits<NodeId>::min()}),
blocks.upper_bound({hash, numeric_limits<NodeId>::max()})};
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure about this change. In particular there are a lot of mmapBlocksInFlight.count(hash) calls that would get a bit less clear... I'll think about it some more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I noticed that in the later commits. You could have a count helper function returning std::distance(range.first, range.second). I do think a map is a better way to represent the data, but the c++ map implementation does makes it a little awkward. Anyway, it's just something to consider.
@@ -2292,6 +2292,18 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv, | |||
} | |||
pindexWalk = pindexWalk->pprev; | |||
} | |||
// Special case for second cmpctblock request of tip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you expand this comment a little bit to describe the condition being checked? In particular I don't understand how the IsWitnessEnabled and fHaveWitness parts relate to making the request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, maybe consider just pulling the MSG_CMPCT_BLOCK setting a bit later down up to here and just requesting the block in this if statement (possibly before the while loop above).
src/net_processing.cpp
Outdated
@@ -1947,7 +1947,7 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv, | |||
|
|||
if (pindex->nChainWork <= chainActive.Tip()->nChainWork || // We know something better | |||
pindex->nTx != 0) { // We had this block at some point, but pruned it | |||
if (fAlreadyInFlight) { | |||
if (fInFlightFromSamePeer) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change ("Only request full blocks from the peer we thought had the block in-flight") a change in behavior? Or is it just a cleanup after the previous multimap commit? It seems like this commit should be merged into the preceding or following one, or the commit message should be extended to say what the effect is, what motivates it.
Rebased and I think done the way @MarcoFalke suggested. Addressed feedback |
0b76739
to
ca7c450
Compare
rebased |
ca7c450
to
d1c73b9
Compare
…ight This is a change in behavior so that if for some reason we request a block from a peer, we don't allow an unsolicited CMPCT_BLOCK announcement for that same block to cause a request for a full block from the uninvited peer (as some type of request is already outstanding from the original peer)
d1c73b9
to
5eb857d
Compare
rebased |
@@ -274,6 +274,41 @@ void InitializeNode(CNode *pnode, CConnman& connman) { | |||
PushNodeVersion(pnode, connman, GetTime()); | |||
} | |||
|
|||
// Requires cs_main |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you AssertLockHeld?
state->nStallingSince = 0; | ||
} | ||
|
||
// Requires cs_main. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you AssertLockHeld?
@@ -2292,6 +2292,18 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv, | |||
} | |||
pindexWalk = pindexWalk->pprev; | |||
} | |||
// Special case for second cmpctblock request of tip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, maybe consider just pulling the MSG_CMPCT_BLOCK setting a bit later down up to here and just requesting the block in this if statement (possibly before the while loop above).
mmapBlocksInFlight.size() == mmapBlocksInFlight.count(pindexLast->GetBlockHash()) && | ||
mmapBlocksInFlight.count(pindexLast->GetBlockHash()) < MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK && | ||
!(pindexLast->nStatus & BLOCK_HAVE_DATA) && | ||
(!IsWitnessEnabled(pindexLast->pprev, chainparams.GetConsensus()) || State(pfrom->GetId())->fHaveWitness) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the strange case that a peer does not set the fHaveWItness service bit, but does announce compact blocks v2, I believe this line would result in a full block request. More generally, because the two if statements always have to be in sync to avoid this, I really prefer we pull the actual request logic into this if statement.
src/net_processing.cpp
Outdated
@@ -2072,8 +2086,7 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr | |||
// We want to be a bit conservative just to be extra careful about DoS | |||
// possibilities in compact block processing... | |||
if (pindex->nHeight <= chainActive.Height() + 2) { | |||
if ((!fAlreadyInFlight && nodestate->nBlocksInFlight < MAX_BLOCKS_IN_TRANSIT_PER_PEER) || | |||
fInFlightFromSamePeer) { | |||
if ((countPartialBlocksStarted < MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK && nodestate->nBlocksInFlight < MAX_BLOCKS_IN_TRANSIT_PER_PEER)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I read this, the use of countPartialBlocksStarted, instead of a countBlocksStarted, means that we will request up to two compact blocks at a time, even if we are already requesting the full block from a peer. This seems strange to me, why not just max 2 in-flights at the same time for a given block, with the second never being a full block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, why drop the fInFlightFromSamePeer option? It looks like we'll never getblocktxn from two peers simultaneously?
!(pindexLast->nStatus & BLOCK_HAVE_DATA) && | ||
(!IsWitnessEnabled(pindexLast->pprev, chainparams.GetConsensus()) || State(pfrom->GetId())->fHaveWitness) && | ||
nodestate->fSupportsDesiredCmpctVersion) { | ||
vToFetch.push_back(pindexLast); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Further, simply adding the entry to vToFetch here may result in a null pointer dereference, I believe. If a peer announces two headers messages back-to-back, the first time we will MarkBlockAsInFlight to them, and the second time we'll hit this condition and add the entry to vToFetch again (which should also be fixed). Further down, we'll MarkBlockAsInFlight to them again, but MarkBlockAsInFlight requires that, if the block is already in-flight to the same peer, pit be something non-NULL as it will be dereferenced, but it is NULL in the call below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the above needs fixing in three ways - MarkBlockAsInFlight needs to be more robust against NULL pit, the request needs to move into this if statement and skip the remainder of this block of code, and we shouldn't double-request from the same peer.
@@ -2094,6 +2107,7 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr | |||
return true; | |||
} else if (status == READ_STATUS_FAILED) { | |||
// Duplicate txindexes, the block is now in-flight, so just request it | |||
// NOTE: This is the one place two full block requests can be outstanding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so why not just check fPeerWasFirstRequest and MarkBlockAsNotInFlight otherwise?
@@ -9,6 +9,8 @@ | |||
#include "net.h" | |||
#include "validationinterface.h" | |||
|
|||
/** Maximum number of outstanding CMPCTBLOCK requests for the same block. */ | |||
static const int MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be configurable? Might make sense for miners to have this higher.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely not, asking all your peers for a copy of the block poses the same network-DoS risks as connecting to hundreds of nodes, which people like to do because they believe it will help (though it usually actually hurts) them get their blocks out faster.
More importantly, if you're a miner and have good peers, I think it'd be somewhat rare for you to receive a third compact block announce before the first can respond to your blocktxn request, at least it will be once we get proper multi-threaded ProcessMessages implemented to respond to blocktxn requests for the latest block in the background (see #10652 for the beginnings of the steps to do so).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think the upper limit is before it would generally cause a negative impact? Maybe just have an upper limit like we do for max outbound connections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably around 2 :p. Its really only useful if your peer got stuck doing something and wasn't able to respond or is being actively malicious. Once we've fixed the block-on-block-validation-before-responding-to-blocktxn-requests issue, it should be somewhat rare for this to help more than a very small amount.
More concurrency! concept ACK |
This should likely be closed in favor of #10984. |
Should this still be closed in favor of #10984? |
This is built off of #9375, #9252, and #9400. I'll properly rebase it when those are merged.
It provides special logic to issue a second getdata request for a cmpctblock if there is only 1 request outstanding and we think the announced block would be our new tip.
It also changes the cmpctblock processing logic to be first come first served (regardless of in flight block requests) and allow two simultaneous compact block reconstructions.
So in particular, it is now possible to:
Upon receiving a cmpctblock from peers 1 or 2 after this, it will be treated the same as receiving an unsolicited cmpctblock from peer 5 and it will attempt opportunistic reconstruction but not request blocktxn. It will also remove the block in flight for peer 1 or 2.
I believe that the multiple requests is an acceptable increase in bandwidth in order to provide robustness against a single peer stalling us at any point in the logic. The reason to allow a second request for a cmpctblock is to allow LB peers a chance to become HB even if there is a staller who is always announcing first, otherwise, only existing HB peers would have a chance to deliver the block.