Skip to content

Conversation

TheBlueMatt
Copy link
Contributor

This is a rather long branch, split up into three PRs, with lots of cleanup and tweaks here and there that evntually gets us CNodeStates which have their own locks, locked before cs_main, which makes nodes not always require cs_main to do simple things like call Misbehaving(). The final result is at https://github.com/TheBlueMatt/bitcoin/tree/2017-06-cnodestateaccessors-3 (and next PR is https://github.com/TheBlueMatt/bitcoin/tree/2017-06-cnodestateaccessors-2).

This PR is primarily some small cleanups and the first few commits of #9447 rebased on #10179 and with small tweaks.

The result is that we take the lock for the CNodeState for the node we're processing at the top of ProcessMessages/SendMessages, and hold it until the end, passing around an accessor to the CNodeState as we go. This moves towards a few goals: (1) ability to run ProcessMessages() in paralell, (2) move towards a world where ProcessMessage() is a function in CNodeState, which are objects owned by CConnman.

This isn't a trivial thing to pull off because of lockorder, which introduces two issues:

  • There are a few places we call State() as a result of callbacks from CValidationInterface. Addressed by extending the CValidationInterface threading introduced in Give CValidationInterface Support for calling notifications on the CScheduler Thread #10179 and used for wallet to a few more functions.
  • You can't take two State() accessors at once, because then you'd have undefined lockorder. Luckily with the CValidationInterface threading changes gets most of those, but also the first few commits from Allow 2 simultaneous block downloads #9447 are needed (included here rebased and slightly tweaked) as well as two little tweaks to be able to apply DoS score after unlocking the CNodeState of the node we're processing on.

By the end DEBUG_LOCKORDER features are added to stricly enforce these potential issues (and I've been testing it a bunch in helgrind, etc).

@TheBlueMatt
Copy link
Contributor Author

Ping @theuni.

src/net.cpp Outdated
}

void CConnman::ForEachNode(std::function<void (CNode* pnode)> func)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this discard the rvalue reference && (move schematic)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean it does, but I'm not too worried about requiring move for a callable type (places where we std::move a lambda into this will still get the benifit of calling std::function(Callable&&)). The real concern here, of course, is that it introduces a performance penalty, but its not huge (an extra function call and few if's on top of the callback function call) and we're already calling this with lambda functions, so I dont think we were ever too worried about the performance here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this commit to #10697.

@TheBlueMatt TheBlueMatt force-pushed the 2017-06-cnodestateaccessors-1 branch 2 times, most recently from 653732d to e272fc9 Compare June 28, 2017 21:27
@TheBlueMatt
Copy link
Contributor Author

Rebased on a rebased version of the updated #10179 and filled in more text for the " MarkBlockAsReceived on NewPoWValidBlock at receive." commit message which helps provide a bit more context for this PR:

"Note that this also helps with future per-CNodeState locks, as those
will require no two CNodeState locks to be held at the same time
(for lockorder reasons), and this prevents us from accessing
another peer's mmapBlocksInFlight entry during processing as we can
move the NewPoWValidBlock callback into the CValidationInterface
background thread mechanics."

// (but if it does not build on our best tip, let the SendMessages loop relay it)
if (!IsInitialBlockDownload() && chainActive.Tip() == pindex->pprev)
GetMainSignals().NewPoWValidBlock(pindex, pblock);
GetMainSignals().NewPoWValidBlock(pindex, pblock, chainActive.Tip() == pindex->pprev);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this wants fInitialDownload as a param so that it's always evaluated in its receiving context rather than having subscribers potentially coming to different conclusions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm? I'm not sure I caught your point - we never get here if we already have the block on disk, and the new uses of it in net_processing need to know that we got here for any ProcessNewBlock call they make.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point was that IsInitialBlockDownload() may be false by the time it's checked in a callback on a separate thread (once that stuff comes in), though it was true in AcceptBlock. So it's potentially racy to subscribers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, ok. Yes, is not an issue in any of my work, I believe (worst-case you have an extra few fast-announcements when you first leave IBD), but I added a comment noting the issue.

// TODO: Only process if requested from this peer?
forceProcessing |= mmapBlocksInFlight.count(hash);
// Block is no longer in flight from this peer
MarkBlockAsNotInFlight(hash, pfrom->GetId());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably makes sense for MarkBlockAsNotInFlight() to return whether it removed something for the peer? I think that would take care of the TODO as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the TODO has a "?" on it (and I dunno if we want it or not), I'll leave this for a later PR?

@TheBlueMatt TheBlueMatt force-pushed the 2017-06-cnodestateaccessors-1 branch from e272fc9 to 0651b31 Compare July 7, 2017 21:28
@TheBlueMatt
Copy link
Contributor Author

Rebased.

@TheBlueMatt TheBlueMatt force-pushed the 2017-06-cnodestateaccessors-1 branch from 0651b31 to 0010932 Compare July 7, 2017 23:17
Copy link
Contributor

@morcos morcos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good to me.
utACK 0010932

range.first++;
if (itInFlight->second.first == nodeid) {
if (clearState)
ClearDownloadState(itInFlight);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same line or braces

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -2024,7 +2066,7 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
}

// If we're not close to tip yet, give up and let parallel block fetch work its magic
if (!fAlreadyInFlight && !CanDirectFetch(chainparams.GetConsensus()))
if (!fInFlightFromSamePeer && !CanDirectFetch(chainparams.GetConsensus()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit hesitant about this change. I can't really remember exactly why we have this CanDirectFetch logic in compact block processing. But this could make us unnecessarily avoid an optimistic compact block reconstruction if for some reason it took more than 200 mins to receive 2 blocks (right we do reconstructions if within 2 blocks of the announced block?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually now that I think about it, that concern exists prior to the changes in this PR.
Why do we have the CanDirectFetch guard for compact block processing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think you'd have to go 200 minutes for one block for it to matter (if you go 200 minutes for two, and missed the first one, you wont compact-download that older one, but thats OK, you were probably offline or something anyway). Still, we dont have a much better way to say "ok, I really, really only want to do this if I'm caught up" (ie cause otherwise mempool will be worthless). We could drop it - its not the end of the world to take the extra RT to use the compact blocks and get a tiny bit of compression, but at that point you may very well be just as well off just doing a regular block fetch and not taking the extra RT.

@TheBlueMatt TheBlueMatt force-pushed the 2017-06-cnodestateaccessors-1 branch 2 times, most recently from 7145218 to ca66c4a Compare July 11, 2017 15:15
@TheBlueMatt
Copy link
Contributor Author

TheBlueMatt commented Jul 11, 2017

Rebased to remove the useless old commits, but no changes. Still would be nice to see thtis for 0.15, but not a huge deal if it misses.

@laanwj laanwj added this to the 0.15.0 milestone Jul 11, 2017
@morcos
Copy link
Contributor

morcos commented Jul 11, 2017

utACK ca66c4a

range.first++;
if (itInFlight->second.first == nodeid) {
if (clearState)
ClearDownloadState(itInFlight);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing block or same line as if ().

bool CScheduler::AreThreadsServicingQueue() const {
return nThreadsServicingQueue;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove extra line.

@@ -139,3 +139,69 @@ size_t CScheduler::getQueueInfo(boost::chrono::system_clock::time_point &first,
}
return result;
}

bool CScheduler::AreThreadsServicingQueue() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{ in new line for all functions.

it->second.first != pfrom->GetId()) {
bool fExpectedBLOCKTXN = false;
std::pair<BlockDownloadMap::iterator, BlockDownloadMap::iterator> rangeInFlight = mmapBlocksInFlight.equal_range(resp.blockhash);
BlockDownloadMap::iterator it = rangeInFlight.first;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove it and mutate rangeInFlight.first.

while (it != rangeInFlight.second) {
if (it->second.first == pfrom->GetId()) {
if (it->second.second->partialBlock)
fExpectedBLOCKTXN = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (it->second.first == pfrom->GetId()) {
    fExpectedBLOCKTXN = it->second.second->partialBlock;
    break;
}

std::map<uint256, std::pair<NodeId, std::list<QueuedBlock>::iterator> >::iterator it = mapBlocksInFlight.find(resp.blockhash);
if (it == mapBlocksInFlight.end() || !it->second.second->partialBlock ||
it->second.first != pfrom->GetId()) {
bool fExpectedBLOCKTXN = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected_BLOCKTXN?

@@ -2382,7 +2423,10 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
LOCK(cs_main);
// Also always process if we requested the block explicitly, as we may
// need it even though it is not a candidate for a new best tip.
forceProcessing |= MarkBlockAsReceived(hash);
// TODO: Only process if requested from this peer?
forceProcessing |= mmapBlocksInFlight.count(hash);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace these mmapBlocksInFlight.count() with, for instance:

forceProcessing |= IsBlockInFlight(hash);

@@ -105,7 +105,8 @@ namespace {
bool fValidatedHeaders; //!< Whether this block has validated headers at the time of request.
std::unique_ptr<PartiallyDownloadedBlock> partialBlock; //!< Optional, used for CMPCTBLOCK downloads
};
std::map<uint256, std::pair<NodeId, std::list<QueuedBlock>::iterator> > mapBlocksInFlight;
typedef std::multimap<uint256, std::pair<NodeId, std::list<QueuedBlock>::iterator>> BlockDownloadMap;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion, add NodeId to QueuedBlock structure?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be a bit cleaner, maybe, but would also make it redundant.

@TheBlueMatt
Copy link
Contributor Author

This should likely be un-tagged 0.15. Its not a critical enough bugfix to merit merging at this late stage.

@maflcko maflcko removed this from the 0.15.0 milestone Jul 18, 2017
@TheBlueMatt TheBlueMatt force-pushed the 2017-06-cnodestateaccessors-1 branch from ca66c4a to 4b0c4b3 Compare August 3, 2017 21:12
morcos and others added 4 commits August 3, 2017 17:30
This pushes some "is this callback useful" logic down into
net_processing, which is useful for later changes as it allows for
more notifications to be used.
The received block could be malleated, so this is both simpler, and
supports parallel downloads.
…ight

This is a change in behavior so that if for some reason we request a block from a peer, we don't allow an unsolicited CMPCT_BLOCK announcement for that same block to cause a request for a full block from the uninvited peer (as some type of request is already outstanding from the original peer)
@TheBlueMatt
Copy link
Contributor Author

Closing as I believe I'm taking a different approach to beter network parallelization, however the commits here are still part of #10984, where they will remain.

laanwj added a commit that referenced this pull request Sep 7, 2017
2525b97 net: stop both net/net_processing before destroying them (Cory Fields)
80e2e9d net: drop unused connman param (Cory Fields)
8ad663c net: use an interface class rather than signals for message processing (Cory Fields)
28f11e9 net: pass CConnman via pointer rather than reference (Cory Fields)

Pull request description:

  See individual commits.
  Benefits:
  - Allows us to begin moving stuff out of CNode and into CNodeState (after #10652 and follow-ups)
  - Drops boost dependency and overhead
  - Drops global signal registration
  - Friendlier backtraces

Tree-SHA512: af2038c959dbec25f0c90c74c88dc6a630e6b9e984adf52aceadd6954aa463b6aadfccf979c2459a9f3354326b5077ee02048128eda2a649236fadb595b66ee3
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Sep 24, 2019
…e class

2525b97 net: stop both net/net_processing before destroying them (Cory Fields)
80e2e9d net: drop unused connman param (Cory Fields)
8ad663c net: use an interface class rather than signals for message processing (Cory Fields)
28f11e9 net: pass CConnman via pointer rather than reference (Cory Fields)

Pull request description:

  See individual commits.
  Benefits:
  - Allows us to begin moving stuff out of CNode and into CNodeState (after bitcoin#10652 and follow-ups)
  - Drops boost dependency and overhead
  - Drops global signal registration
  - Friendlier backtraces

Tree-SHA512: af2038c959dbec25f0c90c74c88dc6a630e6b9e984adf52aceadd6954aa463b6aadfccf979c2459a9f3354326b5077ee02048128eda2a649236fadb595b66ee3
codablock pushed a commit to codablock/dash that referenced this pull request Sep 26, 2019
…e class

2525b97 net: stop both net/net_processing before destroying them (Cory Fields)
80e2e9d net: drop unused connman param (Cory Fields)
8ad663c net: use an interface class rather than signals for message processing (Cory Fields)
28f11e9 net: pass CConnman via pointer rather than reference (Cory Fields)

Pull request description:

  See individual commits.
  Benefits:
  - Allows us to begin moving stuff out of CNode and into CNodeState (after bitcoin#10652 and follow-ups)
  - Drops boost dependency and overhead
  - Drops global signal registration
  - Friendlier backtraces

Tree-SHA512: af2038c959dbec25f0c90c74c88dc6a630e6b9e984adf52aceadd6954aa463b6aadfccf979c2459a9f3354326b5077ee02048128eda2a649236fadb595b66ee3
codablock pushed a commit to codablock/dash that referenced this pull request Sep 29, 2019
…e class

2525b97 net: stop both net/net_processing before destroying them (Cory Fields)
80e2e9d net: drop unused connman param (Cory Fields)
8ad663c net: use an interface class rather than signals for message processing (Cory Fields)
28f11e9 net: pass CConnman via pointer rather than reference (Cory Fields)

Pull request description:

  See individual commits.
  Benefits:
  - Allows us to begin moving stuff out of CNode and into CNodeState (after bitcoin#10652 and follow-ups)
  - Drops boost dependency and overhead
  - Drops global signal registration
  - Friendlier backtraces

Tree-SHA512: af2038c959dbec25f0c90c74c88dc6a630e6b9e984adf52aceadd6954aa463b6aadfccf979c2459a9f3354326b5077ee02048128eda2a649236fadb595b66ee3
barrystyle pushed a commit to PACGlobalOfficial/PAC that referenced this pull request Jan 22, 2020
…e class

2525b97 net: stop both net/net_processing before destroying them (Cory Fields)
80e2e9d net: drop unused connman param (Cory Fields)
8ad663c net: use an interface class rather than signals for message processing (Cory Fields)
28f11e9 net: pass CConnman via pointer rather than reference (Cory Fields)

Pull request description:

  See individual commits.
  Benefits:
  - Allows us to begin moving stuff out of CNode and into CNodeState (after bitcoin#10652 and follow-ups)
  - Drops boost dependency and overhead
  - Drops global signal registration
  - Friendlier backtraces

Tree-SHA512: af2038c959dbec25f0c90c74c88dc6a630e6b9e984adf52aceadd6954aa463b6aadfccf979c2459a9f3354326b5077ee02048128eda2a649236fadb595b66ee3
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants