Skip to content

Conversation

sdaftuar
Copy link
Member

@sdaftuar sdaftuar commented Oct 20, 2017

This builds on #11490 and is related to the work there, but I've separated this for review in order to not hold up #11490 (which I think is done).

This pr attempts to implement a strategy suggested by @gmaxwell, of choosing a peer for outbound eviction if our tip has not been updated in a while. I'd like to suggest this for consideration in 0.15.0.2 as well, as it is designed to mitigate p2p disruption in the event that all our outbound peers stop relaying blocks to us.

@theuni If you have a chance to review, I could use feedback on the first new commit here (Add hacky accessors for manipulating connman peers in tests). I did the quickest thing I could in order to get a unit test working, but I'm pretty sure I violated all sorts of design goals, so I could use some guidance about the right way to add peers to connman in a unit test.

sdaftuar and others added 4 commits October 19, 2017 20:29
When in IBD, we'd like to use all our outbound peers to help us
sync the chain.  Disconnect any outbound peers whose headers have
insufficient work.
Currently we have no rotation of outbound peers.  If an outbound peer
stops serving us blocks, or is on a consensus-incompatible chain with
less work than our tip (but otherwise valid headers), then we will never
disconnect that peer, even though that peer is using one of our 8
outbound connection slots.  Because we rely on our outbound peers to
find an honest node in order to reach consensus, allowing an
incompatible peer to occupy one of those slots is undesirable,
particularly if it is possible for all such slots to be occupied by such
peers.

Protect against this by always checking to see if a peer's best known
block has less work than our tip, and if so, set a 20 minute timeout --
if the peer is still not known to have caught up to a chain with as much
work as ours after 20 minutes, then send a single getheaders message,
wait 2 more minutes, and if a better header hasn't been received by then,
disconnect that peer.

Note:

- we do not require that our peer sync to the same tip as ours, just an
equal or greater work tip.  (Doing otherwise would risk partitioning the
network in the event of a chain split, and is also unnecessary.)

- we pick 4 of our outbound peers and do not subject them to this logic,
to be more conservative. We don't wish to permit temporary network
issues (or an attacker) to excessively disrupt network topology.
@fanquake fanquake added the P2P label Oct 20, 2017
@sdaftuar sdaftuar force-pushed the 2017-10-stale-tip-eviction branch from f2e22d1 to 2ea6f2c Compare October 20, 2017 19:30
@@ -2862,3 +2878,15 @@ uint64_t CConnman::CalculateKeyedNetGroup(const CAddress& ad) const

return GetDeterministicRandomizer(RANDOMIZER_ID_NETGROUP).Write(vchNetGroup.data(), vchNetGroup.size()).Finalize();
}

void CConnman::AddToVNodes(CNode &node)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this only meant for tests?

Otherwise replace the following code with a call to this function?

bitcoin/src/net.cpp

Lines 1140 to 1143 in ff92fbf

{
LOCK(cs_vNodes);
vNodes.push_back(pnode);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is only for the unit test; I'm hoping someone is able to suggest a better way of doing this than what I did here to get the test working.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sdaftuar see 6f6e7e0 for an alternative that doesn't change the production interface.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with leaving the hacks for now, though please make them private and a friend to the boost test functions.

After we merge #11457 (and the follow-up for addrman: https://github.com/theuni/bitcoin/tree/move-addrdb), I believe we'll have enough broken out to create 2 CConnman instances and run them against each-other for these tests. At that point, we can remove the test hacks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@promag Thanks for the patch, I'm going to include it in #11560

@sdaftuar sdaftuar force-pushed the 2017-10-stale-tip-eviction branch from 2ea6f2c to c071f62 Compare October 23, 2017 12:48
int64_t worst_peer_connect_time = 0;
for (auto it = mapNodeState.begin(); it != mapNodeState.end(); ++it) {
int64_t connect_time = 0;
if (connman->ForNode(it->first, [&](CNode *pnode){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we might turn this logic around: rather than marking for disconnection, mark for protection from disconnection. For example, each time we get a new header that extends our tip, we could do:

connman->ResetOutboundEvictionTimer(id, current_time + new_timeout);

Then when we need a connection slot, if any timer has expred, whichever has been expired the longest gets dropped.
That saves us from having to care about what type of node (inbound/outbound/oneshot/manual/etc.) this is, and just lets outbound eviction do its thing if the criteria are met.

@@ -127,6 +127,12 @@ namespace {
/** Number of outbound peers with m_protect_from_disconnect. */
int g_outbound_peers_with_protect_from_disconnect = 0;

/** When to next check whether our tip is stale. */
int64_t g_tip_stale_check_time = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move these into PeerLogicValidation ?

@sdaftuar
Copy link
Member Author

Closing in favor of #11560

@sdaftuar sdaftuar closed this Oct 26, 2017
laanwj added a commit that referenced this pull request Nov 2, 2017
6262915 Add unit test for stale tip checking (Suhas Daftuar)
83df257 Add CConnmanTest to mutate g_connman in tests (João Barbosa)
ac7b37c Connect to an extra outbound peer if our tip is stale (Suhas Daftuar)
db32a65 Track tip update time and last new block announcement from each peer (Suhas Daftuar)
2d4327d net: Allow connecting to extra outbound peers (Suhas Daftuar)

Pull request description:

  This is an alternative approach to #11534.  Rather than disconnect an outbound peer when our tip looks stale, instead try to connect to an additional outbound peer.

  Periodically, check to see if we have more outbound peers than we target (ie if any extra peers are in use), and if so, disconnect the one that least recently announced a new block (breaking ties by choosing the newest peer that we connected to).

Tree-SHA512: 8f19e910e0bb36867f81783e020af225f356451899adfc7ade1895d6d3bd5afe51c83759610dfd10c62090c4fe404efa0283b2f63fde0bd7da898a1aaa7fb281
codablock pushed a commit to codablock/dash that referenced this pull request Sep 26, 2019
6262915 Add unit test for stale tip checking (Suhas Daftuar)
83df257 Add CConnmanTest to mutate g_connman in tests (João Barbosa)
ac7b37c Connect to an extra outbound peer if our tip is stale (Suhas Daftuar)
db32a65 Track tip update time and last new block announcement from each peer (Suhas Daftuar)
2d4327d net: Allow connecting to extra outbound peers (Suhas Daftuar)

Pull request description:

  This is an alternative approach to bitcoin#11534.  Rather than disconnect an outbound peer when our tip looks stale, instead try to connect to an additional outbound peer.

  Periodically, check to see if we have more outbound peers than we target (ie if any extra peers are in use), and if so, disconnect the one that least recently announced a new block (breaking ties by choosing the newest peer that we connected to).

Tree-SHA512: 8f19e910e0bb36867f81783e020af225f356451899adfc7ade1895d6d3bd5afe51c83759610dfd10c62090c4fe404efa0283b2f63fde0bd7da898a1aaa7fb281
codablock pushed a commit to codablock/dash that referenced this pull request Sep 29, 2019
6262915 Add unit test for stale tip checking (Suhas Daftuar)
83df257 Add CConnmanTest to mutate g_connman in tests (João Barbosa)
ac7b37c Connect to an extra outbound peer if our tip is stale (Suhas Daftuar)
db32a65 Track tip update time and last new block announcement from each peer (Suhas Daftuar)
2d4327d net: Allow connecting to extra outbound peers (Suhas Daftuar)

Pull request description:

  This is an alternative approach to bitcoin#11534.  Rather than disconnect an outbound peer when our tip looks stale, instead try to connect to an additional outbound peer.

  Periodically, check to see if we have more outbound peers than we target (ie if any extra peers are in use), and if so, disconnect the one that least recently announced a new block (breaking ties by choosing the newest peer that we connected to).

Tree-SHA512: 8f19e910e0bb36867f81783e020af225f356451899adfc7ade1895d6d3bd5afe51c83759610dfd10c62090c4fe404efa0283b2f63fde0bd7da898a1aaa7fb281
barrystyle pushed a commit to PACGlobalOfficial/PAC that referenced this pull request Jan 22, 2020
6262915 Add unit test for stale tip checking (Suhas Daftuar)
83df257 Add CConnmanTest to mutate g_connman in tests (João Barbosa)
ac7b37c Connect to an extra outbound peer if our tip is stale (Suhas Daftuar)
db32a65 Track tip update time and last new block announcement from each peer (Suhas Daftuar)
2d4327d net: Allow connecting to extra outbound peers (Suhas Daftuar)

Pull request description:

  This is an alternative approach to bitcoin#11534.  Rather than disconnect an outbound peer when our tip looks stale, instead try to connect to an additional outbound peer.

  Periodically, check to see if we have more outbound peers than we target (ie if any extra peers are in use), and if so, disconnect the one that least recently announced a new block (breaking ties by choosing the newest peer that we connected to).

Tree-SHA512: 8f19e910e0bb36867f81783e020af225f356451899adfc7ade1895d6d3bd5afe51c83759610dfd10c62090c4fe404efa0283b2f63fde0bd7da898a1aaa7fb281
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants