Skip to content

Conversation

sdaftuar
Copy link
Member

Change getheaders messages so that we wait up to 2 minutes for a response to a prior getheaders message before issuing a new one.

Also change the handling of the getheaders message sent in response to a block INV, so that we no longer use the hashstop variable (including the hash stop will just mean that if our peer's headers chain is longer, then we won't learn
it, so there's no benefit to using hashstop).

Also, now respond to a getheaders during IBD with an empty headers message (rather than nothing) -- this better conforms to the intent of the new logic that it's better to not ignore a peer's getheaders message, even if you have nothing to give. This also avoids a lot of functional tests breaking.

This PR also reworks the headers processing logic to make it more readable.

@fanquake fanquake added the P2P label Jun 22, 2022
@sdaftuar
Copy link
Member Author

@ajtowns This should fix the issue you've seen in functional tests, where generating a bunch of blocks at once can result in O(n^2) headers downloads.

@sdaftuar sdaftuar force-pushed the 2022-06-single-getheaders branch from ccda6c1 to f757d2a Compare June 23, 2022 02:00
@DrahtBot
Copy link
Contributor

DrahtBot commented Jun 23, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #25515 ([draft] PeerManager unit tests by dergoegge)
  • #25514 (net processing: Move CNode::nServices and CNode::nLocalServices to Peer by dergoegge)
  • #25268 (refactor: Introduce EvictionManager by dergoegge)
  • #25203 (logging: update to severity-based logging by jonatack)
  • #24571 (p2p: Prevent block index fingerprinting by sending additional getheaders messages by dergoegge)
  • #24008 (assumeutxo: net_processing changes by jamesob)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@laanwj
Copy link
Member

laanwj commented Jun 23, 2022

Concept ACK, great catch

Copy link
Member

@dergoegge dergoegge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

@sdaftuar sdaftuar force-pushed the 2022-06-single-getheaders branch from f757d2a to f2e3f6d Compare June 24, 2022 12:17
Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

@sdaftuar sdaftuar force-pushed the 2022-06-single-getheaders branch from f2e3f6d to af850df Compare June 24, 2022 17:07
@sdaftuar sdaftuar force-pushed the 2022-06-single-getheaders branch from af850df to f54b86a Compare June 27, 2022 12:48
@sdaftuar sdaftuar force-pushed the 2022-06-single-getheaders branch from f54b86a to e3b2185 Compare June 27, 2022 21:08
@sdaftuar sdaftuar force-pushed the 2022-06-single-getheaders branch 3 times, most recently from 97ced75 to 7f75eba Compare June 28, 2022 19:11
sdaftuar added 5 commits June 28, 2022 15:53
Also moves the call to happen directly after validation of a headers message
(rather than mixed in with other state updates for the peer), and removes an
incorrect comment in favor of one that explains why headers sync must continue
from the last header a peer has sent.
Copy link
Member

@sipa sipa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 99f4785

Nits are very non-blocking

* We'll send a getheaders message in response to try to connect the chain.
*
* The peer can send up to MAX_UNCONNECTING_HEADERS in a row that
* don't connect before given DoS points.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, typo: "before being given DoS points" ?

Copy link
Member

@maflcko maflcko Jul 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done as part of #25555

@@ -82,7 +82,7 @@ def run_test(self):
msg.hashstop = 0
peer.send_and_ping(msg)
time.sleep(5)
assert "headers" not in peer.last_message
assert ("headers" not in peer.last_message or len(peer.last_message["headers"].headers) == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: braces are unnecessary, I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done as part of 8efa73e

@dergoegge
Copy link
Member

Code review ACK 99f4785

@maflcko maflcko changed the title p2p, refactor: Avoid multiple getheaders messages in flight to the same peer p2p: Avoid multiple getheaders messages in flight to the same peer Jul 4, 2022
Copy link
Contributor

@w0xlt w0xlt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tACK 99f4785 Good improvement in the code.

Tested on signet.

@mzumsande
Copy link
Contributor

Code Review ACK 99f4785

@fanquake fanquake merged commit 87d0123 into bitcoin:master Jul 4, 2022
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jul 4, 2022
Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some questions

// could be benign.
HandleFewUnconnectingHeaders(pfrom, peer, headers);
} else {
Misbehaving(peer, 10, "invalid header received");
Copy link
Member

@maflcko maflcko Jul 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7f24508 It might be good to mention in the commit message that you are changing the points in this "move" commit.

Previously a non-connecting non-continuous header sequence of size MAX_BLOCKS_TO_ANNOUNCE+1 was assigned 20. Now it is assigned 10, no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still gets a score of 20 via HandleFewUnconnectingHeaders() when nCount <= MAX_BLOCKS_TO_ANNOUNCE; this adds an unavoidable score of 10 for unconnecting headers when nCount > MAX_BLOCKS_TO_ANNOUNCE which previously was more or less ignored...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously a non-connecting non-continuous header sequence of size MAX_UNCONNECTING_HEADERS would be submitted to validation through ProcessNewBlockHeaders(), since the if (!m_chainman.m_blockman.LookupBlockIndex(headers[0].hashPrevBlock) && nCount <= MAX_BLOCKS_TO_ANNOUNCE) block would be skipped.

That would return BlockValidationResult::BLOCK_MISSING_PREV:

bitcoin/src/validation.cpp

Lines 3590 to 3593 in 691a087

if (mi == m_blockman.m_block_index.end()) {
LogPrint(BCLog::VALIDATION, "%s: %s prev block not found\n", __func__, hash.ToString());
return state.Invalid(BlockValidationResult::BLOCK_MISSING_PREV, "prev-blk-not-found");
}

which would result in 10 misbehaviour points in MaybePunishNodeForBlock

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this adds an unavoidable score of 10 for unconnecting headers when nCount > MAX_BLOCKS_TO_ANNOUNCE which previously was more or less ignored...

Can you explain a bit more why this was "more or less ignored"?

Previously it was deterministically and consistently assigned a Misbehaving(peer, 20, "non-continuous headers sequence");.

Now it is deterministically and consistently assigned a Misbehaving(peer, 10, "invalid header received");.

Previously a non-connecting non-continuous header sequence of size MAX_UNCONNECTING_HEADERS would be submitted to validation through ProcessNewBlockHeaders()

No, it wouldn't, as it would be rejected by the Misbehaving(peer, 20, "non-continuous headers sequence"); check?

What am I missing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What am I missing?

Nothing. I misread non-continuous as continuous. You're right that this would be rejected with Misbehaving(peer, 20, "non-continuous headers sequence")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what to say here other than that yes, there are minor behavior changes in this PR. I don't think any are significant as the DoS points here are pretty arbitrary and when checking for different failures, it shouldn't matter too much which order we do things. I would have been happy to update the commit message as you suggest but now that this is merged there's not much I can do!

It should be the case that any failure before that resulted in DoS points would still result in DoS points now, I think -- if that is not the case, then that would be an oversight to fix. The motivation for making these kinds of minor changes is to make the headers processing easier to reason about, which is also in preparation for another change I'll be proposing to headers sync in the future (which I think will be easier to understand with the headers processing logic that is introduced in this PR).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it wasn't meant as a criticism of 7f24508 more as a nit to keep in mind in the future. I think the code changes in this commit are perfectly fine. Generally I just prefer to split behaviour changes from refactoring changes, or at least explain the behaviour changes in the commit message briefly. Otherwise, what seems obvious to the author is easily missed by reviewers. (I think this discussion is supporting evidence enough).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously a non-connecting non-continuous header sequence of size MAX_BLOCKS_TO_ANNOUNCE+1 was assigned 20. Now it is assigned 10, no?
Previously it was deterministically and consistently assigned a Misbehaving(peer, 20, "non-continuous headers sequence");.

Ah, sorry, I was only thinking of the non-connecting continuous case, not the non-connecting, non-continuous case.

@@ -358,7 +358,7 @@ struct Peer {
std::deque<CInv> m_getdata_requests GUARDED_BY(m_getdata_requests_mutex);

/** Time of the last getheaders message to this peer */
std::atomic<std::chrono::seconds> m_last_getheaders_timestamp{0s};
std::atomic<NodeClock::time_point> m_last_getheaders_timestamp{NodeSeconds{}};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there even a need to make this atomic? In a single threaded environment this is not needed. And if the same peer is served by several threads (for whatever reason) in the future, an atomic may prevent UB, but doesn't seem sufficient to prevent races logically.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm that is an interesting point. My sense was that we don't want to rely on implicit single-threadedness to reason about data races, so it was safer to just throw this into an atomic and not worry about it.

I also think that having multiple threads servicing the same peer would not really work for a bunch of reasons (including the one you give, about logical errors in code like this). I would imagine that multithreading network handling in the future would involve different threads servicing different peers.

I think the most likely form of UB from data races would be if we were to expose this value via RPC in the future. So to make this future-proof, I'd prefer that we leave this as-is; it doesn't seem like the performance hit of having an atomic here (which is rarely accessed, as getheaders messages are not a frequent occurrence, and there are far more effective ways to CPU DoS a node than to trigger a getheaders) is significant enough to warrant making this harder to reason about in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if this was ever exposed on RPC, we'd also want to think which other fields to return atomically in the response, in which case a per-field atomic isn't enough to prevent logic races. So to me it seems slightly preferable to leave the design of thread safety to when it is needed. Otherwise it may be assumed in the future that this is already perfectly thread safe and can be used as is. Leaving this plain (without atomic) would at least have a thread sanitizer failure hint that the design needs to be re-thought if this was accessed by more than one thread in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Other than that, agree that atomic doesn't affect performance)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know, I think you're right that this atomic is incorrect. We need to hold a lock throughout MaybeSendGetHeaders() because this variable can be accessed in the scheduler thread or in the net processing thread, and if we don't hold a lock throughout that function we can get a data race. Thanks for asking about this, I'll open a PR to fix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"in-thread" means running in the thread "ThreadMessageHandler (b-msghand): Application level message handling (sending and receiving). Almost all net_processing and validation logic runs on this thread."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, m_last_getheaders_timestamp is accessed from PeerManagerImpl::MaybeSendGetHeaders() and PeerManagerImpl::ProcessMessage() which are only called from a single thread ThreadMessageHandler().

because this variable can be accessed in the scheduler thread or in the net processing thread

@sdaftuar, I think it is not accessed from the scheduler thread?

I am in favor of simplicity, in this case meaning to not protect data that is accessed by a single thread. If in the future it is accessed by another thread, then an appropriate protection should be added. That may be an atomic or a mutex just for that variable or a mutex for multiple variables depending on the need. Otherwise we may end up protecting it needlessly now and later to have to rework the protection anyway (or worse - use the available, inappropriate protection, like @MarcoFalke mentioned above "...it may be assumed in the future that this is already perfectly thread safe...").

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate a bit on this? Unless I am missing something MaybeSendGetHeaders/ConsiderEviction is only called in-thread, not to be confused with CheckForStaleTipAndEvictPeers, which is called out-of-thread. What am I missing?

Ah, I guess I misremembered how ConsiderEviction is called (and thought it was run in the scheduler thread).

I'll update my other PR and drop the mutex/atomic altogether.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in favor of simplicity, in this case meaning to not protect data that is accessed by a single thread.

Possible idea for the future. In theory could have SingleThread<T> class similar to the Synced<T> class from #25390 that wraps an object and asserts in debug mode that is always accessed from the same thread. I think a solution like that would be overkill here, and I'm struggling to think of cases where it wouldn't be overkill, but it wouldn't be hard to implement and could be useful to help code document assumptions it is making.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be resolved by 8efa73e

maflcko pushed a commit that referenced this pull request Jul 14, 2022
613e221 test: remove unnecessary parens (Suhas Daftuar)
e939cf2 Remove atomic for m_last_getheaders_timestamp (Suhas Daftuar)

Pull request description:

  Eliminate the unnecessary atomic guarding `m_last_getheaders_timestamp`, which is only accessed in a single thread (thanks to MarcoFalke for pointing this out).

  Also address a nit that came up in #25454.

ACKs for top commit:
  MarcoFalke:
    review ACK 613e221
  vasild:
    ACK 613e221

Tree-SHA512: 6d6c473735b450b8ad43aae5cf16ed419154d72f4a05c0a6ce6f26caecab9db2361041398b70bf9395611c107d50897f501fa5fdbabb2891144bbc2b479dfdad
// nUnconnectingHeaders gets reset back to 0.
if (!m_chainman.m_blockman.LookupBlockIndex(headers[0].hashPrevBlock) && nCount <= MAX_BLOCKS_TO_ANNOUNCE) {
nodestate->nUnconnectingHeaders++;
m_connman.PushMessage(&pfrom, msgMaker.Make(NetMsgType::GETHEADERS, m_chainman.ActiveChain().GetLocator(m_chainman.m_best_header), uint256()));
Copy link
Member

@Sjors Sjors Aug 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

msgMaker is now unused in ProcessHeadersMessage. This is fixed in #25717

@hebasto
Copy link
Member

hebasto commented Mar 30, 2023

msgMaker is now unused in ProcessHeadersMessage.

Same for PeerManagerImpl::HandleFewUnconnectingHeaders() and PeerManagerImpl::ConsiderEviction().
See #27368.

fanquake added a commit to bitcoin-core/gui that referenced this pull request Mar 30, 2023
…ker` instances

ea7ec78 refactor: Drop no longer used `CNetMsgMaker` instances (Hennadii Stepanov)

Pull request description:

  The removed lines have been unused since the bitcoin/bitcoin@abf5d16 commit from bitcoin/bitcoin#25454.

ACKs for top commit:
  dergoegge:
    utACK ea7ec78
  Sjors:
    ACK ea7ec78
  TheCharlatan:
    ACK ea7ec78

Tree-SHA512: 9a2a9ff3f124b68a8cd20a637e90885096996c3aa354a4d8adbec98f5761e9e826c1c064ccd90aaf6d72beac61dd9e22c8b76d089e18bba6e0ad51e59a9c7df8
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Apr 1, 2023
…tances

ea7ec78 refactor: Drop no longer used `CNetMsgMaker` instances (Hennadii Stepanov)

Pull request description:

  The removed lines have been unused since the bitcoin@abf5d16 commit from bitcoin#25454.

ACKs for top commit:
  dergoegge:
    utACK ea7ec78
  Sjors:
    ACK ea7ec78
  TheCharlatan:
    ACK ea7ec78

Tree-SHA512: 9a2a9ff3f124b68a8cd20a637e90885096996c3aa354a4d8adbec98f5761e9e826c1c064ccd90aaf6d72beac61dd9e22c8b76d089e18bba6e0ad51e59a9c7df8
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Dec 5, 2023
Summary:
This introduces a minor change in the misbehavior score: previously a non-connecting non-continuous header sequence of size MAX_BLOCKS_TO_ANNOUNCE+1 was assigned 20 (`non-continuous headers sequence`), now it is assigned 10 (`invalid header received`).
The behavior is unchanged if the number of headers is <= MAX_BLOCKS_TO_ANNOUNCE.

This is a partial backport of [[bitcoin/bitcoin#25454 | core#25454]]
bitcoin/bitcoin@7f24508

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D14906
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Dec 5, 2023
Summary:
This is a partial backport of [[bitcoin/bitcoin#25454 | core#25454]]
bitcoin/bitcoin@9492e93

Depends on D14906

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D14907
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Dec 5, 2023
Summary:
Also moves the call to happen directly after validation of a headers message
(rather than mixed in with other state updates for the peer), and removes an
incorrect comment in favor of one that explains why headers sync must continue
from the last header a peer has sent.

This is a backport of [[bitcoin/bitcoin#25454 | core#25454]]
bitcoin/bitcoin@bf8ea6d

Depends on D14907

Note that `FetchMoreHeaders` requires the `cs_main` lock to be held because of the use of `ActiveChain()`. This is needed in the local scope because of  the lock splitting in D9591.

Test Plan:
With clang and debug:
`ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D14908
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Dec 5, 2023
Summary:
And move headers direct fetch to end of ProcessHeadersMessage

Review tip:
`git show --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space`

This should show only linter changes due to reduced indentation in the moved code block.

This is a partial backport of [[bitcoin/bitcoin#25454 | core#25454]]
bitcoin/bitcoin@29c4518
bitcoin/bitcoin@2b341db
Depends on D14908

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D14909
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Dec 5, 2023
Summary:
Review hint:
`git show --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space`

This is a partial backport of [[bitcoin/bitcoin#25454 | core#25454]]
bitcoin/bitcoin@6d95cd3

Depends on D14909

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D14910
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Dec 5, 2023
Summary:
This is a partial backport of [[bitcoin/bitcoin#25454 | core#25454]]
bitcoin/bitcoin@ffe87db

Depends on D14910

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D14911
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Dec 6, 2023
Summary:
Change getheaders messages so that we wait up to 2 minutes for a response to a
prior getheaders message before issuing a new one.

Also change the handling of the getheaders message sent in response to a block
INV, so that we no longer use the hashstop variable (including the hash stop
will just mean that if our peer's headers chain is longer, then we won't learn
it, so there's no benefit to using hashstop).

Also, now respond to a getheaders during IBD with an empty headers message
(rather than nothing) -- this better conforms to the intent of the new logic
that it's better to not ignore a peer's getheaders message, even if you have
nothing to give. This also avoids a lot of functional tests breaking.

This concludes backport of [[bitcoin/bitcoin#25454 | core#25454]] and [[bitcoin/bitcoin#25557 | core#25557]]
bitcoin/bitcoin@abf5d16
bitcoin/bitcoin@99f4785
bitcoin/bitcoin@e939cf2

Depends on D14912

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D14913
sipa added a commit to sipa/bitcoin that referenced this pull request Mar 7, 2024
This misbehavior was originally intended to prevent bandwidth wastage due to
very broken (but likely non-malicious) nodes that respond to GETHEADERS with
a response unrelated to the request, triggering a request cycle.

This has however largely been addressed by bitcoin#25454, which limits the number
of (not responded to) GETHEADERS requests in flight to one, with a 2 minute
timeout. With that, the specific misbehavior is largely irrelevant (for
inbound peers, it is now harmless; for outbound peers, the eviction logic
will eventually kick them out if they're not keeping up with headers at all).
sipa added a commit to sipa/bitcoin that referenced this pull request Mar 12, 2024
Since bitcoin#25454 we keep track of the last
GETHEADERS request that was sent and wasn't responded to. So far, every incoming
HEADERS message is treated as a response to whatever GETHEADERS was last sent,
regardless of its contents.

This commit makes this tracking more accurate, by only treating HEADERS messages
which (1) are empty, (2) connect to our existing block header tree, or (3) are a
continuation of a low-work headers sync as responses that clear the "outstanding
GETHEADERS" state (m_last_getheaders_timestamp).

That means that HEADERS messages which do not satisfy any of the above criteria
will be ignored, not triggering a GETHEADERS, and potentially (for now, but see
later commit) increase misbehavior score.
@bitcoin bitcoin locked and limited conversation to collaborators Mar 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.