-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Drop delayed headers logic and fix duplicate initial headers sync by handling block inv correctly #2032
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
// Download if this is a nice peer, or we have no nice peers and this one might do. | ||
bool fFetch = state->fPreferredDownload || (nPreferredDownload == 0 && !pfrom->fOneShot); | ||
// Only actively request headers from a single peer, unless we're close to end of initial download. | ||
if ((nSyncStarted == 0 && fFetch) || pindexBestHeader->GetBlockTime() > GetAdjustedTime() - nMaxTipAge) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand this correct, a block that is being announced while we are not catched up, this blocks header won't be requested from any peer. We assume that the initial sync peer would provide it when it reaches the tip...but what if that peer was a bad choice and he didn't get this block? We'd not see the header of the new block until another block is mined on top of that one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, same as with initial headers sync - we don't care about recent blocks or other peers until we are close to the tip announced by the peer we started syncing from. Once we are close, we start requesting headers from everyone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if the initial sync peer is so bad that it never lets us reach near-tip height? In that case we'll not request headers from other peers until timeout kicks in? The timeout is quite long if I remember correct, so experience in this case won't be optimal, but it should sort itself out after some time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, it's Timeout = base + per_header * (expected number of headers)
where base
is 15 minutes and per_header
is 1 ms per header. Plus, you can also specify custom -maxtipage
if you know there are some huge forks at the time you are trying to sync, so new peers will be picked up quicker.
Reverts "Fix duplicate headers download in initial sync (dashpay#1589)" and all following fixes This reverts commit 169afaf.
b652f6d
to
753ed61
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK
…handling block inv correctly (dashpay#2032) * Drop custom logic for delaying GETHEADERS Reverts "Fix duplicate headers download in initial sync (dashpay#1589)" and all following fixes This reverts commit 169afaf. * Fix duplicate initial headers sync
, bitcoin#24178, bitcoin#24171, bitcoin#25404, bitcoin#25514, bitcoin#25720, partial bitcoin#23832, bitcoin#24169, bitcoin#25454 (headers backports) c92b0f5 merge bitcoin#25720: Reduce bandwidth during initial headers sync when a block is found (Kittywhiskers Van Gogh) 0f9ece0 merge bitcoin#25514: Move CNode::nServices and CNode::nLocalServices to Peer (Kittywhiskers Van Gogh) c9923ca partial bitcoin#25454: Avoid multiple getheaders messages in flight to the same peer (Kittywhiskers Van Gogh) 26d477b revert: Fix duplicate initial headers sync (Kittywhiskers Van Gogh) abccb2d test: drop genesis block from `blockheader_testnet3` (Kittywhiskers Van Gogh) 0574a7d merge bitcoin#25404: Use MAX_BLOCKS_TO_ANNOUNCE consistently (Kittywhiskers Van Gogh) ed871d2 merge bitcoin#24171: Sync chain more readily from inbound peers during IBD (Kittywhiskers Van Gogh) a04290f merge bitcoin#24178: Respond to getheaders if we have sufficient chainwork (Kittywhiskers Van Gogh) bcafa28 merge bitcoin#24909: Move and rename pindexBestHeader, fHavePruned (Kittywhiskers Van Gogh) 70485cb partial bitcoin#24169: Add --enable-c++20 option (Kittywhiskers Van Gogh) 27e885d merge bitcoin#23880: Serialize cmpctblock at most once in NewPoWValidBlock (Kittywhiskers Van Gogh) 9f7ac69 merge bitcoin#24024: Remove cs_main lock annotation from ChainstateManager.m_blockman (Kittywhiskers Van Gogh) 9399f90 partial bitcoin#23832: Changes time variables from int to chrono (Kittywhiskers Van Gogh) Pull request description: ## Additional Information * Dependent on #6085 * Dependency for #6098 * ~~[bitcoin#25514](bitcoin#25514) removes `peers.spvNodeConnections` and `peers.fullNodeConnections` reporting from `CalculateNumConnectionsChangedStats` as the services flags used to distingush between the two have been moved to the `Peer` struct, accessable only through `PeerManager`.~~ ~~As `PeerManager` isn't accessable to `CConnman`, even if a new public function was exposed through `PeerManger` (as we have for `IsInvInFilter` and others or we try to access the value through `GetNodeStateStats`), `CConnman` would be unable to leverage it.~~ Resolved with patch by UdjinM6, thanks! * [bitcoin#23880](bitcoin#23880) introduces code that is not valid C++20 (but valid C++17) and therefore, required a partial backport of [bitcoin#24169](bitcoin#24169) (fae6790) to make the code C++20 legal. * [bitcoin#25454](bitcoin#25454) introduces a 10-point penalty for remitting more than `MAX_BLOCKS_TO_ANNOUNCE` unconnected block headers. `blockheader_testnet3.hex` (introduced in [bitcoin#16551](bitcoin#16551), part of [dash#5963](#5963)), unlike in Bitcoin, includes the genesis block. By definition of a genesis block, there is no block before it that connects to, which causes the 10-point penalty to trip and `p2p_dos_header_tree.py` to fail (see below). This has been remedied by removing the genesis block from the test data to match upstream and also because no node has a good reason to ever broadcast the genesis block as-is over P2P. <details> <summary>Test Failure</summary> ``` dash@6a2649cc721f:/src/dash$ ./test/functional/p2p_dos_header_tree.py 2024-06-17T17:59:35.874000Z TestFramework (INFO): Initializing test directory /tmp/dash_func_test_h5hfluy1 2024-06-17T17:59:36.892000Z TestFramework (INFO): Read headers data 2024-06-17T17:59:36.895000Z TestFramework (INFO): Feed all non-fork headers, including and up to the first checkpoint 2024-06-17T17:59:38.411000Z TestFramework (ERROR): Assertion failed Traceback (most recent call last): File "/src/dash/test/functional/test_framework/test_framework.py", line 158, in main self.run_test() File "./test/functional/p2p_dos_header_tree.py", line 53, in run_test assert { AssertionError 2024-06-17T17:59:38.913000Z TestFramework (INFO): Stopping nodes 2024-06-17T17:59:39.917000Z TestFramework (WARNING): Not cleaning up dir /tmp/dash_func_test_h5hfluy1 2024-06-17T17:59:39.917000Z TestFramework (ERROR): Test failed. Test logging available at /tmp/dash_func_test_h5hfluy1/test_framework.log 2024-06-17T17:59:39.917000Z TestFramework (ERROR): 2024-06-17T17:59:39.917000Z TestFramework (ERROR): Hint: Call /src/dash/test/functional/combine_logs.py '/tmp/dash_func_test_h5hfluy1' to consolidate all logs 2024-06-17T17:59:39.917000Z TestFramework (ERROR): 2024-06-17T17:59:39.917000Z TestFramework (ERROR): If this failure happened unexpectedly or intermittently, please file a bug and provide a link or upload of the combined log. 2024-06-17T17:59:39.917000Z TestFramework (ERROR): https://github.com/dashpay/dash/issues 2024-06-17T17:59:39.917000Z TestFramework (ERROR): ``` </details> * [bitcoin#25454](bitcoin#25454) has a goal similar to [dash#2032](#2032) (and its predecessor, [dash#1589](#1589)), namely, avoiding `getheaders`(`2`) duplication to the same peer. Unfortunately, Dash's mitigation seems to conflict with Bitcoin's mitigation and this results in `feature_minchainwork.py` failing (see below). This has been remedied by partially reverting [dash#2032](#2032). <details> <summary>Test Failure</summary> ``` dash@1bebec413839:/src/dash$ ./test/functional/feature_minchainwork.py 2024-08-01T17:29:41.116000Z TestFramework (INFO): Initializing test directory /tmp/dash_func_test_co8xkx43 2024-08-01T17:29:42.145000Z TestFramework (INFO): Testing relay across node 1 (minChainWork = 101) 2024-08-01T17:29:42.145000Z TestFramework (INFO): Generating 49 blocks on node0 [...] 2024-08-01T17:29:51.707000Z TestFramework (INFO): Generating one more block 2024-08-01T17:29:51.709000Z TestFramework (INFO): Verifying nodes are all synced 2024-08-01T17:30:51.989000Z TestFramework (ERROR): Assertion failed Traceback (most recent call last): File "/src/dash/test/functional/test_framework/test_framework.py", line 159, in main self.run_test() File "./test/functional/feature_minchainwork.py", line 101, in run_test self.sync_all() File "/src/dash/test/functional/test_framework/test_framework.py", line 811, in sync_all self.sync_blocks(nodes) File "/src/dash/test/functional/test_framework/test_framework.py", line 777, in sync_blocks raise AssertionError("Block sync timed out after {}s:{}".format( AssertionError: Block sync timed out after 60s: '00ab061e30aebd2f97153d26cd72f921af896f05c6469ad73c7de4fc283d9590' '00ab061e30aebd2f97153d26cd72f921af896f05c6469ad73c7de4fc283d9590' '000008ca1832a4baf228eb1553c03d3a2c8e02399550dd6ea8d65cec3ef23d2e' 2024-08-01T17:30:52.490000Z TestFramework (INFO): Stopping nodes 2024-08-01T17:30:53.495000Z TestFramework (WARNING): Not cleaning up dir /tmp/dash_func_test_co8xkx43 2024-08-01T17:30:53.495000Z TestFramework (ERROR): Test failed. Test logging available at /tmp/dash_func_test_co8xkx43/test_framework.log 2024-08-01T17:30:53.495000Z TestFramework (ERROR): 2024-08-01T17:30:53.495000Z TestFramework (ERROR): Hint: Call /src/dash/test/functional/combine_logs.py '/tmp/dash_func_test_co8xkx43' to consolidate all logs 2024-08-01T17:30:53.495000Z TestFramework (ERROR): 2024-08-01T17:30:53.495000Z TestFramework (ERROR): If this failure happened unexpectedly or intermittently, please file a bug and provide a link or upload of the combined log. 2024-08-01T17:30:53.495000Z TestFramework (ERROR): https://github.com/dashpay/dash/issues 2024-08-01T17:30:53.495000Z TestFramework (ERROR): ``` </details> * [bitcoin#25720](bitcoin#25720) introduces a new test, `p2p_initial_headers_sync.py`, to validate that when a client has a stale tip, it will only engage in headers sync with one peer (emit a `getheaders2`\* message). Unmodified, this test fails (see below) as while the backport deals with one source of `getheaders2` messages, the test setup unwittingly triggers another ([source](https://github.com/dashpay/dash/blob/2379462294da884d925b5a80acec20fbc360309f/src/net_processing.cpp#L5446-L5448)), specifically, allowing the `pindexBestHeader->GetBlockTime() > GetAdjustedTime() - nMaxTipAge` condition to evaluate `true`. This is because, unlike in Bitcoin test suite's `setup_chain()` ([source](https://github.com/bitcoin/bitcoin/blob/22d96d76ab02fc73e7fe0d810bacee4c982df085/test/functional/test_framework/test_framework.py#L379-L385)), Dash sets the mocktime to match the mock chain ([source](https://github.com/dashpay/dash/blob/2379462294da884d925b5a80acec20fbc360309f/test/functional/test_framework/test_framework.py#L408-L416)) during setup, while the test assumes that the mock chain is stale enough to not trigger this source of `getheaders2` messages. As the tip is barely stale, it emits the `getheaders2` message, which is detected, causing the test to fail. This has been remedied by overriding `setup_chain()` to behave more like Bitcoin's test suite. _\* - `getheaders2` is a Dash-specific message that is courtesy of compressed headers, Bitcoin would be checking for `getheaders`_ <details> <summary>Test Failure</summary> ``` dash@6a2649cc721f:/src/dash$ ./test/functional/p2p_initial_headers_sync.py 2024-06-17T21:24:09.921000Z TestFramework (INFO): Initializing test directory /tmp/dash_func_test_3gypo3ab 2024-06-17T21:24:10.681000Z TestFramework (INFO): Adding a peer to node0 2024-06-17T21:24:11.684000Z TestFramework (INFO): Connecting two more peers to node0 2024-06-17T21:24:13.689000Z TestFramework (INFO): Verify that peer2 and peer3 don't receive a getheaders after connecting 2024-06-17T21:24:15.193000Z TestFramework (ERROR): Assertion failed Traceback (most recent call last): File "/src/dash/test/functional/test_framework/test_framework.py", line 158, in main self.run_test() File "./test/functional/p2p_initial_headers_sync.py", line 60, in run_test assert "getheaders2" not in peer2.last_message AssertionError 2024-06-17T21:24:15.695000Z TestFramework (INFO): Stopping nodes 2024-06-17T21:24:16.698000Z TestFramework (WARNING): Not cleaning up dir /tmp/dash_func_test_3gypo3ab 2024-06-17T21:24:16.698000Z TestFramework (ERROR): Test failed. Test logging available at /tmp/dash_func_test_3gypo3ab/test_framework.log 2024-06-17T21:24:16.699000Z TestFramework (ERROR): 2024-06-17T21:24:16.699000Z TestFramework (ERROR): Hint: Call /src/dash/test/functional/combine_logs.py '/tmp/dash_func_test_3gypo3ab' to consolidate all logs 2024-06-17T21:24:16.699000Z TestFramework (ERROR): 2024-06-17T21:24:16.699000Z TestFramework (ERROR): If this failure happened unexpectedly or intermittently, please file a bug and provide a link or upload of the combined log. 2024-06-17T21:24:16.699000Z TestFramework (ERROR): https://github.com/dashpay/dash/issues 2024-06-17T21:24:16.699000Z TestFramework (ERROR): ``` </details> ## Checklist: - [x] I have performed a self-review of my own code - [x] I have commented my code, particularly in hard-to-understand areas - [x] I have added or updated relevant unit/integration/functional/e2e tests - [x] I have made corresponding changes to the documentation - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_ Top commit has no ACKs. Tree-SHA512: 16b7c42195e197e8b6800c3425b4ff15a124b0e3e0da5c98ca6e22486b52c4ea82f2f05b83e28e838860b1ce76daa1e435c5eae4fb7591a161f26a5fff6189cc
See individual commits. In short, instead of delaying
getheaders
initiated by block invs, we simply ignore other peers if we already started headers sync from one and we are still far from being fully synced (i.e. the same logic as forinitial-headers-sync
inSendMessages
). If the peer we are syncing from fails to deliver (recent) headers, it will be disconnected later on timeout just like it normally would, we don't need to do anything special about it here. I'm also replacing6 * 60 * 60
withnMaxTipAge
in few places to make headers sync process controllable via-maxtipage
cmd-line param (IBD already usesnMaxTipAge
btw). You can see 2 tests adjusted to use it (they rely on either having very long chain or very old blocks).Note: This includes #2031 to actually test sync logic, will rebase to remove it after #2031 is merged.