-
Notifications
You must be signed in to change notification settings - Fork 37.7k
refactor: Header sync optimisations & simplifications #32740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32740. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Maybe change the title to describe what's actually going on for those of us who don't memorise every pr number? :) ("Header sync optimisations/simplifications" ?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/headerssync.h
Outdated
@@ -140,33 +140,30 @@ class HeadersSyncState { | |||
|
|||
/** Result data structure for ProcessNextHeaders. */ | |||
struct ProcessingResult { | |||
std::vector<CBlockHeader> pow_validated_headers; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not thrilled by increasing the number of in/out parameters - it pushes readers towards having to lean on comments more, rather than having the code somewhat self-documenting. It also makes the unit tests more cumbersome. What makes the change somewhat acceptable is that it conforms to the style of net_processing.cpp.
If we keep this, I would also suggest further simplification of ProcessingResult
through converting it to an enum: f2a6179
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for dropping the commit introducing the in/out parameter!
13e123e
to
7634d81
Compare
@hodlinator thank you for your review! I like the idea of converting As for Latest push:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to review this - can you tell me how to measure the effect of the "optimizations" using either the microbenchmarks or a reindex or anything that shows the magnitude of change?
Would be helpful if you could publish your measurements if you have any.
7634d81
to
f4b33a1
Compare
Last push:
I don't have any, working on it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, I'm not sure about the first change but the rest make sense.
reviewed f4b33a1, left some comments for most commits, my main concern is the commit where we're reusing the headers vector, making the code significantly more contrived.
It would also be great if we could extract the common parts of this and @hodlinator's change and merge that (or his change) before we do this.
@@ -4365,7 +4365,7 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type, | |||
MaybeSendGetHeaders(pfrom, GetLocator(m_chainman.m_best_header), *peer); | |||
} | |||
return; | |||
} else if (prev_block->nChainWork + CalculateClaimedHeadersWork({{cmpctblock.header}}) < GetAntiDoSWorkThreshold()) { | |||
} else if (prev_block->nChainWork + GetBlockProof(cmpctblock.header) < GetAntiDoSWorkThreshold()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
After some thinking, I decided to drop the first commit, and let ProcessNextHeaders take the headers argument and return it, instead of using it as a I/O argument. I tried to write a benchmark to measure how much using the vector as I/O argument would optimize the code, but it convinced me that the new interface would be too hard to work with, as reviewers pointed out. While it might still be a performance improvement, I'm not sure if it's really worth it. For anyone interested, here's my attempt at writing a benchmark: 852aca5 and my branch Will update the PR in the next few days :) |
4d95119
to
f33bd2f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my last push (f33bd2f):
- Removed first commit as per my last comment (#32740 (comment))
- Addressed review comments
src/validation.h
Outdated
@@ -394,7 +394,7 @@ bool TestBlockValidity(BlockValidationState& state, | |||
bool fCheckMerkleRoot = true) EXCLUSIVE_LOCKS_REQUIRED(cs_main); | |||
|
|||
/** Check with the proof of work on each blockheader matches the value in nBits */ | |||
bool HasValidProofOfWork(const std::vector<CBlockHeader>& headers, const Consensus::Params& consensusParams); | |||
bool HasValidProofOfWork(const std::span<const CBlockHeader> headers, const Consensus::Params& consensusParams); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes a lot of sense, thank you!
I only removed the const, I prefer to keep the argument names
src/validation.cpp
Outdated
@@ -4159,9 +4159,9 @@ std::vector<unsigned char> ChainstateManager::GenerateCoinbaseCommitment(CBlock& | |||
return commitment; | |||
} | |||
|
|||
bool HasValidProofOfWork(const std::vector<CBlockHeader>& headers, const Consensus::Params& consensusParams) | |||
bool HasValidProofOfWork(const std::span<const CBlockHeader> headers, const Consensus::Params& consensusParams) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I removed the const and put this change in a separate commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code review ACK f33bd2f
nit: the PR isn't really an optimization anymore, consider updating the title
@@ -4149,8 +4149,8 @@ std::vector<unsigned char> ChainstateManager::GenerateCoinbaseCommitment(CBlock& | |||
|
|||
bool HasValidProofOfWork(const std::vector<CBlockHeader>& headers, const Consensus::Params& consensusParams) | |||
{ | |||
return std::all_of(headers.cbegin(), headers.cend(), | |||
[&](const auto& header) { return CheckProofOfWork(header.GetHash(), header.nBits, consensusParams);}); | |||
return std::ranges::all_of(headers, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this change could either be merged with the vector-to-std::span or moved over to that commit, it's not needed in 80c63c9
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see it belonging in either commit, as one modifies the container and the other modifies what operators we apply to it.
Edit after #32740 (comment) below: Now realize I was mixing up CheckProofOfWork
with GetBlockProof
- we are still applying the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it fits slightly better with the span change... since in c++20 span doesn't have a cbegin
method, we needed to change the initial code, and update to ranges::all_of
@@ -101,18 +101,6 @@ class CBlock : public CBlockHeader | |||
m_checked_merkle_root = false; | |||
} | |||
|
|||
CBlockHeader GetBlockHeader() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
src/headerssync.cpp
Outdated
m_current_chain_work(chain_start->nChainWork), | ||
m_last_header_received(m_chain_start->GetBlockHeader()), | ||
m_current_height(chain_start->nHeight) | ||
const CBlockIndex& chain_start, const arith_uint256& minimum_required_work) : m_commit_offset(FastRandomContext().randrange<unsigned>(HEADER_COMMITMENT_PERIOD)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: if you touch again, please format this differently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree using style much closer to the original would be very welcome. Listing the member initializers at the indentation level where the function arguments end makes potential comments to the right of those arguments disappear out of sight etc.
Would also be nice to use more common style for CompressedHeader
but at least there the initializer list didn't exist before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I didn't like the initial formatting either, but I thought it was common (looking at other constructors in the repository, no, it's not...)
I updated the formatting both in HeadersSyncState
and CompressedHeaders
!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Still have a slight preference for using the original way of formatting the HeadersSyncState
initializer-list for optimal git blame
integrity, but this is tolerable compared to the previous push, I see it matches PeerManagerImpl
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would still keep the reformatting as clang-format
doesn't like the one on master, and reformats it as I previously did (with the member initializers pushed to the right). At least this reformat is nice and clang-format
doesn't try to modify it.
I hadn't considered git blame
though, I don't love the idea of polluting it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, so clang-format
was pushing for it... seems it could do with some reconfiguration in the initializer-list aspect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After #32813 we could customize the list formatting as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed f33bd2f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooops, not sure what happened there, thanks for noticing!
src/headerssync.h
Outdated
@@ -140,33 +140,30 @@ class HeadersSyncState { | |||
|
|||
/** Result data structure for ProcessNextHeaders. */ | |||
struct ProcessingResult { | |||
std::vector<CBlockHeader> pow_validated_headers; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for dropping the commit introducing the in/out parameter!
@@ -4149,8 +4149,8 @@ std::vector<unsigned char> ChainstateManager::GenerateCoinbaseCommitment(CBlock& | |||
|
|||
bool HasValidProofOfWork(const std::vector<CBlockHeader>& headers, const Consensus::Params& consensusParams) | |||
{ | |||
return std::all_of(headers.cbegin(), headers.cend(), | |||
[&](const auto& header) { return CheckProofOfWork(header.GetHash(), header.nBits, consensusParams);}); | |||
return std::ranges::all_of(headers, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see it belonging in either commit, as one modifies the container and the other modifies what operators we apply to it.
Edit after #32740 (comment) below: Now realize I was mixing up CheckProofOfWork
with GetBlockProof
- we are still applying the same thing.
src/headerssync.cpp
Outdated
m_current_chain_work(chain_start->nChainWork), | ||
m_last_header_received(m_chain_start->GetBlockHeader()), | ||
m_current_height(chain_start->nHeight) | ||
const CBlockIndex& chain_start, const arith_uint256& minimum_required_work) : m_commit_offset(FastRandomContext().randrange<unsigned>(HEADER_COMMITMENT_PERIOD)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree using style much closer to the original would be very welcome. Listing the member initializers at the indentation level where the function arguments end makes potential comments to the right of those arguments disappear out of sight etc.
Would also be nice to use more common style for CompressedHeader
but at least there the initializer list didn't exist before.
src/validation.h
Outdated
@@ -393,7 +393,7 @@ bool TestBlockValidity(BlockValidationState& state, | |||
bool fCheckMerkleRoot = true) EXCLUSIVE_LOCKS_REQUIRED(cs_main); | |||
|
|||
/** Check with the proof of work on each blockheader matches the value in nBits */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Could fix typo as suggested by LLM (#32740 (comment)).
/** Check with the proof of work on each blockheader matches the value in nBits */ | |
/** Check that the proof of work on each blockheader matches the value in nBits */ |
Just don't call this function when it won't have any effect.
Avoid the need to construct a CBlockIndex object just to compute work for a header, when its nBits value suffices for that. Co-Authored-By: Pieter Wuille <pieter@wuille.net>
Co-Authored-By: Pieter Wuille <pieter@wuille.net>
There is no need for a function to convert a CBlock to a CBlockHeader, as it's a child class of it.
No need to pass consensusParams, as CheckHeadersPow already has access to m_chainparams.GetConsensus() Co-Authored-By: maflcko <6399679+maflcko@users.noreply.github.com>
Co-Authored-By: Aurèle Oulès <aurele@oules.com>
Also mark GetFullHeader as const Co-Authored-By: Aurèle Oulès <aurele@oules.com>
chain_start can never be null, so it's better to pass it as a reference rather than a raw pointer Co-Authored-By: maflcko <6399679+maflcko@users.noreply.github.com>
f33bd2f
to
50e61f4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reviews!
In my last push:
- Rebased onto the right commit 😅 (see #32740 (comment))
- Addressed review comments
@@ -4149,8 +4149,8 @@ std::vector<unsigned char> ChainstateManager::GenerateCoinbaseCommitment(CBlock& | |||
|
|||
bool HasValidProofOfWork(const std::vector<CBlockHeader>& headers, const Consensus::Params& consensusParams) | |||
{ | |||
return std::all_of(headers.cbegin(), headers.cend(), | |||
[&](const auto& header) { return CheckProofOfWork(header.GetHash(), header.nBits, consensusParams);}); | |||
return std::ranges::all_of(headers, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it fits slightly better with the span change... since in c++20 span doesn't have a cbegin
method, we needed to change the initial code, and update to ranges::all_of
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooops, not sure what happened there, thanks for noticing!
src/headerssync.cpp
Outdated
m_current_chain_work(chain_start->nChainWork), | ||
m_last_header_received(m_chain_start->GetBlockHeader()), | ||
m_current_height(chain_start->nHeight) | ||
const CBlockIndex& chain_start, const arith_uint256& minimum_required_work) : m_commit_offset(FastRandomContext().randrange<unsigned>(HEADER_COMMITMENT_PERIOD)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I didn't like the initial formatting either, but I thought it was common (looking at other constructors in the repository, no, it's not...)
I updated the formatting both in HeadersSyncState
and CompressedHeaders
!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 50e61f4
PR improves the code in several minor ways, such as:
- Computing PoW for a
CBlockHeader
(orCBlock
) without having to jump through temporarily constructedCBlockIndex
(552b9c3). - Removing
CBlock::GetBlockHeader()
leaves less room for forgetting to properly set fields in the newCBlockHeader
instance (67aa62c). - Changes
const CBlockIndex* HeadersSyncState::m_chain_start
to a reference as it is never null and never changed (50e61f4).
src/headerssync.cpp
Outdated
m_current_chain_work(chain_start->nChainWork), | ||
m_last_header_received(m_chain_start->GetBlockHeader()), | ||
m_current_height(chain_start->nHeight) | ||
const CBlockIndex& chain_start, const arith_uint256& minimum_required_work) : m_commit_offset(FastRandomContext().randrange<unsigned>(HEADER_COMMITMENT_PERIOD)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Still have a slight preference for using the original way of formatting the HeadersSyncState
initializer-list for optimal git blame
integrity, but this is tolerable compared to the previous push, I see it matches PeerManagerImpl
).
recrACK 50e61f4 I only left nits, I'm fine with merging it as is. |
if (IsAncestorOfBestHeaderOrTip(last_received_header)) { | ||
already_validated_work = true; | ||
} | ||
already_validated_work = already_validated_work || IsAncestorOfBestHeaderOrTip(last_received_header); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: if you touch again, you could explain in the commit message why the LookupBlockIndex
cannot be eliminated as well
@@ -31,16 +31,17 @@ struct CompressedHeader { | |||
hashMerkleRoot.SetNull(); | |||
} | |||
|
|||
CompressedHeader(const CBlockHeader& header) | |||
explicit CompressedHeader(const CBlockHeader& header) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: if touching again, the commit message should probably mention the explicit
as well. No that important, just noticed it.
@@ -41,7 +44,7 @@ HeadersSyncState::HeadersSyncState(NodeId id, const Consensus::Params& consensus | |||
// exceeds this bound, because it's not possible for a consensus-valid | |||
// chain to be longer than this (at the current time -- in the future we | |||
// could try again, if necessary, to sync a longer chain). | |||
m_max_commitments = 6*(Ticks<std::chrono::seconds>(NodeClock::now() - NodeSeconds{std::chrono::seconds{chain_start->GetMedianTimePast()}}) + MAX_FUTURE_BLOCK_TIME) / HEADER_COMMITMENT_PERIOD; | |||
m_max_commitments = 6 * (Ticks<std::chrono::seconds>(NodeClock::now() - NodeSeconds{std::chrono::seconds{chain_start.GetMedianTimePast()}}) + MAX_FUTURE_BLOCK_TIME) / HEADER_COMMITMENT_PERIOD; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: as mentioned this should be synchronized with the other PR to simplify the merge conflicts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to put this in draft and wait for #32579 to be merged first... This PR is smaller, and probably easier to rebase. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally would let the maintainers decide the order, ank keep both open
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't mind the order much, especially now that it seems like neither will be in v30. Thanks for asking!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another super-nit: CheckHeadersPow
-> CheckHeadersPoW
in commit message. Only change if you modify the PR for any other reason.
This is a partial* revival of #25968
It contains a list of most-unrelated simplifications and optimizations to the code merged in #25717:
It also contains the following code cleanups, which were suggested by reviewers in #25968:
*I decided to leave out three commits that were in #25968 (4e7ac7b, ab52fb4, 7f1cf44), since they're a bit more involved, and I'm a new contributor. If this PR gets merged, I'll comment under #25968 to note that these three commits are still up for grabs :)