-
Notifications
You must be signed in to change notification settings - Fork 37.7k
rpc, logging: return "verificationprogress" of 1 when up to date #31177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31177. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsNo conflicts as of last run. |
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, some ideas to encourage review:
- make sure your proposal compiles in your local environment, with passing unit and functional tests, so updating all the callers of
GuessVerificationProgress
. This makes it easier for reviewers to build and test your code and see if any test coverage needs to be updated or possibly added. Some reviewers may also wait until your pull has a green CI as enough proof of work before reviewing it more closely. - organize your commits into final form
- each separate commit needs to build cleanly with passing tests
- you might need a release note (see #31135)
- the debug logging is now changed as well, so perhaps prefix the PR title with
rpc, logging:
This article might be helpful.
3c9e3f1
to
95e6ae1
Compare
This comment was marked as outdated.
This comment was marked as outdated.
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
f49f8a4
to
22579a5
Compare
22579a5
to
6d877a1
Compare
@sipa and @laanwj pinging you here as this PR is based on your comments on #31135 This has been tested on mainnet and seems to be working with the solution proposed:
|
Thanks for updating -- will review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach ACK
src/validation.cpp
Outdated
//! require cs_main if pindex has not been validated yet (because m_chain_tx_count might be unset) | ||
double ChainstateManager::GuessVerificationProgress(const CBlockIndex* pindex) const | ||
{ | ||
const ChainTxData& data{GetParams().TxData()}; | ||
LOCK(::cs_main); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May I suggest the following cleanup and locking improvement:
-
Use Clang thread safety annotation with
GuessVerificationProgress
to ensure ::cs_main is held by callers -
Drop unneeded lock already held by most callers
-
Remove redundant and out of date doxygen; the convention is to document the declaration
-
Move
const ChainTxData& data{GetParams().TxData()};
to where it is used
suggested diff
diff --git a/src/node/interfaces.cpp b/src/node/interfaces.cpp
index 73ce927f712..46fe9e10b0f 100644
--- a/src/node/interfaces.cpp
+++ b/src/node/interfaces.cpp
@@ -325,7 +325,8 @@ public:
}
double getVerificationProgress() override
{
- return chainman().GuessVerificationProgress(WITH_LOCK(chainman().GetMutex(), return chainman().ActiveChain().Tip()));
+ LOCK(chainman().GetMutex());
+ return chainman().GuessVerificationProgress(chainman().ActiveChain().Tip());
}
bool isInitialBlockDownload() override
{
@@ -409,7 +410,7 @@ public:
{
return MakeSignalHandler(::uiInterface.NotifyBlockTip_connect([fn, this](SynchronizationState sync_state, const CBlockIndex* block) {
fn(sync_state, BlockTip{block->nHeight, block->GetBlockTime(), block->GetBlockHash()},
- chainman().GuessVerificationProgress(block));
+ WITH_LOCK(chainman().GetMutex(), return chainman().GuessVerificationProgress(block)));
}));
}
std::unique_ptr<Handler> handleNotifyHeaderTip(NotifyHeaderTipFn fn) override
diff --git a/src/validation.cpp b/src/validation.cpp
index 5dca359aedf..54e85e07a6d 100644
--- a/src/validation.cpp
+++ b/src/validation.cpp
@@ -5613,12 +5613,8 @@ bool Chainstate::ResizeCoinsCaches(size_t coinstip_size, size_t coinsdb_size)
return ret;
}
-//! Guess how far we are in the verification process at the given block index and the best headers chain
-//! require cs_main if pindex has not been validated yet (because m_chain_tx_count might be unset)
double ChainstateManager::GuessVerificationProgress(const CBlockIndex* pindex) const
{
- const ChainTxData& data{GetParams().TxData()};
- LOCK(::cs_main);
if (pindex == nullptr) {
return 0.0;
}
@@ -5638,7 +5634,7 @@ double ChainstateManager::GuessVerificationProgress(const CBlockIndex* pindex) c
}
double fTxTotal;
-
+ const ChainTxData& data{GetParams().TxData()};
if (pindex->m_chain_tx_count <= data.tx_count) {
fTxTotal = data.tx_count + (end_of_chain_timestamp - data.nTime) * data.dTxRate;
} else {
diff --git a/src/validation.h b/src/validation.h
index 9e4fdbe6809..2223dffc249 100644
--- a/src/validation.h
+++ b/src/validation.h
@@ -1149,7 +1149,7 @@ public:
bool IsInitialBlockDownload() const;
/** Guess verification progress (as a fraction between 0.0=genesis and 1.0=current tip). */
- double GuessVerificationProgress(const CBlockIndex* pindex) const;
+ double GuessVerificationProgress(const CBlockIndex* pindex) const EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
/**
* Import blocks from an external file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for reviewing. Suggestions applied here ef15351.
Will squash once other reviews are finished.
src/validation.cpp
Outdated
int64_t end_of_chain_timestamp = pindex->GetBlockTime(); | ||
|
||
if (m_best_header && m_best_header->nChainWork > pindex->nChainWork) { | ||
int64_t header_age = time(nullptr) - m_best_header->GetBlockTime(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use our built-in time functionality in util/time.h
, which is type-safe (using the standard library's chrono types) and mockable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reviewing.
I'm not sure about using it as requested. Is this what you're referring to? 317ddca
@@ -4732,7 +4732,6 @@ bool Chainstate::LoadChainTip() | |||
// Ignoring return value for now. | |||
(void)m_chainman.GetNotifications().blockTip(GetSynchronizationState(/*init=*/true, m_chainman.m_blockman.m_blockfiles_indexed), *pindex); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary empty line removal, please keep the changes contained to the scope of your PR
@@ -5629,16 +5625,21 @@ double ChainstateManager::GuessVerificationProgress(const CBlockIndex* pindex) c | |||
return 0.0; | |||
} | |||
|
|||
int64_t nNow = time(nullptr); | |||
int64_t end_of_chain_timestamp = TicksSinceEpoch<std::chrono::seconds>(NodeClock::time_point{std::chrono::seconds{pindex->GetBlockTime()}}); | |||
if (m_best_header && m_best_header->nChainWork > pindex->nChainWork) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a descriptive comment what this code fragment does and why; we know, but the code isn't super easy to follow if you don't know the context (so for future maintainers).
eg "Estimate the time of the most recent block. This is either the time of the block at the tip of the chain index, or, if we know of a recent enough header with more work than this block, the time of that header."
Edit: wait, this assumes that pindex
is the tip of the chain, but it's the block to estimate the progress of, which might not be the tip. i think i'm missing something. Won't this always return 1.0 if there is no good header known?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right, this seems like a rare edge case.
If I'm understanding correctly, the issue only arises if pindex
is an old block (not the actual tip) and we also have no good headers available. In that scenario, the function would return 1.0 due to a lack of visibility beyond pindex.
I think the only way this happens is if a node just reconnected after being offline or is somehow isolated from the network. In that case, it might briefly estimate 1.0 until it receives newer headers.
Not sure how problematic this is in practice, but a possible fix could be to use m_best_header
when available and, if not, fall back to Now<NodeSeconds>()
as an estimate of where the chain should be.
Could look something similar to this:
int64_t end_of_chain_timestamp = TicksSinceEpoch<std::chrono::seconds>(NodeClock::time_point{std::chrono::seconds{pindex->GetBlockTime()}});
if (m_best_header && m_best_header->nChainWork > pindex->nChainWork) {
int64_t header_age = TicksSinceEpoch<std::chrono::seconds>(Now<NodeSeconds>()) - m_best_header->GetBlockTime();
if (header_age < 24 * 60 * 60) {
end_of_chain_timestamp = std::max(end_of_chain_timestamp, m_best_header->GetBlockTime());
}
} else {
// If no good best header is available, estimate where the chain should be
end_of_chain_timestamp = std::max(end_of_chain_timestamp, TicksSinceEpoch<std::chrono::seconds>(Now<NodeSeconds>()));
}
@sipa tagging you since this was originally your proposal.
Edit: I think with this we will be again on the case where we never reach 1.0. Is it actually possible to reach 1.0 if we don't assume that pindex
is the tip if we don't know about any other best header?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if pindex is an old block (not the actual tip) and we also have no good headers available. In that scenario, the function would return 1.0 due to a lack of visibility beyond pindex.
Maybe that's how it is used at the moment, but the documentation of the function implies that the estimate can be used on any block, not that pindex necessarily is the latest block.
I think the only way this happens is if a node just reconnected after being offline or is somehow isolated from the network. In that case, it might briefly estimate 1.0 until it receives newer headers.
That seems bad to me. Being isolated from the network shouldn't make it report values that far off! The old current-time-based estimate might have edge cases around 1, but apart from that it handles this better.
i think, if the purpose here is to essentially "round up the result a bit when almost 1 (and we're sure we're at the tip)" that's what we should do and not change the entire way estimates are done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't looked into this yet, but just finished a 3-week IBD. The past few days was switching between this branch at 6d877a1
and master, with stops and starts of several hours as internet access was intermittent, and this branch seemed to work well -- it returned the same value or very close each time. I did not see any 100% values before reaching the tip.
Blocks: 882404
Headers: 882404
Verification progress: 100.0000%
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the only way this happens is if a node just reconnected after being offline or is somehow isolated from the network. In that case, it might briefly estimate 1.0 until it receives newer headers.
That seems bad to me.
I agree. It doesn't seem worth it to fix one edge case and introduce another. This reminds me of #31135 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it returned the same value or very close each time. I did not see any 100% values before reaching the tip.
It makes sense bc your node is not isolated, but when you turn it on before having any peer I think it returns 1.0 and then it goes back to 0.x when it receives some header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe that's how it is used at the moment, but the documentation of the function implies that the estimate can be used on any block, not that pindex necessarily is the latest block.
As long as we have good headers using any block should not be a problem and should return a reasonable estimate. We could adjust it with something like this:
const CBlockIndex* active_tip = ActiveChain().Tip();
int64_t end_of_chain_timestamp = TicksSinceEpoch<std::chrono::seconds>(NodeClock::time_point{std::chrono::seconds{active_tip ? active_tip->GetBlockTime() : pindex->GetBlockTime()}});
But I'm quite sure that the behavior will be the same. If pindex
is not the ActiveChain().Tip()
at least the best header will be the tip one, right?
That seems bad to me. Being isolated from the network shouldn't make it report values that far off! The old current-time-based estimate might have edge cases around 1, but apart from that it handles this better.
A possible solution could be to keep the new estimation method, which allows us to reach 1.0 naturally without rounding. However, to handle cases where the node is isolated and lacks up-to-date headers, we could fall back to a time-based estimation when:
- we don't have good headers
pindex
is older than N hours (e.g. 2h)
This ensures that an isolated node doesn’t immediately assume it's fully synced but still allows reasonable progress estimation when there is no better data available.
(I implemented this already, will shut down a node and wait to be some blocks behind, will turn it on again and see if it shows 1.0. or estimates 0.x before being connected to any peer. I will edit this comment with the results)
@@ -1149,7 +1149,7 @@ class ChainstateManager | |||
bool IsInitialBlockDownload() const; | |||
|
|||
/** Guess verification progress (as a fraction between 0.0=genesis and 1.0=current tip). */ | |||
double GuessVerificationProgress(const CBlockIndex* pindex) const; | |||
double GuessVerificationProgress(const CBlockIndex* pindex) const EXCLUSIVE_LOCKS_REQUIRED(::cs_main); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you next update, please add the corresponding run-time lock assertion in the definition (see doc/developer-notes.md::L1003
) that I overlooked in my diff.
diff --git a/src/validation.cpp b/src/validation.cpp
index bee19f917ce..ccc05c8054d 100644
--- a/src/validation.cpp
+++ b/src/validation.cpp
@@ -5615,6 +5615,8 @@ bool Chainstate::ResizeCoinsCaches(size_t coinstip_size, size_t coinsdb_size)
double ChainstateManager::GuessVerificationProgress(const CBlockIndex* pindex) const
{
+ AssertLockHeld(::cs_main);
+
if (pindex == nullptr) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NoteToSelf: Check this before next push
Given that we do headers sync first these days, would it make sense to switch to reporting:
Normally, verificationprogress will be 1.0; dropping to ~0.9999988 briefly when a new header comes in, and headerrecency will tend to be in the ~10 minute range. OTOH, if you're in IBD, then verificationprogress will report how far through validating the blocks you have headers for; and if you've been disconnected from the network, verificationprogress will sit at 1.0 but headerrecency will tell you it's been days since you've seen a new header instead of minutes. When your node comes online after being down for a day, verificationprogress will initially be 1, but headerrecency will be 24 hours; as you sync headers, that will change to 0.9998 and ~10 minutes, then as you validate blocks it'll move to 1.0 and ~10 minutes. That seems a lot easier to calculate, and maybe also easier to interpret? |
I think the problem doing this is that not all blocks "are worth the same". So when doing IBD we would go fast to high % and then slow down a lot because newer blocks have more transactions. If we want the % to be accurate by the time we should still measure it by txs.
I like this approach and it's more or less what I'm trying to take into account in this other comment #31177 (comment) |
I think the suggestion is to keep the "tx weight" estimation, not change it. The suggestion is basically to split the "future headers estimation" from the verification progress estimation of a block in a known headers chain and return two floats. (I wanted to suggest the same, because I was certain that this was suggested earlier already, but I couldn't find an earlier mention, so I gave up) |
Sure, let's pretend that's what I said :) That would be:
|
On a second thought, the GUI uses the derivative (progress increase per hour), so whenever progress decreases, the GUI will calculate a negative increase. Then, likely the same people that complain about the progress not being 1 will complain about the increase being negative (or zero). I guess this scenario can already happen post-minchainwork-ibd, but it will probably happen more often when progress is just based on available headers, given that it is faster to download headers than to verify blocks (even early in ibd). So taking a step back, as the progress relies on the block time in the edge case scenario where users want the progress being reported as 1, and that block time is set by a miner and may be off anyway, I wonder if a simpler fix would be to just take a constant offset of the block time of a few hours. Obviously this fix is arbitrary, but it seems nice, because:
Also, I can't see any downsides? |
@maflcko I'm not sure I understand your proposal correctly. You're proposing this offset on the original function version or the "new" one? Would it get to 1.0 if we set a constant offset on all blocks? I think this idea is kinda similar to this one: #31177 (comment) but in this case I was thinking of applying it when the block time is far behind the local time. |
Just a simple one-line patch to offset |
I will close this PR for the moment as I don't think I will get enough ACKs with the current approach. |
Went ahead and just implemented that in #32528 |
getblockchaininfo
never reaches 1.0 as reported in issue #31127.This PR is based on the reviews given on #31135.