-
Notifications
You must be signed in to change notification settings - Fork 37.7k
sync: improve CCoinsViewCache ReallocateCache #28945
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sync: improve CCoinsViewCache ReallocateCache #28945
Conversation
The short lived CCoinsViewCache's are destructed right after the Flush(), therefore it is not necessary to call ReallocateCache.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Hi, is this benchmark for HDD or SSD? Did you tried to run benchmark with huge dbcache and high height? |
I only measured it on an SSD, and only with relatively low dbcache size so flushing is triggered more often which I think would give more accurate results. But more benchmarks are definitely appreciated :) |
|
||
// After a flush all the cacheCoins's memory has now been freed. It is likely that the map gets full again so flush is necessary, and it will | ||
// happen around the same memory usage. So we can already reserve the bucket array to the previous size, reducing the number of rehashes needed. | ||
cacheCoins.reserve(current_bucket_count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of ReallocateCache
originally was to balance the snapshot and background chainstate caches (see #18637). However, in 5e4ac5a this function was removed from ResizeCoinsCaches
and moved to Flush
. This created an implicit assumption that calling Flush
would clear the memory in the cache. This change violates that assumption, since the memory will no longer be clear and therefore ResizeCoinsCaches
will likely no longer respect the dbcache
limit during assumeutxo syncing.
Also see the docs for this function https://github.com/bitcoin/bitcoin/blob/master/src/coins.h#L336-L341. The purpose of this is to clear the memory of the cache.
Perhaps we could get the bucket count in Flush
before calling ReallocateCache
and reserve it in Flush
, then re-add ReallocateCache
to ResizeCoinsCaches
?
edit: I see that reserve
only allocates memory for the hashtable, and not the actual nodes stored in the unordered_map. I'm not sure how significant that extra memory is.
@@ -250,14 +250,16 @@ bool CCoinsViewCache::BatchWrite(CCoinsMap &mapCoins, const uint256 &hashBlockIn | |||
return true; | |||
} | |||
|
|||
bool CCoinsViewCache::Flush() { | |||
bool CCoinsViewCache::Flush(bool reallocate_cache) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this parameter be added to the fuzz targets or unit tests?
Ran same hyperfine benchmark with base, 985ddd9, and head of branch. Got similar results 🚀.
It's unclear to me though why 5e4ac5a was necessary at all, and if reverting that change would be a cleaner fix for this instead?
Why is it necessary to free memory on |
🐙 This pull request conflicts with the target branch and needs rebase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martinus Are you still working on this? If yes, it would be great if you could rebase and address @andrewtoth 's comment. If not, I can try to take this over and get it the attention it deserves.
@@ -307,7 +307,7 @@ class CCoinsViewCache : public CCoinsViewBacked | |||
* to be forgotten. | |||
* If false is returned, the state of this cache (and its backing view) will be undefined. | |||
*/ | |||
bool Flush(); | |||
bool Flush(bool reallocate_cache = true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably add something about this to the docstring.
@fjahr I'm not working on this any more, feel free to take over! |
TL;DR: this change improves sync time on my PC by ~6%.
While syncing when the CCoinsViewCache gets full it is flushed to disk and then memory is freed. Depending on available cache size, this happens several times. This PR removes the unnecessary reallocations for temporary CCoinsViewCache and reserves the cacheCoins's bucket array to its previous size.
I benchmarked the change on an AMD Ryzen 9 7950X with this command:
Where b5a2713 is mergebase, and ceeb718 this PR.
Which runs
./src/bitcoind -dbcache=500 -reindex-chainstate -printtoconsole=0 -stopatheight=500000
3 times and compares both commits, giving this result: