Skip to content

Conversation

sipa
Copy link
Member

@sipa sipa commented Jan 21, 2025

Since #28280 and #28233, the percentage of dirty entries in the cache when a flush happen may in practice be far from 100%. The log output, the decision to warn about a large flush (#31534), and the disk-full detection, however always use the entire size of the cache.

Fix this by keeping track of the number of dirty entries in CCoinsViewCache, and using that number. I've dropped the usage of DynamicMemoryUsage as it's the wrong metric, even before the non-wiping flushes were introduced (if everything is dirty, memory usage is correlated with, but not the same as, the amount of disk space written or used). They could be improved by keeping track of proper write size estimates in a follow-up, but here I'm just using a fixed 48 bytes per entry estimate.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jan 21, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31703.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK TheCharlatan, andrewtoth, l0rinc, tdb3

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #32313 (coins: fix cachedCoinsUsage accounting in CCoinsViewCache by l0rinc)
  • #32128 (Draft: CCoinMap Experiments by martinus)
  • #31132 (validation: fetch block inputs on parallel threads 10% faster IBD by andrewtoth)
  • #30611 (validation: write chainstate to disk every hour by andrewtoth)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@sipa sipa changed the title 202501 dirty coin count Use number of dirty cache entries in flush warnings/logs Jan 21, 2025
@TheCharlatan
Copy link
Contributor

Nice, Concept ACK

Copy link
Contributor

@andrewtoth andrewtoth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

@sipa sipa force-pushed the 202501_dirty_coin_count branch from 19f1f69 to 4d7c848 Compare January 21, 2025 20:52
@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed.
Debug: https://github.com/bitcoin/bitcoin/runs/35957028069

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

@sipa sipa force-pushed the 202501_dirty_coin_count branch from 4d7c848 to d99b0ec Compare January 22, 2025 04:21
Copy link
Contributor

@l0rinc l0rinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

I have tried reproducing it via AssumeUTXO but the warning is never triggered.
That's my main concern (it's probably my mistake in the related PR), other than not applying it in other cases where the sizes are displayed.

// Typical Coin structures on disk are around 48 bytes in size.
size_t approx_disk_usage = 48 * coins_dirty_count;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that the hard-coded value was here before, and was wondering if we could explain how we got to this magic number (and not mention the value in comments at all since code can express that better), maybe something like:

static constexpr size_t COIN_DISK_USAGE_BYTES{sizeof(Coin)}; // Approximate size on disk

Loading the UTXO set to memory:

2025-01-22T11:56:22Z [snapshot] loaded 176948713 (27440 MB) coins from snapshot 0000000000000000000320283a032748cef8227873ff4872689bf23f1cda83a5

and checking the resulting disk size:

du -sk /Users/lorinc/IdeaProjects/bitcoin/demo/chainstate_snapshot | cut -f1
11524472

Suggests we may want to add a ~40% overhead here:

((11524472*1024)/176948713)/48

i.e.

static constexpr size_t COIN_DISK_USAGE_BYTES{static_cast<size_t>(sizeof(Coin) * 1.4)}; // Approximate size on disk

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Estimating sizes of the UTXO set is hard, because these are all very distinct numbers:

  • The size on disk in a snapshot of a UTXO set
  • The size on disk in the chainstate LevelDB database (the database has different per-entry overhead, prefix compression, bloom filters, ...).
  • The amount of data that will be written to disk to flush a number of dirty entries (because e.g. writing a removal/spending of a UTXO temporarily takes up space in the log before it's compacted into the tables)
  • The amount of memory a UTXO entry takes up in RAM (as approximated by DynamicMemoryUsage()).

If we actually care about disk sizes, we should actually try to estimate the thing we care about, and not use snapshot size as a proxy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that it's complex, but we seem to be off here by 40% - I don't mind if we tackle that in a separate PR/issue and would appreciate your opinion on how to handle this better.

Copy link
Member Author

@sipa sipa Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why you're looking at snapshot size at all, that's not a good measure of data written. We may be off by a lot more, or a lot less. If we want to do better, we should look at how much is actually written, and then add to CCoinsViewCache a way to predict that number. None of that seems a good fit for this PR, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why you're looking at snapshot size

not a good measure of data written

and not use snapshot size as a proxy

I understand that the dirty entries in the snapshot are not a perfect representation (except for the AssumeUTXO tests I was doing, which likely skewed my recommendations), but I though it's what we're already doing here when we're calculating "approx_disk_usage" from "coins_dirty_count".

None of that seems a good fit for this PR, though

I agree, if we want a better representation. But replacing the magic 48 with sizeof(Coin) could be. But I'm also fine if not, it's just an alternative that I found.

src/coins.cpp Outdated
@@ -235,6 +244,7 @@ bool CCoinsViewCache::BatchWrite(CoinsViewCacheCursor& cursor, const uint256 &ha
itUs->second.coin = it->second.coin;
}
cachedCoinsUsage += itUs->second.coin.DynamicMemoryUsage();
m_num_dirty += !itUs->second.IsDirty();
Copy link
Contributor

@l0rinc l0rinc Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a while to understand this one - still not sure I follow completely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reworked it a bit. LMK if it's clearer now.

Copy link
Contributor

@l0rinc l0rinc Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you.
The only remaining nit is that in other cases we set it dirty first, before incrementing (edit: to obviate that the call doesn't rely on the value of m_dirty_count).

src/coins.cpp Outdated
@@ -78,6 +78,7 @@ void CCoinsViewCache::AddCoin(const COutPoint &outpoint, Coin&& coin, bool possi
bool fresh = false;
if (!inserted) {
cachedCoinsUsage -= it->second.coin.DynamicMemoryUsage();
m_num_dirty -= it->second.IsDirty();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're modifying m_num_dirty multiple times in the AddCoin method, possibly even cancelling out previous modifications. If it's not too involved, could we calculate the final diff and only update it a single time in this method for clarity? Same applies to SpendCoin.

src/coins.cpp Outdated
@@ -343,6 +359,7 @@ void CCoinsViewCache::SanityCheck() const
}
assert(count_linked == count_flagged);
assert(recomputed_usage == cachedCoinsUsage);
assert(num_dirty == m_num_dirty);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we merge I'd like to run a full IBD with SanityCheck enabled throughout the code - can you recommend other settings that I should enable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything that involves a reorg would perhaps be interesting to test.

Copy link
Contributor

@l0rinc l0rinc Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran an IBD until 800k with -DCMAKE_BUILD_TYPE=Debug for the current PR (it's really slow this way) - no issues found.

a reorg would perhaps be interesting to test

Do we have a way of triggering random reorgs during IBD (at least undo/redo style)?
If not, would it be helpful to create a PR that adds a hidden -simulatereorgonibd argument (similarly to -dbcrashratio) that randomly triggers small, temporary reorgs during IBD on mainnet? Specifically, at a random height between every ~[10,000, 100,000] blocks, we disconnect and reconnect ~[1, 100] blocks (favoring smaller counts), performing a CoinsTip#SanityCheck() before and after (and maybe validating that we have the UTXOs in cache that will be removed, asserting after undo that they're missing, and revalidating after reapplication that they're back in cache). This would test reorg logic in a semi-realistic scenario - though it wouldn't stress diverging chains, as it reconnects the same blocks.

Copy link
Contributor

@l0rinc l0rinc Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could collect a few hundred stale blocks (found 219 raw blocks so far via https://github.com/bitcoin-data/stale-blocks and https://learnmeabitcoin.com) and create a stale-block-proxy node for end-to-end regression testing of reorgs with historical data:

  • A freshly built test node would connect only to this proxy, which would simulate the original chain forks.
  • It serves headers & blocks exactly as they were originally seen (without storing them, requesting non-stale ones from the network).
  • The proxy pretends a stale block is the tip, letting the test node sync up to it.
  • Once the test node reaches that stale block, the proxy sets the next stale block as the new tip, triggering a natural reorg.

Would that work and be a useful way to test reorg behavior during IBD?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do this later to test reorg behavior in a functional test, you can resolve this comment.

// Typical Coin structures on disk are around 48 bytes in size.
size_t approx_disk_usage = 48 * coins_dirty_count;
if (approx_disk_usage >= WARN_FLUSH_COINS_SIZE) LogWarning("Flushing large (%.1f GiB) UTXO set to disk, it may take several minutes", double(approx_disk_usage) / (1 << 30));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is likely my mistake in #31534, but loading the UTXO set (as described in #31645) will call Flush via FlushSnapshotToDisk at the end (taking a really long time but never warning) https://github.com/bitcoin/bitcoin/blob/master/src/validation.cpp#L5878, reset m_num_dirty = 0 so when we get to FlushStateToDisk (through ActivateSnapshot) we're never warning since coins_dirty_count is always 0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with the snapshot code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I posted the steps to reproduce it at #31645 - but we can move this warning to Flush in a different PR to make sure it works for AssumeUTXO as well.


Checked that the current PR displays the warning correctly.

2025-01-22T19:20:54Z UpdateTip: new best=0000000000000000000162ce2492545600d6e6780ee22bf464166a4faa28ba28 height=824665 version=0x32000000 log2_work=94.650129 tx=948674213 date='2024-01-06T21:32:54Z' progress=0.822859 cache=9687.2MiB(67719102txo)
...
2025-01-22T19:20:55Z [warning] Flushing large (67721262 entries) UTXO set to disk, it may take several minutes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with the snapshot code.

It seems to me that it may make more sense to move the warning to CCoinsViewDB::BatchWrite instead, which could already have m_dirty available through the CoinsViewCacheCursor, and since it's called from Flush and Sync, an AssumeUTXO load-and-dump would also trigger the warning (not just an IBD).

I also don't mind doing it myself in a follow-up, if you think it's a good idea.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, currently AssumeUTXO doesn't display the warning:

2025-03-18T10:08:47Z [snapshot] 184000000 coins loaded (99.56%, 29332 MB)
2025-03-18T10:08:48Z [snapshot] loaded 184821030 (29456 MB) coins from snapshot 000000000000000000010b17283c3c400507969a9c2afd1dcf2082ec5cca2880
2025-03-18T10:08:48Z FlushSnapshotToDisk: saving snapshot chainstate (29456 MB) started
2025-03-18T10:09:57Z FlushSnapshotToDisk: completed (69461.66ms)

But if we moved the warning introduced in #31534 into to the method that does the actual long operation it would show the warning for AssumeUTXO case as well (i.e. both for Flush and Sync):

diff --git a/src/coins.h b/src/coins.h
index f897bce749..a23b663a1a 100644
--- a/src/coins.h
+++ b/src/coins.h
@@ -300,6 +300,7 @@ struct CoinsViewCacheCursor
     }
 
     inline bool WillErase(CoinsCachePair& current) const noexcept { return m_will_erase || current.second.coin.IsSpent(); }
+    size_t GetDirtyCount() const noexcept { return m_dirty; }
 private:
     size_t& m_usage;
     size_t& m_dirty;
diff --git a/src/txdb.cpp b/src/txdb.cpp
index 1622039d63..083cab3279 100644
--- a/src/txdb.cpp
+++ b/src/txdb.cpp
@@ -25,6 +25,9 @@ static constexpr uint8_t DB_HEAD_BLOCKS{'H'};
 // Keys used in previous version that might still be found in the DB:
 static constexpr uint8_t DB_COINS{'c'};
 
+// Threshold for warning when writing this many dirty cache entries to disk.
+static constexpr size_t WARN_FLUSH_COINS_COUNT{10'000'000};
+
 bool CCoinsViewDB::NeedsUpgrade()
 {
     std::unique_ptr<CDBIterator> cursor{m_db->NewIterator()};
@@ -109,6 +112,8 @@ bool CCoinsViewDB::BatchWrite(CoinsViewCacheCursor& cursor, const uint256 &hashB
         }
     }
 
+    if (cursor.GetDirtyCount() > WARN_FLUSH_COINS_COUNT) LogWarning("Flushing large (%d entries) UTXO set to disk, it may take several minutes", cursor.GetDirtyCount());
+
     // In the first batch, mark the database as being in the middle of a
     // transition from old_tip to hashBlock.
     // A vector is used for future extensibility, as we may want to support
diff --git a/src/validation.cpp b/src/validation.cpp
index 1f4f2e1ae8..3323a037dc 100644
--- a/src/validation.cpp
+++ b/src/validation.cpp
@@ -88,8 +88,6 @@ using node::CBlockIndexHeightOnlyComparator;
 using node::CBlockIndexWorkComparator;
 using node::SnapshotMetadata;
 
-/** Threshold for warning when writing this many dirty cache entries to disk. */
-static constexpr size_t WARN_FLUSH_COINS_COUNT = 10'000'000;
 /** Time to wait between writing blocks/block index to disk. */
 static constexpr std::chrono::hours DATABASE_WRITE_INTERVAL{1};
 /** Time to wait between flushing chainstate to disk. */
@@ -2932,7 +2930,6 @@ bool Chainstate::FlushStateToDisk(
         }
         // Flush best chain related state. This can only be done if the blocks / block index write was also done.
         if (fDoFullFlush && !CoinsTip().GetBestBlock().IsNull()) {
-            if (coins_dirty_count >= WARN_FLUSH_COINS_COUNT) LogWarning("Flushing large (%d entries) UTXO set to disk, it may take several minutes", coins_dirty_count);
             LOG_TIME_MILLIS_WITH_CATEGORY(strprintf("write coins cache to disk (%d out of %d cached coins)",
                 coins_dirty_count, coins_count), BCLog::BENCH);

Reproducer:

cmake -B build -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc)
mkdir -p demo && rm -rfd demo/chainstate demo/chainstate_snapshot demo/debug.log
build/bin/bitcoind -datadir=demo -stopatheight=1
build/bin/bitcoind -datadir=demo -daemon -blocksonly=1 -connect=0 -dbcache=30000
build/bin/bitcoin-cli -datadir=demo -rpcclienttimeout=0 loadtxoutset ~/utxo-880000.dat
build/bin/bitcoin-cli -datadir=demo stop

Result:

2025-03-18T09:59:40Z [snapshot] 184000000 coins loaded (99.56%, 29332 MB)
2025-03-18T09:59:41Z [snapshot] loaded 184821030 (29456 MB) coins from snapshot 000000000000000000010b17283c3c400507969a9c2afd1dcf2082ec5cca2880
2025-03-18T09:59:41Z FlushSnapshotToDisk: saving snapshot chainstate (29456 MB) started
*2025-03-18T09:59:41Z [warning] Flushing large (184821030 entries) UTXO set to disk, it may take several minutes*
2025-03-18T10:00:50Z FlushSnapshotToDisk: completed (68776.27ms)

Would it be simpler if I pushed this change in a separate PR instead?

@@ -2828,7 +2828,8 @@ bool Chainstate::FlushStateToDisk(
bool full_flush_completed = false;

const size_t coins_count = CoinsTip().GetCacheSize();
const size_t coins_mem_usage = CoinsTip().DynamicMemoryUsage();
[[maybe_unused]] const size_t coins_mem_usage = CoinsTip().DynamicMemoryUsage();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can't just inline this, but maybe we can guard it with #ifdef ENABLE_TRACING alternatively

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think [[maybe_unused]] is fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, though this won't warn us if it actually becomes unused.

@sipa sipa force-pushed the 202501_dirty_coin_count branch 2 times, most recently from 56fef1e to 1c89256 Compare January 22, 2025 15:17
@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed.
Debug: https://github.com/bitcoin/bitcoin/runs/36002777403

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

@sipa sipa force-pushed the 202501_dirty_coin_count branch from 1c89256 to 15619a1 Compare January 22, 2025 19:15
Copy link
Contributor

@tdb3 tdb3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK for now.
Makes sense to ensure warnings use a more accurate data point (m_dirty_count). Helps prevent a "chicken little" effect.

Copy link
Contributor

@l0rinc l0rinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed it in multiple steps, I have a few remaining questions and concerns, let me know if I can help in any other way.

@@ -100,6 +100,7 @@ class CCoinsViewCacheTest : public CCoinsViewCache
CCoinsMap& map() const { return cacheCoins; }
CoinsCachePair& sentinel() const { return m_sentinel; }
size_t& usage() const { return cachedCoinsUsage; }
size_t& GetDirtyCount() const { return m_dirty_count; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now hides GetDirtyCount in CCoinsViewCache with a size_t return value (changed to size_t& here), used later as cache.usage() += and cache.GetDirtyCount() +=.

To avoid the confusing override and the method return value assignment, please consider making them explicit setters/adders:

void AddUsage(size_t usage) const { cachedCoinsUsage += usage; }
void AddDirtyCount(size_t count) const { m_dirty_count += count; }
cache.AddUsage(InsertCoinsMapEntry(cache.map(), cache.sentinel(), *cache_coin));
cache.AddDirtyCount(cache_coin->IsDirty());

/** Size threshold for warning about slow UTXO set flush to disk. */
static constexpr size_t WARN_FLUSH_COINS_SIZE = 1 << 30; // 1 GiB
/** Threshold for warning when writing this many dirty cache entries to disk. */
static constexpr size_t WARN_FLUSH_COINS_COUNT = 10'000'000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 10m seems like a good approximation for the previous value:

2025-01-31T09:45:42Z UpdateTip: new best=0000000000000000cc22cc938f92ecc3a3844f23775c4ad4e66404563c8594c8 height=287000 version=0x00000002 log2_work=76.752105 tx=33340892 date='2014-02-2
1T05:18:42Z' progress=0.028794 cache=1245.8MiB(10004443txo)

@@ -113,7 +115,10 @@ void CCoinsViewCache::AddCoin(const COutPoint &outpoint, Coin&& coin, bool possi
void CCoinsViewCache::EmplaceCoinInternalDANGER(COutPoint&& outpoint, Coin&& coin) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change doesn't seem to be covered by coins_tests at all.
Can we extend SimulationTest to call stack.back()->EmplaceCoinInternalDANGER instead of AddCoin randomly?

Maybe something like:

if (COutPoint op(txid, 0); !stack.back()->map().contains(op) && !(coin.IsSpent() || (m_rng.rand32() & 1))) {
    stack.back()->EmplaceCoinInternalDANGER(std::move(op), std::move(newcoin));
} else {
    stack.back()->AddCoin(std::move(op), std::move(newcoin), /*possible_overwrite=*/(!coin.IsSpent() || (m_rng.rand32() & 1)));
}

@@ -113,7 +115,10 @@ void CCoinsViewCache::AddCoin(const COutPoint &outpoint, Coin&& coin, bool possi
void CCoinsViewCache::EmplaceCoinInternalDANGER(COutPoint&& outpoint, Coin&& coin) {
cachedCoinsUsage += coin.DynamicMemoryUsage();
auto [it, inserted] = cacheCoins.try_emplace(std::move(outpoint), std::move(coin));
if (inserted) CCoinsCacheEntry::SetDirty(*it, m_sentinel);
if (inserted) {
CCoinsCacheEntry::SetDirty(*it, m_sentinel);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation of EmplaceCoinInternalDANGER claims that:

Emplace a coin into cacheCoins without performing any checks, marking the emplaced coin as dirty.

How come we're updating cachedCoinsUsage unconditionally in this method - but m_dirty_count only when inserted?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a bug from #28280. It should be done conditionally. I fix it in the first commit of #31132. The only caller of this method will not add coins that already exist in the cache though, so it will always enter the if (inserted) block.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for validating it, I only noticed it now, when trying to test EmplaceCoinInternalDANGER in SimulationTest

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following @andrewtoth's solution we can do:

void CCoinsViewCache::EmplaceCoinInternalDANGER(COutPoint&& outpoint, Coin&& coin) {
    const auto mem_usage{coin.DynamicMemoryUsage()};
    auto [it, inserted] = cacheCoins.try_emplace(std::move(outpoint), std::move(coin));
    if (inserted) {
        cachedCoinsUsage += mem_usage;
        CCoinsCacheEntry::SetDirty(*it, m_sentinel);
        ++m_dirty_count;
    }
}

Copy link
Contributor

@l0rinc l0rinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After our last discussion (and with the new AssumeUTXO included) I've rebased and re-reviewed and added examples to my suggestions to simplify the review.

My remaining concerns are that AssumeUTXO doesn't warn in the same way as other dumps do, a few leftover refactoring misses, redundant calculation now that we have a dirty count, missing code coverage for modified code, a leftover memory accounting bug in affected area, and a few code nits to make the change less surprising.

Suggestions
diff --git a/src/coins.cpp b/src/coins.cpp
index 30d82e1d08..b126622490 100644
--- a/src/coins.cpp
+++ b/src/coins.cpp
@@ -113,10 +113,11 @@ void CCoinsViewCache::AddCoin(const COutPoint &outpoint, Coin&& coin, bool possi
 }
 
 void CCoinsViewCache::EmplaceCoinInternalDANGER(COutPoint&& outpoint, Coin&& coin) {
-    cachedCoinsUsage += coin.DynamicMemoryUsage();
+    const auto mem_usage{coin.DynamicMemoryUsage()};
     auto [it, inserted] = cacheCoins.try_emplace(std::move(outpoint), std::move(coin));
     if (inserted) {
         CCoinsCacheEntry::SetDirty(*it, m_sentinel);
+        cachedCoinsUsage += mem_usage;
         ++m_dirty_count;
     }
 }
@@ -245,8 +246,8 @@ bool CCoinsViewCache::BatchWrite(CoinsViewCacheCursor& cursor, const uint256 &ha
                 }
                 cachedCoinsUsage += itUs->second.coin.DynamicMemoryUsage();
                 if (!itUs->second.IsDirty()) {
-                    ++m_dirty_count;
                     CCoinsCacheEntry::SetDirty(*itUs, m_sentinel);
+                    ++m_dirty_count;
                 }
                 // NOTE: It isn't safe to mark the coin as FRESH in the parent
                 // cache. If it already existed and was spent in the parent
@@ -342,7 +343,7 @@ void CCoinsViewCache::SanityCheck() const
         // Recompute cachedCoinsUsage.
         recomputed_usage += entry.coin.DynamicMemoryUsage();
 
-        // Recompute m_num_dirty;
+        // Recompute m_dirty_count.
         dirty_count += entry.IsDirty();
 
         // Count the number of entries we expect in the linked list.
diff --git a/src/coins.h b/src/coins.h
index f897bce749..474cf77fdb 100644
--- a/src/coins.h
+++ b/src/coins.h
@@ -272,11 +272,11 @@ struct CoinsViewCacheCursor
     //! Calling CCoinsMap::clear() afterwards is faster because a CoinsCachePair cannot be coerced back into a
     //! CCoinsMap::iterator to be erased, and must therefore be looked up again by key in the CCoinsMap before being erased.
     CoinsViewCacheCursor(size_t& usage LIFETIMEBOUND,
-                        size_t& dirty LIFETIMEBOUND,
-                        CoinsCachePair& sentinel LIFETIMEBOUND,
-                        CCoinsMap& map LIFETIMEBOUND,
-                        bool will_erase) noexcept
-        : m_usage(usage), m_dirty(dirty), m_sentinel(sentinel), m_map(map), m_will_erase(will_erase) {}
+                         size_t& dirty_count LIFETIMEBOUND,
+                         CoinsCachePair& sentinel LIFETIMEBOUND,
+                         CCoinsMap& map LIFETIMEBOUND,
+                         bool will_erase) noexcept
+        : m_usage(usage), m_dirty_count(dirty_count), m_sentinel(sentinel), m_map(map), m_will_erase(will_erase) {}
 
     inline CoinsCachePair* Begin() const noexcept { return m_sentinel.second.Next(); }
     inline CoinsCachePair* End() const noexcept { return &m_sentinel; }
@@ -285,7 +285,7 @@ struct CoinsViewCacheCursor
     inline CoinsCachePair* NextAndMaybeErase(CoinsCachePair& current) noexcept
     {
         const auto next_entry{current.second.Next()};
-        m_dirty -= current.second.IsDirty();
+        m_dirty_count -= current.second.IsDirty();
         // If we are not going to erase the cache, we must still erase spent entries.
         // Otherwise, clear the state of the entry.
         if (!m_will_erase) {
@@ -300,9 +300,11 @@ struct CoinsViewCacheCursor
     }
 
     inline bool WillErase(CoinsCachePair& current) const noexcept { return m_will_erase || current.second.coin.IsSpent(); }
+    size_t GetDirtyCount() const noexcept { return m_dirty_count; }
+
 private:
     size_t& m_usage;
-    size_t& m_dirty;
+    size_t& m_dirty_count;
     CoinsCachePair& m_sentinel;
     CCoinsMap& m_map;
     bool m_will_erase;
diff --git a/src/test/coins_tests.cpp b/src/test/coins_tests.cpp
index 261859ff9a..0cef222024 100644
--- a/src/test/coins_tests.cpp
+++ b/src/test/coins_tests.cpp
@@ -99,8 +99,8 @@ public:
 
     CCoinsMap& map() const { return cacheCoins; }
     CoinsCachePair& sentinel() const { return m_sentinel; }
-    size_t& usage() const { return cachedCoinsUsage; }
-    size_t& GetDirtyCount() const { return m_dirty_count; }
+    void AddUsage(size_t usage) const { cachedCoinsUsage += usage; }
+    void AddDirtyCount(size_t count) const { m_dirty_count += count; }
 };
 
 } // namespace
@@ -197,8 +197,11 @@ void SimulationTest(CCoinsView* base, bool fake_best_block)
                     (coin.IsSpent() ? added_an_entry : updated_an_entry) = true;
                     coin = newcoin;
                 }
-                bool is_overwrite = !coin.IsSpent() || m_rng.rand32() & 1;
-                stack.back()->AddCoin(COutPoint(txid, 0), std::move(newcoin), is_overwrite);
+                if (COutPoint op(txid, 0); !stack.back()->map().contains(op) && !(coin.IsSpent() || (m_rng.rand32() & 1))) {
+                    stack.back()->EmplaceCoinInternalDANGER(std::move(op), std::move(newcoin));
+                } else {
+                    stack.back()->AddCoin(std::move(op), std::move(newcoin), /*possible_overwrite=*/(!coin.IsSpent() || (m_rng.rand32() & 1)));
+                }
             } else {
                 // Spend the coin.
                 removed_an_entry = true;
@@ -653,8 +656,8 @@ static void WriteCoinsViewEntry(CCoinsView& view, const MaybeCoin& cache_coin)
     CCoinsMapMemoryResource resource;
     CCoinsMap map{0, CCoinsMap::hasher{}, CCoinsMap::key_equal{}, &resource};
     auto usage{cache_coin ? InsertCoinsMapEntry(map, sentinel, *cache_coin) : 0};
-    size_t dirty = cache_coin ? cache_coin->IsDirty() : 0;
-    auto cursor{CoinsViewCacheCursor(usage, dirty, sentinel, map, /*will_erase=*/true)};
+    size_t dirty_count = cache_coin ? cache_coin->IsDirty() : 0;
+    auto cursor{CoinsViewCacheCursor(usage, dirty_count, sentinel, map, /*will_erase=*/true)};
     BOOST_CHECK(view.BatchWrite(cursor, {}));
 }
 
@@ -666,8 +669,8 @@ public:
         auto base_cache_coin{base_value == ABSENT ? MISSING : CoinEntry{base_value, CoinEntry::State::DIRTY}};
         WriteCoinsViewEntry(base, base_cache_coin);
         if (cache_coin) {
-            cache.usage() += InsertCoinsMapEntry(cache.map(), cache.sentinel(), *cache_coin);
-            cache.GetDirtyCount() += cache_coin->IsDirty();
+            cache.AddUsage(InsertCoinsMapEntry(cache.map(), cache.sentinel(), *cache_coin));
+            cache.AddDirtyCount(cache_coin->IsDirty());
         }
     }
 
diff --git a/src/test/fuzz/coins_view.cpp b/src/test/fuzz/coins_view.cpp
index a9b91b9efe..64cec03fd5 100644
--- a/src/test/fuzz/coins_view.cpp
+++ b/src/test/fuzz/coins_view.cpp
@@ -123,7 +123,7 @@ FUZZ_TARGET(coins_view, .init = initialize_coins_view)
                 CoinsCachePair sentinel{};
                 sentinel.second.SelfRef(sentinel);
                 size_t usage{0};
-                size_t num_dirty{0};
+                size_t dirty_count{0};
                 CCoinsMapMemoryResource resource;
                 CCoinsMap coins_map{0, SaltedOutpointHasher{/*deterministic=*/true}, CCoinsMap::key_equal{}, &resource};
                 LIMITED_WHILE(good_data && fuzzed_data_provider.ConsumeBool(), 10'000)
@@ -145,11 +145,11 @@ FUZZ_TARGET(coins_view, .init = initialize_coins_view)
                     if (dirty) CCoinsCacheEntry::SetDirty(*it, sentinel);
                     if (fresh) CCoinsCacheEntry::SetFresh(*it, sentinel);
                     usage += it->second.coin.DynamicMemoryUsage();
-                    num_dirty += dirty;
+                    dirty_count += dirty;
                 }
                 bool expected_code_path = false;
                 try {
-                    auto cursor{CoinsViewCacheCursor(usage, num_dirty, sentinel, coins_map, /*will_erase=*/true)};
+                    auto cursor{CoinsViewCacheCursor(usage, dirty_count, sentinel, coins_map, /*will_erase=*/true)};
                     coins_view_cache.BatchWrite(cursor, fuzzed_data_provider.ConsumeBool() ? ConsumeUInt256(fuzzed_data_provider) : coins_view_cache.GetBestBlock());
                     expected_code_path = true;
                 } catch (const std::logic_error& e) {
diff --git a/src/txdb.cpp b/src/txdb.cpp
index 1622039d63..9587a62db7 100644
--- a/src/txdb.cpp
+++ b/src/txdb.cpp
@@ -25,6 +25,9 @@ static constexpr uint8_t DB_HEAD_BLOCKS{'H'};
 // Keys used in previous version that might still be found in the DB:
 static constexpr uint8_t DB_COINS{'c'};
 
+// Threshold for warning when writing this many dirty cache entries to disk.
+static constexpr size_t WARN_FLUSH_COINS_COUNT{10'000'000};
+
 bool CCoinsViewDB::NeedsUpgrade()
 {
     std::unique_ptr<CDBIterator> cursor{m_db->NewIterator()};
@@ -93,6 +96,7 @@ std::vector<uint256> CCoinsViewDB::GetHeadBlocks() const {
 bool CCoinsViewDB::BatchWrite(CoinsViewCacheCursor& cursor, const uint256 &hashBlock) {
     CDBBatch batch(*m_db);
     size_t count = 0;
+    size_t dirty_count{cursor.GetDirtyCount()};
     size_t changed = 0;
     assert(!hashBlock.IsNull());
 
@@ -109,6 +113,8 @@ bool CCoinsViewDB::BatchWrite(CoinsViewCacheCursor& cursor, const uint256 &hashB
         }
     }
 
+    if (dirty_count > WARN_FLUSH_COINS_COUNT) LogWarning("Flushing large (%d entries) UTXO set to disk, it may take several minutes", dirty_count);
+
     // In the first batch, mark the database as being in the middle of a
     // transition from old_tip to hashBlock.
     // A vector is used for future extensibility, as we may want to support
@@ -145,9 +151,10 @@ bool CCoinsViewDB::BatchWrite(CoinsViewCacheCursor& cursor, const uint256 &hashB
     batch.Erase(DB_HEAD_BLOCKS);
     batch.Write(DB_BEST_BLOCK, hashBlock);
 
-    LogDebug(BCLog::COINDB, "Writing final batch of %.2f MiB\n", batch.SizeEstimate() * (1.0 / 1048576.0));
+    LogDebug(BCLog::COINDB, "Writing final batch of %.2f MiB", batch.SizeEstimate() * (1.0 / 1048576.0));
     bool ret = m_db->WriteBatch(batch);
-    LogDebug(BCLog::COINDB, "Committed %u changed transaction outputs (out of %u) to coin database...\n", (unsigned int)changed, (unsigned int)count);
+    assert(changed == dirty_count); // TODO remove `changed` after all scenarios pass
+    LogDebug(BCLog::COINDB, "Committed %u changed transaction outputs (out of %u) to coin database...", (unsigned int)dirty_count, (unsigned int)count);
     return ret;
 }
 
diff --git a/src/validation.cpp b/src/validation.cpp
index 1f4f2e1ae8..3323a037dc 100644
--- a/src/validation.cpp
+++ b/src/validation.cpp
@@ -88,8 +88,6 @@ using node::CBlockIndexHeightOnlyComparator;
 using node::CBlockIndexWorkComparator;
 using node::SnapshotMetadata;
 
-/** Threshold for warning when writing this many dirty cache entries to disk. */
-static constexpr size_t WARN_FLUSH_COINS_COUNT = 10'000'000;
 /** Time to wait between writing blocks/block index to disk. */
 static constexpr std::chrono::hours DATABASE_WRITE_INTERVAL{1};
 /** Time to wait between flushing chainstate to disk. */
@@ -2932,7 +2930,6 @@ bool Chainstate::FlushStateToDisk(
         }
         // Flush best chain related state. This can only be done if the blocks / block index write was also done.
         if (fDoFullFlush && !CoinsTip().GetBestBlock().IsNull()) {
-            if (coins_dirty_count >= WARN_FLUSH_COINS_COUNT) LogWarning("Flushing large (%d entries) UTXO set to disk, it may take several minutes", coins_dirty_count);
             LOG_TIME_MILLIS_WITH_CATEGORY(strprintf("write coins cache to disk (%d out of %d cached coins)",
                 coins_dirty_count, coins_count), BCLog::BENCH);

@@ -272,10 +272,11 @@ struct CoinsViewCacheCursor
//! Calling CCoinsMap::clear() afterwards is faster because a CoinsCachePair cannot be coerced back into a
//! CCoinsMap::iterator to be erased, and must therefore be looked up again by key in the CCoinsMap before being erased.
CoinsViewCacheCursor(size_t& usage LIFETIMEBOUND,
size_t& dirty LIFETIMEBOUND,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: parameters slightly unaligned with first parameter

CoinsCachePair& sentinel LIFETIMEBOUND,
CCoinsMap& map LIFETIMEBOUND,
bool will_erase) noexcept
: m_usage(usage), m_sentinel(sentinel), m_map(map), m_will_erase(will_erase) {}
: m_usage(usage), m_dirty(dirty), m_sentinel(sentinel), m_map(map), m_will_erase(will_erase) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need to track CCoinsViewDB::BatchWrite#changed now that we're tracking dirty count already?
I think we should be able to use the new m_dirty(_count) there instead, i.e. something like:

size_t dirty_count{cursor.GetDirtyCount()};
size_t changed{0};
...
if (dirty_count > WARN_FLUSH_COINS_COUNT) LogWarning("Flushing large (%d entries) UTXO set to disk, it may take several minutes", dirty_count);
...
assert(changed == dirty_count); // TODO remove `changed` after all scenarios pass
LogDebug(BCLog::COINDB, "Committed %u changed transaction outputs (out of %u) to coin database...", (unsigned int)dirty_count, (unsigned int)count);

For me locally this passes all tests, quick IBD (ran it with -dbcache=5 -maxmempool=5 -stopatheight=120000), reindex and AssumeUTXO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also be useful to avoid the needless DB_HEAD_BLOCKS erase and rewrite for empty batches.

@@ -327,6 +342,9 @@ void CCoinsViewCache::SanityCheck() const
// Recompute cachedCoinsUsage.
recomputed_usage += entry.coin.DynamicMemoryUsage();

// Recompute m_num_dirty;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover num_dirty (& other comments here end with fullstop)

Suggested change
// Recompute m_num_dirty;
// Recompute m_dirty_count.

// Typical Coin structures on disk are around 48 bytes in size.
size_t approx_disk_usage = 48 * coins_dirty_count;
if (approx_disk_usage >= WARN_FLUSH_COINS_SIZE) LogWarning("Flushing large (%.1f GiB) UTXO set to disk, it may take several minutes", double(approx_disk_usage) / (1 << 30));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, currently AssumeUTXO doesn't display the warning:

2025-03-18T10:08:47Z [snapshot] 184000000 coins loaded (99.56%, 29332 MB)
2025-03-18T10:08:48Z [snapshot] loaded 184821030 (29456 MB) coins from snapshot 000000000000000000010b17283c3c400507969a9c2afd1dcf2082ec5cca2880
2025-03-18T10:08:48Z FlushSnapshotToDisk: saving snapshot chainstate (29456 MB) started
2025-03-18T10:09:57Z FlushSnapshotToDisk: completed (69461.66ms)

But if we moved the warning introduced in #31534 into to the method that does the actual long operation it would show the warning for AssumeUTXO case as well (i.e. both for Flush and Sync):

diff --git a/src/coins.h b/src/coins.h
index f897bce749..a23b663a1a 100644
--- a/src/coins.h
+++ b/src/coins.h
@@ -300,6 +300,7 @@ struct CoinsViewCacheCursor
     }
 
     inline bool WillErase(CoinsCachePair& current) const noexcept { return m_will_erase || current.second.coin.IsSpent(); }
+    size_t GetDirtyCount() const noexcept { return m_dirty; }
 private:
     size_t& m_usage;
     size_t& m_dirty;
diff --git a/src/txdb.cpp b/src/txdb.cpp
index 1622039d63..083cab3279 100644
--- a/src/txdb.cpp
+++ b/src/txdb.cpp
@@ -25,6 +25,9 @@ static constexpr uint8_t DB_HEAD_BLOCKS{'H'};
 // Keys used in previous version that might still be found in the DB:
 static constexpr uint8_t DB_COINS{'c'};
 
+// Threshold for warning when writing this many dirty cache entries to disk.
+static constexpr size_t WARN_FLUSH_COINS_COUNT{10'000'000};
+
 bool CCoinsViewDB::NeedsUpgrade()
 {
     std::unique_ptr<CDBIterator> cursor{m_db->NewIterator()};
@@ -109,6 +112,8 @@ bool CCoinsViewDB::BatchWrite(CoinsViewCacheCursor& cursor, const uint256 &hashB
         }
     }
 
+    if (cursor.GetDirtyCount() > WARN_FLUSH_COINS_COUNT) LogWarning("Flushing large (%d entries) UTXO set to disk, it may take several minutes", cursor.GetDirtyCount());
+
     // In the first batch, mark the database as being in the middle of a
     // transition from old_tip to hashBlock.
     // A vector is used for future extensibility, as we may want to support
diff --git a/src/validation.cpp b/src/validation.cpp
index 1f4f2e1ae8..3323a037dc 100644
--- a/src/validation.cpp
+++ b/src/validation.cpp
@@ -88,8 +88,6 @@ using node::CBlockIndexHeightOnlyComparator;
 using node::CBlockIndexWorkComparator;
 using node::SnapshotMetadata;
 
-/** Threshold for warning when writing this many dirty cache entries to disk. */
-static constexpr size_t WARN_FLUSH_COINS_COUNT = 10'000'000;
 /** Time to wait between writing blocks/block index to disk. */
 static constexpr std::chrono::hours DATABASE_WRITE_INTERVAL{1};
 /** Time to wait between flushing chainstate to disk. */
@@ -2932,7 +2930,6 @@ bool Chainstate::FlushStateToDisk(
         }
         // Flush best chain related state. This can only be done if the blocks / block index write was also done.
         if (fDoFullFlush && !CoinsTip().GetBestBlock().IsNull()) {
-            if (coins_dirty_count >= WARN_FLUSH_COINS_COUNT) LogWarning("Flushing large (%d entries) UTXO set to disk, it may take several minutes", coins_dirty_count);
             LOG_TIME_MILLIS_WITH_CATEGORY(strprintf("write coins cache to disk (%d out of %d cached coins)",
                 coins_dirty_count, coins_count), BCLog::BENCH);

Reproducer:

cmake -B build -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc)
mkdir -p demo && rm -rfd demo/chainstate demo/chainstate_snapshot demo/debug.log
build/bin/bitcoind -datadir=demo -stopatheight=1
build/bin/bitcoind -datadir=demo -daemon -blocksonly=1 -connect=0 -dbcache=30000
build/bin/bitcoin-cli -datadir=demo -rpcclienttimeout=0 loadtxoutset ~/utxo-880000.dat
build/bin/bitcoin-cli -datadir=demo stop

Result:

2025-03-18T09:59:40Z [snapshot] 184000000 coins loaded (99.56%, 29332 MB)
2025-03-18T09:59:41Z [snapshot] loaded 184821030 (29456 MB) coins from snapshot 000000000000000000010b17283c3c400507969a9c2afd1dcf2082ec5cca2880
2025-03-18T09:59:41Z FlushSnapshotToDisk: saving snapshot chainstate (29456 MB) started
*2025-03-18T09:59:41Z [warning] Flushing large (184821030 entries) UTXO set to disk, it may take several minutes*
2025-03-18T10:00:50Z FlushSnapshotToDisk: completed (68776.27ms)

Would it be simpler if I pushed this change in a separate PR instead?

@@ -113,7 +115,10 @@ void CCoinsViewCache::AddCoin(const COutPoint &outpoint, Coin&& coin, bool possi
void CCoinsViewCache::EmplaceCoinInternalDANGER(COutPoint&& outpoint, Coin&& coin) {
cachedCoinsUsage += coin.DynamicMemoryUsage();
auto [it, inserted] = cacheCoins.try_emplace(std::move(outpoint), std::move(coin));
if (inserted) CCoinsCacheEntry::SetDirty(*it, m_sentinel);
if (inserted) {
CCoinsCacheEntry::SetDirty(*it, m_sentinel);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following @andrewtoth's solution we can do:

void CCoinsViewCache::EmplaceCoinInternalDANGER(COutPoint&& outpoint, Coin&& coin) {
    const auto mem_usage{coin.DynamicMemoryUsage()};
    auto [it, inserted] = cacheCoins.try_emplace(std::move(outpoint), std::move(coin));
    if (inserted) {
        cachedCoinsUsage += mem_usage;
        CCoinsCacheEntry::SetDirty(*it, m_sentinel);
        ++m_dirty_count;
    }
}

src/coins.cpp Outdated
@@ -343,6 +359,7 @@ void CCoinsViewCache::SanityCheck() const
}
assert(count_linked == count_flagged);
assert(recomputed_usage == cachedCoinsUsage);
assert(num_dirty == m_num_dirty);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do this later to test reorg behavior in a functional test, you can resolve this comment.

@@ -123,6 +123,7 @@ FUZZ_TARGET(coins_view, .init = initialize_coins_view)
CoinsCachePair sentinel{};
sentinel.second.SelfRef(sentinel);
size_t usage{0};
size_t num_dirty{0};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover num_dirty

CoinsCachePair& sentinel LIFETIMEBOUND,
CCoinsMap& map LIFETIMEBOUND,
bool will_erase) noexcept
: m_usage(usage), m_sentinel(sentinel), m_map(map), m_will_erase(will_erase) {}
: m_usage(usage), m_dirty(dirty), m_sentinel(sentinel), m_map(map), m_will_erase(will_erase) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind if we're mixing booleans and ints as long as the types and names are obvious, e.g.

size_t dirty = cache_coin ? cache_coin->IsDirty() : 0;
auto cursor{CoinsViewCacheCursor(usage, dirty, sentinel, map, /*will_erase=*/true)};

where we're setting a parameter called dirty to the result of IsDirty, but it gets converted to dirty count.
We could obviate this by calling it dirty_count here as well:

    CoinsViewCacheCursor(size_t& usage LIFETIMEBOUND,
                         size_t& dirty_count LIFETIMEBOUND,
                         CoinsCachePair& sentinel LIFETIMEBOUND,
                         CCoinsMap& map LIFETIMEBOUND,
                         bool will_erase) noexcept
        : m_usage(usage), m_dirty_count(dirty_count), m_sentinel(sentinel), m_map(map), m_will_erase(will_erase) {}

Comment on lines +248 to +249
++m_dirty_count;
CCoinsCacheEntry::SetDirty(*itUs, m_sentinel);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in other cases we set it to dirty first and only increment m_dirty_count after (shouldn't matter on single thread but at least we don't have to think about whether that's the case or not):

Suggested change
++m_dirty_count;
CCoinsCacheEntry::SetDirty(*itUs, m_sentinel);
CCoinsCacheEntry::SetDirty(*itUs, m_sentinel);
++m_dirty_count;

@sipa
Copy link
Member Author

sipa commented Mar 31, 2025

@l0rinc I've lost track a bit of your comments/suggestions. Is the PR as-is broken when interacting with assumeutxo?

@DrahtBot
Copy link
Contributor

DrahtBot commented May 1, 2025

🐙 This pull request conflicts with the target branch and needs rebase.

@DrahtBot
Copy link
Contributor

⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?

  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

@l0rinc
Copy link
Contributor

l0rinc commented Jul 29, 2025

@sipa, my comments are still relevant as far as I can tell, I think it's a good change, how can I help to advance it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants