Fix tiebreak when loading blocks from disk (and add tests for comparing chain ties) #29640

sr-gi · 2024-03-12T20:11:04Z

This PR grabs some interesting bits from #29284 and fixes some edge cases in how block tiebreaks are dealt with.

Regarding #29284

The main functionality from that PR was dropped because it was no longer an issue. However, reviewers pointed out that some comments were outdated (#29284 (comment)), which to my understanding may have led to the belief that there was still an issue. That PR also added test coverage for the aforementioned case, which was already passing on master and is useful to keep.

New functionality

While reviewing the superseded PR, it was noticed that blocks that are loaded from disk may face a similar issue (check #29284 (comment) for more context).

The issue comes from how tiebreaks for equal-work blocks are handled: if two blocks have the same amount of work, the one that is activatable first wins, that is, the one for which we first have all its data (and all of its ancestors' data). The variable that keeps track of this within CBlockIndex is nSequenceId, which is not persisted over restarts. This means that when a node is restarted, all blocks loaded from disk are assigned the same default nSequenceId: 0.
Now, when deciding which chain is best while loading blocks from disk, the previous tiebreaker rule is no longer decisive, so CBlockIndexWorkComparator has to fall back to its last rule: whichever block is loaded first (has a smaller memory address).

This means that if multiple same work tip candidates were available before restarting the node, it could be the case that the selected chain tip after restarting does not match the one before.
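The three-step ordering described above can be illustrated with a small Python model. This is only a sketch, not Bitcoin Core's code: the `BlockIndex` class and its field names (`chain_work`, `sequence_id`, `address`) are assumptions made for illustration, standing in for CBlockIndex, nSequenceId and the object's memory address.

```python
from dataclasses import dataclass
from functools import cmp_to_key


@dataclass(frozen=True)
class BlockIndex:
    """Toy stand-in for CBlockIndex; only the fields the tiebreak needs."""
    name: str
    chain_work: int   # cumulative work up to and including this block
    sequence_id: int  # models nSequenceId (kept in memory only)
    address: int      # models the object's memory address


def compare(a: BlockIndex, b: BlockIndex) -> int:
    """Return 1 if `a` is the better tip candidate, -1 if worse, 0 if identical."""
    # Rule 1: more cumulative work wins.
    if a.chain_work != b.chain_work:
        return 1 if a.chain_work > b.chain_work else -1
    # Rule 2: the block that became activatable first (smaller id) wins.
    if a.sequence_id != b.sequence_id:
        return 1 if a.sequence_id < b.sequence_id else -1
    # Rule 3: last resort, the smaller address (loaded first) wins.
    if a.address != b.address:
        return 1 if a.address < b.address else -1
    return 0


def best_tip(candidates):
    """Pick the best candidate according to `compare`."""
    return max(candidates, key=cmp_to_key(compare))


# Before the restart, A became activatable first, so it wins the tie.
before = [BlockIndex("A", 100, 5, 0x2000), BlockIndex("B", 100, 6, 0x1000)]
# After the restart (old behaviour), both ids are 0, so the address decides.
after = [BlockIndex("A", 100, 0, 0x2000), BlockIndex("B", 100, 0, 0x1000)]

print(best_tip(before).name)
print(best_tip(after).name)
```

In this model the selected tip changes from A to B across the simulated restart purely because B happens to have the smaller address, which is the inconsistency described above.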

Therefore, the way nSequenceId is initialized is changed to:

0 for blocks that belong to the previously known best chain
1 for all other blocks loaded from disk
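A minimal sketch of the new initialization rule, again as a Python model rather than the actual C++ change. The constant names, the `init_sequence_ids` and `pick_tip` helpers, and the use of block names and a load-order list are all assumptions made for illustration.

```python
# Illustrative constant names; the values come from the description above.
BEST_CHAIN_SEQ_ID = 0  # blocks on the previously known best chain
OTHER_SEQ_ID = 1       # every other block loaded from disk


def init_sequence_ids(load_order, best_chain):
    """Map each loaded block name to its initial sequence id."""
    return {
        name: BEST_CHAIN_SEQ_ID if name in best_chain else OTHER_SEQ_ID
        for name in load_order
    }


def pick_tip(work, seq_ids, load_order):
    """More work wins, then the lower sequence id, then the earlier load position."""
    return max(
        load_order,
        key=lambda name: (work[name], -seq_ids[name], -load_order.index(name)),
    )


work = {"A": 100, "B": 100}  # two tip candidates with equal work
load_order = ["B", "A"]      # B happens to be loaded first
best_chain = {"A"}           # A was the tip before the restart

old_ids = {name: 0 for name in load_order}  # old behaviour: everything is 0
new_ids = init_sequence_ids(load_order, best_chain)

print(pick_tip(work, old_ids, load_order))
print(pick_tip(work, new_ids, load_order))
```

With all ids set to 0 the load order decides and B is picked; with the new rule the previous tip A has the lower id and is selected again, regardless of load order.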

DrahtBot · 2024-03-12T20:11:07Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/29640.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Stale ACK	sipa, mzumsande, furszy

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#32885 (Cache m_cached_finished_ibd where SetTip is called. by pstratem)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

src/validation.cpp

DrahtBot · 2024-03-14T20:52:19Z

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/22675498262

sr-gi · 2024-03-25T07:45:47Z

Rebased to drop the custom log fix in favor of a more generic solution (#29640 (comment))

src/node/blockstorage.cpp

src/validation.cpp

DrahtBot · 2024-07-23T20:51:49Z

🚧 At least one of the CI tasks failed.
Debug: https://github.com/bitcoin/bitcoin/runs/26038919395

Hints

Make sure to run all tests locally, according to the documentation.

The failure may happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

sr-gi · 2024-07-24T15:11:55Z

Rebased to deal with CI failing.

c76f4c6 has been amended to include __file__ in the ChainTiebreaksTest constructor, as required by any subclass of BitcoinTestFramework since #30463

Before this, if we had two (or more) same-work tip candidates and restarted our node, it could be the case that the block set as tip after bootstrap didn't match the one before stopping. That's because the work and `nSequenceId` of both blocks will be the same (the latter is only kept in memory), so the active chain after restart would have depended on which tip candidate was loaded first. This makes sure that we are consistent over reboots. Github-Pull: bitcoin#29640 Rebased-From: c8f5e62

Make it easier to follow where the values come from without having to go over the comments, and easier to maintain. Github-Pull: bitcoin#29640 Rebased-From: c7f9061

Adds tests to make sure we are consistent on activating the same chain over a node restart if two or more candidates have the same work when the node is shut down. Github-Pull: bitcoin#29640 Rebased-From: 177d07f

TheCharlatan · 2025-06-17T11:16:08Z

src/chain.h

@@ -35,6 +35,9 @@ static constexpr int64_t MAX_FUTURE_BLOCK_TIME = 2 * 60 * 60;
 * MAX_FUTURE_BLOCK_TIME.
 */
 static constexpr int64_t TIMESTAMP_WINDOW = MAX_FUTURE_BLOCK_TIME;
+ //! Init values for CBlockIndex nSequenceId when loaded from disk


Nit: The extra whitespace at the beginning should be removed.

Covered in 3869469

TheCharlatan · 2025-06-17T12:11:02Z

test/functional/feature_chain_tiebreaks.py

+        assert_equal(node.getbestblockhash(), blocks[2].hash)
+
+        self.log.info('Send parents B3-B4 of B8-B10 in reverse order')
+        peer.send_blocks_and_test([blocks[4]], node, success=False, force_send=True)


It would be an easy doc fix, something like:

-  - if success is True: assert that the node's tip advances to the most recent block
-  - if success is False: assert that the node's tip doesn't advance
+  - if success is True: assert that the node's tip is the last block in blocks at the end of the operation.
+  - if success is False: assert that the node's tip isn't the last block in blocks at the end of the operation

Will submit separately if it does not get picked up here.
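The difference between the two wordings can be shown with a small sketch. The `tip_matches_expectation` helper below is hypothetical and is not the test framework's implementation; it only encodes the proposed wording, with block hashes represented as plain strings.

```python
def tip_matches_expectation(tip_hash: str, blocks: list[str], success: bool) -> bool:
    """Hypothetical check mirroring the proposed docstring wording.

    success=True:  the tip must be the last block in `blocks`.
    success=False: the tip must not be the last block in `blocks`
                   (it may still have moved to a competing block).
    """
    is_last = tip_hash == blocks[-1]
    return is_last if success else not is_last


sent = ["B4"]
# The tip moved to a competing block "C" instead of the block that was sent.
print(tip_matches_expectation("C", sent, success=False))
# The tip is the block that was sent.
print(tip_matches_expectation("B4", sent, success=True))
```

Under the old wording ("the node's tip doesn't advance"), the first case would read as a failure even though the tip only moved to a competing block.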

test/functional/feature_chain_tiebreaks.py

sr-gi · 2025-06-20T15:09:14Z

Thanks for reviewing @TheCharlatan and @furszy. I rebased the code and addressed the outstanding comments.

maflcko · 2025-07-03T07:31:24Z

test/functional/feature_chain_tiebreaks.py

+            blocks[-1].solve()
+
+        # Send blocks and test the last one is not connected
+        self.log.info('Send A1 and A2. Make sure than only the former connects')


Possible typos and grammar issues:

In feature_chain_tiebreaks.py: “Make sure than only the former connects” -> “Make sure that only the former connects” [‘than’ should be ‘that’]
In feature_chain_tiebreaks.py: “# Restart and check enough times to this to eventually fail if the logic is broken” -> “# Restart and check this enough times to eventually fail if the logic is broken” [‘to this’ is misplaced and hinders comprehension]

Fixed in 8d73371

sr-gi · 2025-07-08T16:37:27Z

I think all feedback has been addressed @mzumsande @sipa @furszy @TheCharlatan @maflcko

Please let me know if there's anything else pending or if anything doesn't make sense.

test/functional/feature_chain_tiebreaks.py

sr-gi · 2025-07-21T14:35:58Z

Rebased to switch from the old (removed) CBlock::hash property to the recently introduced hash_int and hash_hex

Before this, if we had two (or more) same-work tip candidates and restarted our node, it could be the case that the block set as tip after bootstrap didn't match the one before stopping. That's because the work and `nSequenceId` of both blocks will be the same (the latter is only kept in memory), so the active chain after restart would have depended on which tip candidate was loaded first. This makes sure that we are consistent over reboots.

Make it easier to follow where the values come from without having to go over the comments, and easier to maintain.

Adds tests to make sure we are consistent on activating the same chain over a node restart if two or more candidates have the same work when the node is shut down.

It's not true that if success=False the tip doesn't advance. It doesn't advance to the provided tip, but it can advance to a competing one.

Github-Pull: bitcoin#29640 Rebased-From: ab145cb

Github-Pull: bitcoin#29640 Rebased-From: 5370bed

Before this, if we had two (or more) same-work tip candidates and restarted our node, it could be the case that the block set as tip after bootstrap didn't match the one before stopping. That's because the work and `nSequenceId` of both blocks will be the same (the latter is only kept in memory), so the active chain after restart would have depended on which tip candidate was loaded first. This makes sure that we are consistent over reboots. Github-Pull: bitcoin#29640 Rebased-From: 8b91883

Make it easier to follow where the values come from without having to go over the comments, and easier to maintain. Github-Pull: bitcoin#29640 Rebased-From: 18524b0

Adds tests to make sure we are consistent on activating the same chain over a node restart if two or more candidates have the same work when the node is shut down. Github-Pull: bitcoin#29640 Rebased-From: 09c95f2

maflcko mentioned this pull request Mar 12, 2024

Choose earliest-activatable as tie breaker between equal-work chains #29284

Closed

DrahtBot mentioned this pull request Mar 13, 2024

scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] #29641

Draft

maflcko reviewed Mar 13, 2024

src/validation.cpp Outdated

DrahtBot mentioned this pull request Mar 13, 2024

Avoid divide-by-zero in header sync logs when NodeClock is behind #29647

Merged

sr-gi changed the title ~~Adds missing test to chain ties (CBlockIndexWorkComparator)~~ Fixes tiebreak when loading blocks from disk and adds missing test to chain ties (CBlockIndexWorkComparator) Mar 14, 2024

sr-gi changed the title ~~Fixes tiebreak when loading blocks from disk and adds missing test to chain ties (CBlockIndexWorkComparator)~~ Fix tiebreak when loading blocks from disk (and add tests for comparing chain ties) Mar 14, 2024

sr-gi force-pushed the 202403-block-tiebreak branch from 43873be to 18820ac Compare March 14, 2024 20:52

DrahtBot added the CI failed label Mar 14, 2024

sr-gi force-pushed the 202403-block-tiebreak branch 3 times, most recently from a705e00 to 77e1c45 Compare March 15, 2024 21:09

DrahtBot removed the CI failed label Mar 15, 2024

sr-gi force-pushed the 202403-block-tiebreak branch 2 times, most recently from 8225bd6 to f95e896 Compare March 22, 2024 15:23

DrahtBot added the Needs rebase label Mar 22, 2024

sr-gi force-pushed the 202403-block-tiebreak branch from f95e896 to e4cf8d0 Compare March 25, 2024 07:44

DrahtBot removed the Needs rebase label Mar 25, 2024

luke-jr reviewed Jun 7, 2024

src/node/blockstorage.cpp Outdated

mzumsande reviewed Jun 8, 2024

src/validation.cpp Outdated

src/validation.cpp Outdated

src/validation.cpp Outdated

sr-gi force-pushed the 202403-block-tiebreak branch from e4cf8d0 to 2cdc369 Compare June 10, 2024 18:15

This was referenced Jun 25, 2024

RFC: Instanced logs #30338

Closed

kernel, logging: Pass Logger instances to kernel objects #30342

Draft

DrahtBot added the CI failed label Jul 23, 2024

sr-gi force-pushed the 202403-block-tiebreak branch from 2cdc369 to c6ca2a1 Compare July 24, 2024 15:08

DrahtBot removed the CI failed label Jul 25, 2024

TheCharlatan reviewed Jun 17, 2025

sr-gi force-pushed the 202403-block-tiebreak branch from 177d07f to 3cb689a Compare June 20, 2025 14:59

maflcko reviewed Jul 3, 2025

sr-gi force-pushed the 202403-block-tiebreak branch from 3cb689a to a4b9381 Compare July 3, 2025 16:49

DrahtBot mentioned this pull request Jul 7, 2025

Cache m_cached_finished_ibd where SetTip is called. #32885

Open

DrahtBot mentioned this pull request Jul 14, 2025

log: [refactor] Use info level for init logs #32967

Merged

DrahtBot added the CI failed label Jul 19, 2025

DrahtBot reviewed Jul 21, 2025

test/functional/feature_chain_tiebreaks.py Outdated

sr-gi force-pushed the 202403-block-tiebreak branch from a4b9381 to fb31512 Compare July 21, 2025 14:34

DrahtBot added Needs rebase and removed CI failed labels Jul 21, 2025

sr-gi and others added 6 commits July 28, 2025 10:11

Updates CBlockIndexWorkComparator outdated comment

ab145cb

test: add functional test for complex reorgs

5370bed

Make nSequenceId init value constants

18524b0

Make it easier to follow where the values come from without having to go over the comments, and easier to maintain.

test: Adds block tiebreak over restarts tests

09c95f2

Adds tests to make sure we are consistent on activating the same chain over a node restart if two or more candidates have the same work when the node is shut down.

test: Fixes send_blocks_and_test docs

0465574

It's not true that if success=False the tip doesn't advance. It doesn't advance to the provided tip, but it can advance to a competing one.

sr-gi force-pushed the 202403-block-tiebreak branch from fb31512 to 0465574 Compare July 28, 2025 14:15

DrahtBot removed the Needs rebase label Jul 28, 2025

luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Jul 31, 2025

Updates CBlockIndexWorkComparator outdated comment

3aeacf5

Github-Pull: bitcoin#29640 Rebased-From: ab145cb

luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Jul 31, 2025

test: add functional test for complex reorgs

dbca0cc

Github-Pull: bitcoin#29640 Rebased-From: 5370bed
