Skip to content

Conversation

jonatack
Copy link
Member

@jonatack jonatack commented Jun 12, 2022

Fix a few design issues, potential footguns and inconsistent behavior in the CBlockIndex and CDiskBlockIndex classes.

  • Ensure phashBlock in CBlockIndex#GetBlockHash() is not nullptr before dereferencing and remove a now-redundant assert preceding a GetBlockHash() caller. This protects against UB here, and in case of failure (which would indicate a consensus bug), the debug log will print bitcoind: chain.h:265: uint256 CBlockIndex::GetBlockHash() const: Assertion 'phashBlock != nullptr' failed. Aborted instead of Segmentation fault.
  • Remove the unused CDiskBlockIndex#ToString() class member, and mark the inherited CBlockIndex#ToString() public interface member as deleted to disallow calling it in the derived CDiskBlockIndex class.
  • Rename the CDiskBlockIndex GetBlockHash() class member to ConstructBlockHash(), which also makes sense as they perform different operations to return a blockhash, and mark the inherited CBlockIndex#GetBlockHash() public interface member as deleted to disallow calling it in the derived CDiskBlockIndex class.
  • Move CBlockIndex#ToString() from header to implementation, which also allows dropping tinyformat.h from the header file.

Rationale and discussion regarding the CDiskBlockIndex changes:

Here is a failing test on master that demonstrates the inconsistent behavior of the current design: calling the same inherited public interface functions on the same CDiskBlockIndex object should yield identical behavior, but does not.

diff --git a/src/test/validation_chainstatemanager_tests.cpp b/src/test/validation_chainstatemanager_tests.cpp
index 6dc522b421..dac3840f32 100644
--- a/src/test/validation_chainstatemanager_tests.cpp
+++ b/src/test/validation_chainstatemanager_tests.cpp
@@ -240,6 +240,15 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_activate_snapshot, TestChain100Setup)

     const CBlockIndex* tip = chainman.ActiveTip();

     BOOST_CHECK_EQUAL(tip->nChainTx, au_data.nChainTx);

+    // CDiskBlockIndex "is a" CBlockIndex, as it publicly inherits from it.
+    // Test that calling the same inherited interface functions on the same
+    // object yields identical behavior.
+    CDiskBlockIndex index{tip};
+    CBlockIndex *pB = &index;
+    CDiskBlockIndex *pD = &index;
+    BOOST_CHECK_EQUAL(pB->GetBlockHash(), pD->GetBlockHash());
+    BOOST_CHECK_EQUAL(pB->ToString(), pD->ToString());

(build and run: $ ./src/test/test_bitcoin -t validation_chainstatemanager_tests)

The GetBlockHash() test assertion only passes on master because the different methods invoked by the current design happen to return the same result. If one of the two is changed, it fails like the ToString() assertion does.

Redefining inherited non-virtual functions is well-documented as incorrect design to avoid inconsistent behavior (see Scott Meyers, Effective C++, Item 36). Class usage is confusing when the behavior depends on the pointer definition instead of the object definition (static binding happening where dynamic binding was expected). This can lead to unsuspected or hard-to-track bugs.

Outside of critical hot spots, correctness usually comes before optimisation, but the current design dates back to main.cpp and it may possibly have been chosen to avoid the overhead of dynamic dispatch. This solution does the same: the class sizes are unchanged and no vptr or vtbl is added.

There are better designs for doing this that use composition instead of inheritance, or that separate the public interface from the private implementations. One example of the latter would be a non-virtual public interface that calls private virtual implementation methods, i.e. the Template pattern via the Non-Virtual Interface (NVI) idiom.

@jonatack jonatack changed the title refactor: CDiskBlockIndex ToString() and GetBlockHash() for consistent behavior and safety refactor: CDiskBlockIndex ToString() and GetBlockHash() Jun 12, 2022
@jonatack jonatack force-pushed the 2022-06-CDiskBlockIndex-class-design branch from 412dfe5 to 905a4d1 Compare June 12, 2022 14:38
@jonatack jonatack changed the title refactor: CDiskBlockIndex ToString() and GetBlockHash() CDiskBlockIndex: remove ToString and rename GetBlockHash for consistent behavior Jun 13, 2022
@jonatack jonatack changed the title CDiskBlockIndex: remove ToString and rename GetBlockHash for consistent behavior CDiskBlockIndex: remove ToString() and rename GetBlockHash() for consistent behavior Jun 13, 2022
@DrahtBot
Copy link
Contributor

DrahtBot commented Jun 15, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #25311 (remove CBlockIndex copy construction by jamesob)
  • #25023 (Remove unused SetTip(nullptr) code by MarcoFalke)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@jonatack jonatack changed the title CDiskBlockIndex: remove ToString() and rename GetBlockHash() for consistent behavior CDiskBlockIndex: remove unused ToString(), rename GetBlockHash() for consistent behavior Jun 23, 2022
@jonatack
Copy link
Member Author

jonatack commented Jun 23, 2022

The same approach (and a further design refactoring) were also suggested in #25311 (comment).

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 905a4d1

Removing unused code is always good IMO. If needed it can be resurrected from git history or re-implemented.

Some non-blocker questions:

Why is overriding non-virtual functions incorrect design?

I think inheritance implies that the derived class wants to have all the properties of the base class. If it wants just some of them, then at some point maybe composition would be more appropriate than inheritance.

The test from the commit message is expected to fail. If you access a Derived object through a pointer to Base, and both have the same method, then the Base's method is to be called.

Edit: now I see that the alternative approach in #25311 (comment) uses composition instead of inheritance. If indeed the users of CDiskBlockIndex do not use all the stuff provided by CBlockIndex, then that is a better approach.

@jonatack jonatack force-pushed the 2022-06-CDiskBlockIndex-class-design branch from 905a4d1 to 7866c85 Compare July 15, 2022 06:22
@jonatack
Copy link
Member Author

jonatack commented Jul 15, 2022

Updated per git range-diff 02ede4f 905a4d1 7866c85 to take review feedback (thanks!)

Some non-blocker questions:

Why is overriding non-virtual functions incorrect design?

To avoid inconsistent behavior. Public inheritance establishes an invariant over specialization for the derived class. Class usage is confusing when the behavior depends on the pointer definition instead of the object definition (static binding happening where dynamic binding was expected). This can lead to unsuspected or hard-to-track bugs. See Scott Meyers, Effective C++, Item 36: Inherited non-virtual functions should never be redefined, rephrased in urls like https://programmersought.com/article/50007468749/ and https://www.programminghunter.com/article/96792264397/. I updated the PR description with part of this paragraph.

Edit: now I see that the alternative approach in #25311 (comment) uses composition instead of inheritance. If indeed the users of CDiskBlockIndex do not use all the stuff provided by CBlockIndex, then that is a better approach.

I agree it could potentially be a better design as a follow-up / next step (it is a somewhat more invasive refactoring).

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 7866c85

Thanks for the explanation and links!

Now I have justification to clarify the naming of CNetAddr and CService methods which has been confusing me for a long time: #25619

@jonatack jonatack changed the title CDiskBlockIndex: remove unused ToString(), rename GetBlockHash() for consistent behavior CBlockIndex and CDiskBlockIndex improvements for safety and consistent behavior Jul 21, 2022
@jonatack jonatack changed the title CBlockIndex and CDiskBlockIndex improvements for safety and consistent behavior CBlockIndex/CDiskBlockIndex improvements for safety, consistent behavior Jul 21, 2022
@jonatack jonatack force-pushed the 2022-06-CDiskBlockIndex-class-design branch from 2ececec to 0e77f1c Compare July 21, 2022 18:34
@jonatack
Copy link
Member Author

Updated per review feedback by @vasild and @MarcoFalke (thanks!)

@jonatack jonatack force-pushed the 2022-06-CDiskBlockIndex-class-design branch from 0e77f1c to 306cd9d Compare July 21, 2022 19:20
jonatack added 4 commits July 22, 2022 12:42
and remove a now-redundant assert preceding a GetBlockHash() caller.

This protects against UB here, and in case of failure (which would
indicate a consensus bug), the debug log will print

bitcoind: chain.h:265: uint256 CBlockIndex::GetBlockHash() const: Assertion `phashBlock != nullptr' failed.
Aborted

instead of

Segmentation fault
and mark its inherited CBlockIndex#ToString public interface member
as deleted, to disallow calling it in the derived CDiskBlockIndex class.
and mark the inherited CBlockIndex#GetBlockHash public interface member
as deleted, to disallow calling it in the derived CDiskBlockIndex class.

Here is a failing test on master demonstrating the inconsistent behavior of the
current design: calling the same inherited public interface functions on the
same CDiskBlockIndex object should yield identical behavior.

```diff
diff --git a/src/test/validation_chainstatemanager_tests.cpp b/src/test/validation_chainstatemanager_tests.cpp
index 6dc522b..dac3840 100644
--- a/src/test/validation_chainstatemanager_tests.cpp
+++ b/src/test/validation_chainstatemanager_tests.cpp
@@ -240,6 +240,15 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_activate_snapshot, TestChain100Setup)

     const CBlockIndex* tip = chainman.ActiveTip();

     BOOST_CHECK_EQUAL(tip->nChainTx, au_data.nChainTx);

+    // CDiskBlockIndex "is a" CBlockIndex, as it publicly inherits from it.
+    // Test that calling the same inherited interface functions on the same
+    // object yields identical behavior.
+    CDiskBlockIndex index{tip};
+    CBlockIndex *pB = &index;
+    CDiskBlockIndex *pD = &index;
+    BOOST_CHECK_EQUAL(pB->GetBlockHash(), pD->GetBlockHash());
+    BOOST_CHECK_EQUAL(pB->ToString(), pD->ToString());
+
```

The GetBlockHash() test assertion only passes on master because the different
methods invoked by the current design happen to return the same result.  If one
of the two is changed, it fails like the ToString() assertion does.

Redefining inherited non-virtual functions is well-documented as incorrect
design to avoid inconsistent behavior (see Scott Meyers, "Effective C++", Item
36).  Class usage is confusing when the behavior depends on the pointer
definition instead of the object definition (static binding happening where
dynamic binding was expected).  This can lead to unsuspected or hard-to-track
bugs.

Outside of critical hot spots, correctness usually comes before optimisation,
but the current design dates back to main.cpp and it may possibly have been
chosen to avoid the overhead of dynamic dispatch.  This solution does the same:
the class sizes are unchanged and no vptr or vtbl is added.

There are better designs for doing this that use composition instead of
inheritance or that separate the public interface from the private
implementations.  One example of the latter would be a non-virtual public
interface that calls private virtual implementation methods, i.e. the Template
pattern via the Non-Virtual Interface (NVI) idiom.
which allows dropping tinyformat.h from the header file.
@jonatack jonatack force-pushed the 2022-06-CDiskBlockIndex-class-design branch from 306cd9d to 3a61fc5 Compare July 22, 2022 11:04
@jonatack
Copy link
Member Author

Updated per review feedback by @MarcoFalke (thanks!)

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 3a61fc5

@maflcko maflcko merged commit 5057adf into bitcoin:master Jul 25, 2022
@jonatack jonatack deleted the 2022-06-CDiskBlockIndex-class-design branch July 25, 2022 15:01
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jul 25, 2022
…ety, consistent behavior

3a61fc5 refactor: move CBlockIndex#ToString() from header to implementation (Jon Atack)
57865eb CDiskBlockIndex: rename GetBlockHash() to ConstructBlockHash() (Jon Atack)
99e8ec8 CDiskBlockIndex: remove unused ToString() class member (Jon Atack)
14aeece CBlockIndex: ensure phashBlock is not nullptr before dereferencing (Jon Atack)

Pull request description:

  Fix a few design issues, potential footguns and inconsistent behavior in the CBlockIndex and CDiskBlockIndex classes.

  - Ensure phashBlock in `CBlockIndex#GetBlockHash()` is not nullptr before dereferencing and remove a now-redundant assert preceding a GetBlockHash() caller.  This protects against UB here, and in case of failure (which would indicate a consensus bug), the debug log will print `bitcoind: chain.h:265: uint256 CBlockIndex::GetBlockHash() const: Assertion 'phashBlock != nullptr' failed. Aborted` instead of `Segmentation fault`.
  - Remove the unused `CDiskBlockIndex#ToString()` class member, and mark the inherited `CBlockIndex#ToString()` public interface member as deleted to disallow calling it in the derived CDiskBlockIndex class.
  - Rename the `CDiskBlockIndex GetBlockHash()` class member to `ConstructBlockHash()`, which also makes sense as they perform different operations to return a blockhash, and mark the inherited `CBlockIndex#GetBlockHash()` public interface member as deleted to disallow calling it in the derived CDiskBlockIndex class.
  - Move `CBlockIndex#ToString()` from header to implementation, which also allows dropping `tinyformat.h` from the header file.

  Rationale and discussion regarding the CDiskBlockIndex changes:

  Here is a failing test on master that demonstrates the inconsistent behavior of the current design: calling the same inherited public interface functions on the same CDiskBlockIndex object should yield identical behavior, but does not.

  ```diff
  diff --git a/src/test/validation_chainstatemanager_tests.cpp b/src/test/validation_chainstatemanager_tests.cpp
  index 6dc522b..dac3840 100644
  --- a/src/test/validation_chainstatemanager_tests.cpp
  +++ b/src/test/validation_chainstatemanager_tests.cpp
  @@ -240,6 +240,15 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_activate_snapshot, TestChain100Setup)

       const CBlockIndex* tip = chainman.ActiveTip();

       BOOST_CHECK_EQUAL(tip->nChainTx, au_data.nChainTx);

  +    // CDiskBlockIndex "is a" CBlockIndex, as it publicly inherits from it.
  +    // Test that calling the same inherited interface functions on the same
  +    // object yields identical behavior.
  +    CDiskBlockIndex index{tip};
  +    CBlockIndex *pB = &index;
  +    CDiskBlockIndex *pD = &index;
  +    BOOST_CHECK_EQUAL(pB->GetBlockHash(), pD->GetBlockHash());
  +    BOOST_CHECK_EQUAL(pB->ToString(), pD->ToString());
  ```

  (build and run: `$ ./src/test/test_bitcoin -t validation_chainstatemanager_tests`)

  The GetBlockHash() test assertion only passes on master because the different methods invoked by the current design happen to return the same result.  If one of the two is changed, it fails like the ToString() assertion does.

  Redefining inherited non-virtual functions is well-documented as incorrect design to avoid inconsistent behavior (see Scott Meyers, Effective C++, Item 36). Class usage is confusing when the behavior depends on the pointer definition instead of the object definition (static binding happening where dynamic binding was expected). This can lead to unsuspected or hard-to-track bugs.

  Outside of critical hot spots, correctness usually comes before optimisation, but the current design dates back to main.cpp and it may possibly have been chosen to avoid the overhead of dynamic dispatch.  This solution does the same: the class sizes are unchanged and no vptr or vtbl is added.

  There are better designs for doing this that use composition instead of inheritance, or that separate the public interface from the private implementations.  One example of the latter would be a non-virtual public interface that calls private virtual implementation methods, i.e. the Template pattern via the Non-Virtual Interface (NVI) idiom.

ACKs for top commit:
  vasild:
    ACK 3a61fc5

Tree-SHA512: 9ff358ab0a6d010b8f053ad8303c6d4d061e62d9c3755a56b9c9f5eab855d02f02bee42acc77dfa0cbf4bb5cb775daa72d675e1560610a29bd285c46faa85ab7
achow101 added a commit to bitcoin-core/gui that referenced this pull request Feb 17, 2023
…ng() in CService and use better naming

c9d548c net: remove CService::ToStringPort() (Vasil Dimov)
fd4f0f4 gui: simplify OptionsDialog::updateDefaultProxyNets() (Vasil Dimov)
96c791d net: remove CService::ToString() use ToStringAddrPort() instead (Vasil Dimov)
944a9de net: remove CNetAddr::ToString() and use ToStringAddr() instead (Vasil Dimov)
043b9de scripted-diff: rename ToStringIP[Port]() to ToStringAddr[Port]() (Vasil Dimov)

Pull request description:

  Before this PR we had the somewhat confusing combination of methods:

  `CNetAddr::ToStringIP()`
  `CNetAddr::ToString()` (duplicate of the above)
  `CService::ToStringIPPort()`
  `CService::ToString()` (duplicate of the above, overrides a non-virtual method from `CNetAddr`)
  `CService::ToStringPort()`

  Avoid [overriding non-virtual methods](bitcoin/bitcoin#25349).

  "IP" stands for "Internet Protocol" and while sometimes "IP addresses" are called just "IPs", it is incorrect to call Tor or I2P addresses "IPs". Thus use "Addr" instead of "IP".

  Change the above to:

  `CNetAddr::ToStringAddr()`
  `CService::ToStringAddrPort()`

  The changes touch a lot of files, but are mostly mechanical.

ACKs for top commit:
  sipa:
    utACK c9d548c
  achow101:
    ACK c9d548c
  jonatack:
    re-ACK c9d548c only change since my previous reviews is rebase, but as a sanity check rebased to current master and at each commit quickly re-reviewed and re-verified clean build and green unit tests
  LarryRuane:
    ACK c9d548c

Tree-SHA512: 633fb044bdecf9f551b5e3314c385bf10e2b78e8027dc51ec324b66b018da35e5b01f3fbe6295bbc455ea1bcd1a3629de1918d28de510693afaf6a52693f2157
@bitcoin bitcoin locked and limited conversation to collaborators Jul 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants