-
Notifications
You must be signed in to change notification settings - Fork 37.7k
CBlockIndex/CDiskBlockIndex improvements for safety, consistent behavior #25349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CBlockIndex/CDiskBlockIndex improvements for safety, consistent behavior #25349
Conversation
412dfe5
to
905a4d1
Compare
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
The same approach (and a further design refactoring) were also suggested in #25311 (comment). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 905a4d1
Removing unused code is always good IMO. If needed it can be resurrected from git history or re-implemented.
Some non-blocker questions:
Why is overriding non-virtual functions incorrect design?
I think inheritance implies that the derived class wants to have all the properties of the base class. If it wants just some of them, then at some point maybe composition would be more appropriate than inheritance.
The test from the commit message is expected to fail. If you access a Derived object through a pointer to Base, and both have the same method, then the Base's method is to be called.
Edit: now I see that the alternative approach in #25311 (comment) uses composition instead of inheritance. If indeed the users of CDiskBlockIndex
do not use all the stuff provided by CBlockIndex
, then that is a better approach.
905a4d1
to
7866c85
Compare
Updated per
To avoid inconsistent behavior. Public inheritance establishes an invariant over specialization for the derived class. Class usage is confusing when the behavior depends on the pointer definition instead of the object definition (static binding happening where dynamic binding was expected). This can lead to unsuspected or hard-to-track bugs. See Scott Meyers, Effective C++, Item 36: Inherited non-virtual functions should never be redefined, rephrased in urls like https://programmersought.com/article/50007468749/ and https://www.programminghunter.com/article/96792264397/. I updated the PR description with part of this paragraph.
I agree it could potentially be a better design as a follow-up / next step (it is a somewhat more invasive refactoring). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2ececec
to
0e77f1c
Compare
Updated per review feedback by @vasild and @MarcoFalke (thanks!) |
0e77f1c
to
306cd9d
Compare
and remove a now-redundant assert preceding a GetBlockHash() caller. This protects against UB here, and in case of failure (which would indicate a consensus bug), the debug log will print bitcoind: chain.h:265: uint256 CBlockIndex::GetBlockHash() const: Assertion `phashBlock != nullptr' failed. Aborted instead of Segmentation fault
and mark its inherited CBlockIndex#ToString public interface member as deleted, to disallow calling it in the derived CDiskBlockIndex class.
and mark the inherited CBlockIndex#GetBlockHash public interface member as deleted, to disallow calling it in the derived CDiskBlockIndex class. Here is a failing test on master demonstrating the inconsistent behavior of the current design: calling the same inherited public interface functions on the same CDiskBlockIndex object should yield identical behavior. ```diff diff --git a/src/test/validation_chainstatemanager_tests.cpp b/src/test/validation_chainstatemanager_tests.cpp index 6dc522b..dac3840 100644 --- a/src/test/validation_chainstatemanager_tests.cpp +++ b/src/test/validation_chainstatemanager_tests.cpp @@ -240,6 +240,15 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_activate_snapshot, TestChain100Setup) const CBlockIndex* tip = chainman.ActiveTip(); BOOST_CHECK_EQUAL(tip->nChainTx, au_data.nChainTx); + // CDiskBlockIndex "is a" CBlockIndex, as it publicly inherits from it. + // Test that calling the same inherited interface functions on the same + // object yields identical behavior. + CDiskBlockIndex index{tip}; + CBlockIndex *pB = &index; + CDiskBlockIndex *pD = &index; + BOOST_CHECK_EQUAL(pB->GetBlockHash(), pD->GetBlockHash()); + BOOST_CHECK_EQUAL(pB->ToString(), pD->ToString()); + ``` The GetBlockHash() test assertion only passes on master because the different methods invoked by the current design happen to return the same result. If one of the two is changed, it fails like the ToString() assertion does. Redefining inherited non-virtual functions is well-documented as incorrect design to avoid inconsistent behavior (see Scott Meyers, "Effective C++", Item 36). Class usage is confusing when the behavior depends on the pointer definition instead of the object definition (static binding happening where dynamic binding was expected). This can lead to unsuspected or hard-to-track bugs. Outside of critical hot spots, correctness usually comes before optimisation, but the current design dates back to main.cpp and it may possibly have been chosen to avoid the overhead of dynamic dispatch. This solution does the same: the class sizes are unchanged and no vptr or vtbl is added. There are better designs for doing this that use composition instead of inheritance or that separate the public interface from the private implementations. One example of the latter would be a non-virtual public interface that calls private virtual implementation methods, i.e. the Template pattern via the Non-Virtual Interface (NVI) idiom.
which allows dropping tinyformat.h from the header file.
306cd9d
to
3a61fc5
Compare
Updated per review feedback by @MarcoFalke (thanks!) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 3a61fc5
…ety, consistent behavior 3a61fc5 refactor: move CBlockIndex#ToString() from header to implementation (Jon Atack) 57865eb CDiskBlockIndex: rename GetBlockHash() to ConstructBlockHash() (Jon Atack) 99e8ec8 CDiskBlockIndex: remove unused ToString() class member (Jon Atack) 14aeece CBlockIndex: ensure phashBlock is not nullptr before dereferencing (Jon Atack) Pull request description: Fix a few design issues, potential footguns and inconsistent behavior in the CBlockIndex and CDiskBlockIndex classes. - Ensure phashBlock in `CBlockIndex#GetBlockHash()` is not nullptr before dereferencing and remove a now-redundant assert preceding a GetBlockHash() caller. This protects against UB here, and in case of failure (which would indicate a consensus bug), the debug log will print `bitcoind: chain.h:265: uint256 CBlockIndex::GetBlockHash() const: Assertion 'phashBlock != nullptr' failed. Aborted` instead of `Segmentation fault`. - Remove the unused `CDiskBlockIndex#ToString()` class member, and mark the inherited `CBlockIndex#ToString()` public interface member as deleted to disallow calling it in the derived CDiskBlockIndex class. - Rename the `CDiskBlockIndex GetBlockHash()` class member to `ConstructBlockHash()`, which also makes sense as they perform different operations to return a blockhash, and mark the inherited `CBlockIndex#GetBlockHash()` public interface member as deleted to disallow calling it in the derived CDiskBlockIndex class. - Move `CBlockIndex#ToString()` from header to implementation, which also allows dropping `tinyformat.h` from the header file. Rationale and discussion regarding the CDiskBlockIndex changes: Here is a failing test on master that demonstrates the inconsistent behavior of the current design: calling the same inherited public interface functions on the same CDiskBlockIndex object should yield identical behavior, but does not. ```diff diff --git a/src/test/validation_chainstatemanager_tests.cpp b/src/test/validation_chainstatemanager_tests.cpp index 6dc522b..dac3840 100644 --- a/src/test/validation_chainstatemanager_tests.cpp +++ b/src/test/validation_chainstatemanager_tests.cpp @@ -240,6 +240,15 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_activate_snapshot, TestChain100Setup) const CBlockIndex* tip = chainman.ActiveTip(); BOOST_CHECK_EQUAL(tip->nChainTx, au_data.nChainTx); + // CDiskBlockIndex "is a" CBlockIndex, as it publicly inherits from it. + // Test that calling the same inherited interface functions on the same + // object yields identical behavior. + CDiskBlockIndex index{tip}; + CBlockIndex *pB = &index; + CDiskBlockIndex *pD = &index; + BOOST_CHECK_EQUAL(pB->GetBlockHash(), pD->GetBlockHash()); + BOOST_CHECK_EQUAL(pB->ToString(), pD->ToString()); ``` (build and run: `$ ./src/test/test_bitcoin -t validation_chainstatemanager_tests`) The GetBlockHash() test assertion only passes on master because the different methods invoked by the current design happen to return the same result. If one of the two is changed, it fails like the ToString() assertion does. Redefining inherited non-virtual functions is well-documented as incorrect design to avoid inconsistent behavior (see Scott Meyers, Effective C++, Item 36). Class usage is confusing when the behavior depends on the pointer definition instead of the object definition (static binding happening where dynamic binding was expected). This can lead to unsuspected or hard-to-track bugs. Outside of critical hot spots, correctness usually comes before optimisation, but the current design dates back to main.cpp and it may possibly have been chosen to avoid the overhead of dynamic dispatch. This solution does the same: the class sizes are unchanged and no vptr or vtbl is added. There are better designs for doing this that use composition instead of inheritance, or that separate the public interface from the private implementations. One example of the latter would be a non-virtual public interface that calls private virtual implementation methods, i.e. the Template pattern via the Non-Virtual Interface (NVI) idiom. ACKs for top commit: vasild: ACK 3a61fc5 Tree-SHA512: 9ff358ab0a6d010b8f053ad8303c6d4d061e62d9c3755a56b9c9f5eab855d02f02bee42acc77dfa0cbf4bb5cb775daa72d675e1560610a29bd285c46faa85ab7
…ng() in CService and use better naming c9d548c net: remove CService::ToStringPort() (Vasil Dimov) fd4f0f4 gui: simplify OptionsDialog::updateDefaultProxyNets() (Vasil Dimov) 96c791d net: remove CService::ToString() use ToStringAddrPort() instead (Vasil Dimov) 944a9de net: remove CNetAddr::ToString() and use ToStringAddr() instead (Vasil Dimov) 043b9de scripted-diff: rename ToStringIP[Port]() to ToStringAddr[Port]() (Vasil Dimov) Pull request description: Before this PR we had the somewhat confusing combination of methods: `CNetAddr::ToStringIP()` `CNetAddr::ToString()` (duplicate of the above) `CService::ToStringIPPort()` `CService::ToString()` (duplicate of the above, overrides a non-virtual method from `CNetAddr`) `CService::ToStringPort()` Avoid [overriding non-virtual methods](bitcoin/bitcoin#25349). "IP" stands for "Internet Protocol" and while sometimes "IP addresses" are called just "IPs", it is incorrect to call Tor or I2P addresses "IPs". Thus use "Addr" instead of "IP". Change the above to: `CNetAddr::ToStringAddr()` `CService::ToStringAddrPort()` The changes touch a lot of files, but are mostly mechanical. ACKs for top commit: sipa: utACK c9d548c achow101: ACK c9d548c jonatack: re-ACK c9d548c only change since my previous reviews is rebase, but as a sanity check rebased to current master and at each commit quickly re-reviewed and re-verified clean build and green unit tests LarryRuane: ACK c9d548c Tree-SHA512: 633fb044bdecf9f551b5e3314c385bf10e2b78e8027dc51ec324b66b018da35e5b01f3fbe6295bbc455ea1bcd1a3629de1918d28de510693afaf6a52693f2157
Fix a few design issues, potential footguns and inconsistent behavior in the CBlockIndex and CDiskBlockIndex classes.
CBlockIndex#GetBlockHash()
is not nullptr before dereferencing and remove a now-redundant assert preceding a GetBlockHash() caller. This protects against UB here, and in case of failure (which would indicate a consensus bug), the debug log will printbitcoind: chain.h:265: uint256 CBlockIndex::GetBlockHash() const: Assertion 'phashBlock != nullptr' failed. Aborted
instead ofSegmentation fault
.CDiskBlockIndex#ToString()
class member, and mark the inheritedCBlockIndex#ToString()
public interface member as deleted to disallow calling it in the derived CDiskBlockIndex class.CDiskBlockIndex GetBlockHash()
class member toConstructBlockHash()
, which also makes sense as they perform different operations to return a blockhash, and mark the inheritedCBlockIndex#GetBlockHash()
public interface member as deleted to disallow calling it in the derived CDiskBlockIndex class.CBlockIndex#ToString()
from header to implementation, which also allows droppingtinyformat.h
from the header file.Rationale and discussion regarding the CDiskBlockIndex changes:
Here is a failing test on master that demonstrates the inconsistent behavior of the current design: calling the same inherited public interface functions on the same CDiskBlockIndex object should yield identical behavior, but does not.
(build and run:
$ ./src/test/test_bitcoin -t validation_chainstatemanager_tests
)The GetBlockHash() test assertion only passes on master because the different methods invoked by the current design happen to return the same result. If one of the two is changed, it fails like the ToString() assertion does.
Redefining inherited non-virtual functions is well-documented as incorrect design to avoid inconsistent behavior (see Scott Meyers, Effective C++, Item 36). Class usage is confusing when the behavior depends on the pointer definition instead of the object definition (static binding happening where dynamic binding was expected). This can lead to unsuspected or hard-to-track bugs.
Outside of critical hot spots, correctness usually comes before optimisation, but the current design dates back to main.cpp and it may possibly have been chosen to avoid the overhead of dynamic dispatch. This solution does the same: the class sizes are unchanged and no vptr or vtbl is added.
There are better designs for doing this that use composition instead of inheritance, or that separate the public interface from the private implementations. One example of the latter would be a non-virtual public interface that calls private virtual implementation methods, i.e. the Template pattern via the Non-Virtual Interface (NVI) idiom.