Skip to content

Conversation

jamesob
Copy link
Contributor

@jamesob jamesob commented Jun 8, 2022

Copy construction of CBlockIndex objects is a footgun because of the
wide use of equality-by-pointer comparison in the code base. There are
also potential lifetime confusions of using copied instances, since
there are recursive pointer members (e.g. pprev).

(See also #24008 (comment))

We can't just delete the copy constructors because they are used for
derived classes (CDiskBlockIndex), so we mark them protected.

@jamesob
Copy link
Contributor Author

jamesob commented Jun 8, 2022

This can be tested trivially by adding a construction-by-copy somewhere and verifying the compiler rejects it:

% make
Making all in src
make[1]: Entering directory '/home/james/src/bitcoin/src'
make[2]: Entering directory '/home/james/src/bitcoin/src'
make[3]: Entering directory '/home/james/src/bitcoin'
make[3]: Leaving directory '/home/james/src/bitcoin'
  CXX      libbitcoin_node_a-validation.o
  CXX      libbitcoin_util_a-clientversion.o
  AR       libbitcoin_util.a
  CXXLD    bitcoin-util
  CXXLD    bitcoin-wallet
  CXXLD    bitcoin-cli
  CXXLD    bitcoin-tx
validation.cpp:4170:29: error: calling a protected constructor of class 'CBlockIndex'
            auto newblock = CBlockIndex(*block);
                            ^
./chain.h:369:5: note: declared protected here
    CBlockIndex(const CBlockIndex&) = default;
    ^
1 error generated.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

@jonatack
Copy link
Member

jonatack commented Jun 8, 2022

Concept ACK

@jamesob jamesob force-pushed the 2022-06-cblockindex-delete-copy branch 2 times, most recently from 9d080cb to 62f6e49 Compare June 8, 2022 18:04
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conditional code review ACK 62f6e49 if the virtual method is dropped

@jamesob jamesob force-pushed the 2022-06-cblockindex-delete-copy branch from 62f6e49 to 0497d76 Compare June 9, 2022 19:53
@jonatack
Copy link
Member

jonatack commented Jun 10, 2022

IIUC correctly, the issue here is that of (or similar to) slicing: a pointer to a derived class implicitly converts to one to its public base class, and with copy operations this becomes a subtle footgun. Looking through various resources, two solutions appear to be:

  • disallow copying of the base class by deleting the copy operations or making them private or protected, and providing a clone() function if needed to allow a user to request a non-slicing copy, or
  • disallow converting a pointer to the derived class to a pointer to a base by making the base class a private or protected base

This PR opts for the first solution, which seems fine. This approach is also suggested at the bottom of https://en.cppreference.com/w/cpp/language/rule_of_three and in the guideline "A polymorphic class should suppress public copy/move".

The last push removing the public virtual destructor in the base class involves some tradeoffs that might be discussed. Not declaring it reduces the size of CBlockIndex and CDiskBlockIndex from 152 to 144 and 184 to 176 bytes, respectively, a saving of 8 bytes due to not needing a hidden vptr class data member pointer (as well as avoiding a vtbl + pointer entry and dynamic dispatch in general). In most cases, those additional costs are considered relatively low, so declaring destructors virtual is a good practice as a safety mechanism for unintended polymorphic destruction (destructing derived class objects through pointers or references of base class types).

Not declaring the virtual destructor is also only possible due to issues with the current design. Inherited non-virtual functions should never be redefined to avoid inconsistent behavior (Scott Meyers, Effective C++, Item 36), yet that is what we're currently doing with GetBlockHash() and ToString(), either from oversight or as a hack to avoid making them virtual. Items 34 and 35 in that book suggest alternatives (Template Pattern/NVI Idiom, Strategy Patterns), but as a start it may be prudent and less footgunny for future reviewers and bug avoidance to consider keeping the explicit virtual public destructor in the base and making those two functions virtual in CBlockIndex and override in the derived CDiskBlockIndex.

If the overhead of dynamic dispatch is unacceptable, maybe there is a better/safer design than the current one, but if we intend for CBlockIndex to be inherited from, consider if its destructor and redefined methods ought to be virtual. If CBlockIndex is not intended to be inherited from, it can be marked as final to prevent that in the first place, without imposing any other use restrictions on the class itself.

I'm not an expert, so have been trying to understand this area better from researching it. A couple related resources:

Scott Meyers, Effective C++, Item 8: Polymorphic base classes should declare virtual destructors. If a class has any virtual functions, it should have a virtual destructor. Classes not designed to be base classes or not designed to be used polymorphically should not declare virtual destructors.

Stroustroup, https://www.stroustrup.com/C++11FAQ.html#default2, I strongly recommend that if you declare one of these five functions, you explicitly declare all. Think of copying, moving, and destruction as closely related operations, rather than individual operations that you can freely mix and match - you can specify arbitrary combinations, but only a few combinations make sense semantically.

Herb Sutter, http://www.gotw.ca/publications/mill18.htm, Guideline 4: A base class destructor should be either public and virtual, or protected and nonvirtual. / Don't derive from concrete classes. Or, as Scott Meyers puts it in Item 33 of More Effective C++, "Make non-leaf classes abstract."

@maflcko
Copy link
Member

maflcko commented Jun 10, 2022

IIUC correctly, the issue here is that of (or similar to) slicing: a pointer to a derived class implicitly converts to one to its public base class, and with copy operations this becomes a subtle footgun

This may or may not be a problem, but was not the primary motivation I mentioned. The motivation is simply that creating an exact copy will (obviously) create a new memory location where the copy lives (aka pointer). This is problematic when the value of the pointer is used to check for equality/unequality of objects.

For example,

bool IsTip(const CBlockIndex& b) { return &b == ::ActiveTip(); }
int main() {
 const CBlockIndex tip{*Assert(::ActiveTip())};
 Assert(IsTip(&tip));
}

will fail.

@ryanofsky
Copy link
Contributor

Appreciate the thorough response jonatack , and agree declaring a virtual destructor could prevent slicing issues if virtual methods were added in the future, and also agree that use of inheritance in CBlockIndex is pretty sketchy, with overridden nonvirtual methods being a sign of that.

I may be biased because I tend to think of classes with virtual methods designed to be overridden as pretty different things from simple classes that just encapsulate some state, but I wouldn't think it is a good tradeoff to make class destructors virtual by default just because there is a chance someone may inherit from the class and try to use it polymorphically in the future. I think whoever is doing this later should be able to spot problems and that in the worst case "class has virtual method but non-virtual destructor" compiler warnings should help them.

It's hard to think about future-bugs that don't yet exist in a complex language like c++, and I think c++ guidelines are good for generating ideas about bugs that may happen. But after you decide to prevent a bug, it should be possible to say what the bug is without citing guidelines as an authority. It is easy to go in circles thinking about guidelines abstractly and I think it is better to keep things concrete.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 0497d76, only change since last review is CBlockIndex stops having a vtable again

@jonatack
Copy link
Member

jonatack commented Jun 11, 2022

The last sentence of the commit message needs to be updated, otherwise tested ACK 0497d76 code review, verified that this change fixes both slicing and the failing pointer equality check of an exact copy with the following diff partly derived from #25311 (comment):

unit test diff

--- a/src/test/validation_chainstatemanager_tests.cpp
+++ b/src/test/validation_chainstatemanager_tests.cpp
@@ -24,6 +24,23 @@ using node::SnapshotMetadata;

+static bool IsTip(const CBlockIndex& b, const ChainstateManager& m)
+{
+    return &b == m.ActiveTip();
+}
+
+static void naive(CBlockIndex* p)
+{
+    CBlockIndex b = *p; // may slice: invokes CBlockIndex:CBlockIndex(const CBlockIndex&), runs on master, raises a build error after this change
+}
+
+static void user()
+{
+    CDiskBlockIndex d;
+    naive(&d);
+    CBlockIndex bb = d; // slices: invokes CBlockIndex:CBlockIndex(const CBlockIndex&) rather than CDiskBlockIndex:CDiskBlockIndex(const CDiskBlockIndex&), runs on master, raises a build error after this change
+}
+
 //! Basic tests for ChainstateManager.
@@ -310,6 +327,11 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_activate_snapshot, TestChain100Setup)
     BOOST_CHECK_EQUAL(
         *chainman.ActiveChainstate().m_from_snapshot_blockhash,
         loaded_snapshot_blockhash);
+
+    // Pointer equality sanity check after exact copy.
+    const CBlockIndex index{*Assert(chainman.ActiveTip())}; // runs on master, raises a build error after this change
+    BOOST_CHECK(IsTip(index, chainman)); // fails on master
+    user(); // to avoid -Werror for unused function
 }

@jonatack
Copy link
Member

@ryanofsky Good points. To follow up on our discussion, I wrote a failing test and implementation of a non-virtual public interface that calls private virtual implementation methods, i.e. the Template pattern via the Non-Virtual Interface (NVI) idiom, that makes the test pass in https://github.com/jonatack/bitcoin/commits/CBlockIndex-CDiskBlockIndex-class-design. Then, based on our discussion here, opted to propose a simpler change in #25349 that doesn't add a vptr or vtable.

Copy link
Contributor

@ajtowns ajtowns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

@@ -25,17 +25,18 @@ FUZZ_TARGET_INIT(pow, initialize_pow)
{
FuzzedDataProvider fuzzed_data_provider(buffer.data(), buffer.size());
const Consensus::Params& consensus_params = Params().GetConsensus();
std::vector<CBlockIndex> blocks;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is already buggy if the vector were to ever be moved while being resized? Could have been a std::array<CBlockIndex, 10000> perhaps.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jun 15, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK MarcoFalke, ajtowns
Stale ACK ryanofsky, jonatack

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

@jonatack
Copy link
Member

@jamesob update with @ajtowns review feedback? The rebase should be trivial.

@jamesob
Copy link
Contributor Author

jamesob commented Jul 25, 2022

Thanks for the ping @jonatack, will get back to this in the next day or so.

@ajtowns
Copy link
Contributor

ajtowns commented Sep 8, 2022

Delete move constructors and declare the destructor to satisfy the
"rule of 5."

Code doesn't seem to actually declare the destructor. Maybe:

-    CBlockIndex()
-    {
-    }
+    CBlockIndex() = default;
+    ~CBlockIndex() = default;

@jamesob jamesob force-pushed the 2022-06-cblockindex-delete-copy branch from 5b84eb8 to 89fcfc8 Compare September 9, 2022 18:20
@jamesob
Copy link
Contributor Author

jamesob commented Sep 9, 2022

Pushed feedback from @ajtowns, @MarcoFalke. Thanks guys.

@achow101
Copy link
Member

cc @jonatack @LarryRuane @ryanofsky

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

Left two nits. Happy to re-ACK, or leave them ignored.

review ACK 89fcfc8 🍲

Show signature

Signature:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

review ACK 89fcfc872eebd8a177561189575f67da27752cb1 🍲
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUiSNQwAicS4GLjofIlttbHg6E36/ZFqeRcuM7JGwoRO/dX3cpVFtBCB+3G5rObS
RN/AEZ5eWyELNicp2/q+1kJf0LyBkqPXLMVDWqT4gf54fVCdX26orqGUqmHTl8HY
Cxw8wi+x48YTb/+iLPpbpFqLNoE82oypyKNDBCR8O8AOt1fKeIoNI0ZbvgFWTGjy
mYYN+Noi2MPWW9aa6Q7Y2LL/bdpMfLttj7E6Gt+XTTxpeX5IrogR7tGpUbBIbuBv
a+Px6BXakK+nOweLGnOH0lYcEJOqFIqWyqvP6PUevGSgtYlZm6Z+X7ZFp/btWj4q
VJIB7Dn/g8XhBm3Up9YGEqtfsbOZlihhovoBp24nXOk8BGb4WsaKJzCPqku7o9Px
Ee7w44W+cfH8OKs/3g0+2z0wwSqzA5JLxxpezhabiUclfKGaxJhbczWHJxWTktRr
91p8KV+MdDaers1ZI5QOXs+eKrsSP4CjdiQnMMGgrw/ua1IbTmjntKSlwY7/Vq4a
kjYKqGut
=d+cV
-----END PGP SIGNATURE-----

@maflcko maflcko changed the title remove CBlockIndex copy construction refactor: remove CBlockIndex copy construction Dec 15, 2022
@jamesob jamesob force-pushed the 2022-06-cblockindex-delete-copy branch from 89fcfc8 to 36c201f Compare December 15, 2022 19:51
Copy construction of CBlockIndex objects is a footgun because of the
wide use of equality-by-pointer comparison in the code base. There are
also potential lifetime confusions of using copied instances, since
there are recursive pointer references (e.g. pprev).

We can't just delete the copy constructors because they are used for
derived classes (CDiskBlockIndex), so we mark them protected.

Delete move constructors and declare the destructor to satisfy the
"rule of 5."
Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 36c201f 🏻

Show signature

Signature:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

re-ACK 36c201feb74bbb87d22bd956373dbbb9c47fb7e7  🏻
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUgR+wwAl9CSAv7TffHzPRCAHtkQzy2bADeuYb13/tGZmht477Ny9r3DNVPI2TRU
r7rwlAH836ssp4Xqxq/y2EIrerKJ7nY1GS9rdELrdJjUMRpa3rVDgG0fLPcp5cKU
or0XHvezB1a4PO8CAeiSuM9tavWQki+2CntslLEofVXcqcp/HZbl9tbENdtIZu/z
rxHgaZfFSJ+B1a4JmXzybj56SSX1AFrLLbyBX61rNSAqVlHmJDmzRGjJrcK1P13m
l0c8H8MeeHNFxw+cV7gB6bca9jDw56oHX+bDswCYMUARUevCZME50x0ny9d6EgWN
LSJ1QQGvk+5GgXlbK8vEbsRnYzGKi/M8GOlSTJV0c1jk2eloWXmkclkPKdf+V50a
V7T12GjTgnuQtDJcX0hpJ9VSP4EQ+9qTmzdjFLsavx2Ad/kRbrZh/Dck3iXOPcJZ
WpJ3QkwXPW62hfSXW6ZKZwBDShLr8eCfC+JAyyidmuSbVjTLL0Xu0Z8K9KPtv8eD
wXNA+JYi
=9oo/
-----END PGP SIGNATURE-----

@ajtowns
Copy link
Contributor

ajtowns commented Dec 18, 2022

ACK 36c201f - code review only

@fanquake fanquake merged commit 65f5cfd into bitcoin:master Dec 19, 2022
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Dec 19, 2022
@bitcoin bitcoin locked and limited conversation to collaborators Dec 19, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

9 participants