Skip to content

Conversation

vasild
Copy link
Contributor

@vasild vasild commented Mar 18, 2020

In PeerLogicValidation::PeerLogicValidation() we would schedule a lambda
function to execute later, capturing the local variable
consensusParams by reference.

Presumably this was considered safe because consensusParams is a
reference itself to a global variable which is not supposed to change,
but it can in tests.

Fixes #18372

In PeerLogicValidation::PeerLogicValidation() we would schedule a lambda
function to execute later, capturing the local variable
`consensusParams` by reference.

Presumably this was considered safe because `consensusParams` is a
reference itself to a global variable which is not supposed to change,
but it can in tests.

Fixes bitcoin#18372
@fanquake fanquake added the P2P label Mar 18, 2020
@fanquake fanquake requested a review from maflcko March 18, 2020 12:24
@vasild
Copy link
Contributor Author

vasild commented Mar 18, 2020

Configure and run as:
./configure --with-sanitizers=thread
./src/test/test_bitcoin --run_test="txvalidationcache_tests/checkinputs_test"

I confirm the following:

  • Latest master @ ce87d56 shows the use-after-free bug
  • This PR fixes the bug
  • Reverting fadafb8 on top of the latest master git show fadafb83c |git apply -R -3 (+ trivial conflict resolution) fixes the bug
  • The failure seems to be deterministic

So, this was introduced in #18289. I have no clue why travis was green for it.

-    scheduler.scheduleEvery(std::bind(&PeerLogicValidation::CheckForStaleTipAndEvictPeers, this, consensusParams), EXTRA_PEER_CHECK_INTERVAL * 1000);
+    scheduler.scheduleEvery([&] { this->CheckForStaleTipAndEvictPeers(consensusParams); }, std::chrono::seconds{EXTRA_PEER_CHECK_INTERVAL});

Indeed std::bind() copies the arguments whereas the introduced lambda function passes them by reference and will brick if the argument is destroyed before the function is called.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 7d8e1de

In PeerLogicValidation::PeerLogicValidation() we would schedule a lambda
function to execute later, capturing the local variable
consensusParams by reference.

Presumably this was considered safe because consensusParams is a
reference itself to a global variable which is not supposed to change,
but it can in tests.

Correct, it is a reference to a global, which I assumed to not change after initialization. This does not hold for unit test, for example the txvalidation one:

TestChain100Setup::TestChain100Setup()
{
// CreateAndProcessBlock() does not support building SegWit blocks, so don't activate in these tests.
// TODO: fix the code to support SegWit blocks.
gArgs.ForceSetArg("-segwitheight", "432");
// Need to recreate chainparams
SelectParams(CBaseChainParams::REGTEST);

@@ -1127,7 +1127,7 @@ PeerLogicValidation::PeerLogicValidation(CConnman* connmanIn, BanMan* banman, CS
// combine them in one function and schedule at the quicker (peer-eviction)
// timer.
static_assert(EXTRA_PEER_CHECK_INTERVAL < STALE_CHECK_INTERVAL, "peer eviction timer should be less than stale tip check timer");
scheduler.scheduleEvery([&] { this->CheckForStaleTipAndEvictPeers(consensusParams); }, std::chrono::seconds{EXTRA_PEER_CHECK_INTERVAL});
scheduler.scheduleEvery([this, consensusParams] { this->CheckForStaleTipAndEvictPeers(consensusParams); }, std::chrono::seconds{EXTRA_PEER_CHECK_INTERVAL});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be made = to just mirror what std::bind did. (this is a pointer, so it shouldn't matter whether it is copied or referenced, and for consensusParams we want it to be copied).

Suggested change
scheduler.scheduleEvery([this, consensusParams] { this->CheckForStaleTipAndEvictPeers(consensusParams); }, std::chrono::seconds{EXTRA_PEER_CHECK_INTERVAL});
scheduler.scheduleEvery([=] { this->CheckForStaleTipAndEvictPeers(consensusParams); }, std::chrono::seconds{EXTRA_PEER_CHECK_INTERVAL});

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The = is also what is recommended in the "Tutorial" (written by me, so count that against me):

// s->scheduleFromNow([=] { this->func(argument); }, std::chrono::milliseconds{3});

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh! I was about to apply this suggestion, but then realized that The implicit capture of *this when the capture default is = is deprecated. (since C++20) :-/ Indeed:

        auto m() {
            return [=](){
                std::cout << "lambda: this=" << this << std::endl;
            };
        }

Produces:

clang++10 -std=c++20 -Wdeprecated t.cc -o t
t.cc:96:49: warning: implicit capture of 'this' with a capture default of '=' is deprecated
      [-Wdeprecated-this-capture]
                std::cout << "lambda: this=" << this << std::endl;
                                                ^
t.cc:95:21: note: add an explicit capture of 'this' to capture '*this' by reference
            return [=](){
                    ^
                     , this

Better not add code that would have to be fixed later, so I would rather keep it as is.

What do you think?

PS this is always captured by reference and I think all of the following do the same wrt this: [=], [this], [&], std::bind

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case I'd prefer to use [=, this], but this might not compile either on our current C++ target? According to the page you linked:

    [=, this] {};   // until C++20: Error: this when = is the default
                    // since C++20: OK, same as [=]

So, given that we won't be switching to C++20 any time soon, I'd still prefer the [=]. Then it can be changed, if and when we switch to C++20?

Just a nit, though. ACK from me either way.

@practicalswift
Copy link
Contributor

Concept ACK -- premature optimisation is the root of many lifetime issues :)

Somewhat related: Investigate potential lifetime issues in cases where we are returning "const std::string&".

This is a great talk about recurring C++ bugs at Facebook: this segment on lifetime issues due to premature optimization might have some relevance here :)

@maflcko
Copy link
Member

maflcko commented Mar 18, 2020

@practicalswift I don't think it is helpful to call this premature optimization. This was not an optimization attempt. GetConsensus returns a const reference because this global is assumed to be initialized once and then never changed. It has been that way for years, and I presume well known by myself and all reviewers of the pull. See also bd00611. And logically, it wouldn't make sense for a running Bitcoin Core instance to change the consensus rules "on the fly" in a running process.

I think it was reasonable to keep it a reference and pass it along in the scheduler. And I assume the four people who reviewed my pull request agreed with me.

That some unit tests spin up a PeerLogicValidation and then change the consensus parameters while running is documented as a temporary hack in the test code. And the oversight was that not everyone had this temporary hack in mind while writing/reviewing the code.

@sipa
Copy link
Member

sipa commented Mar 18, 2020

ACK 7d8e1de

@practicalswift
Copy link
Contributor

ACK 7d8e1de

[…] premature optimisation is the root of many lifetime issues :)

I don't think it is helpful to call this premature optimization.

Point taken :)

@maflcko maflcko merged commit a421e0a into bitcoin:master Mar 18, 2020
fanquake added a commit that referenced this pull request Aug 26, 2020
fad84b7 test: Activate segwit in TestChain100Setup (MarcoFalke)
fa11ff2 test: Pass empty tx pool to block assembler (MarcoFalke)
fa96574 test: Move doxygen comment to header (MarcoFalke)

Pull request description:

  This fixes not only a TODO in the code, but also prevents a never ending source of uninitialized reads. E.g.

  * #18376
  * #19704 (comment)
  * ...

ACKs for top commit:
  jnewbery:
    utACK fad84b7

Tree-SHA512: 64cf16a59656d49e022b603f3b06441ceae35a33a4253b4382bc8a89a56e08ad5412c8fa734d0fc7b58586f40ea6d57b348a3b4838bc6890a41ae2ec3902e378
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Aug 26, 2020
fad84b7 test: Activate segwit in TestChain100Setup (MarcoFalke)
fa11ff2 test: Pass empty tx pool to block assembler (MarcoFalke)
fa96574 test: Move doxygen comment to header (MarcoFalke)

Pull request description:

  This fixes not only a TODO in the code, but also prevents a never ending source of uninitialized reads. E.g.

  * bitcoin#18376
  * bitcoin#19704 (comment)
  * ...

ACKs for top commit:
  jnewbery:
    utACK fad84b7

Tree-SHA512: 64cf16a59656d49e022b603f3b06441ceae35a33a4253b4382bc8a89a56e08ad5412c8fa734d0fc7b58586f40ea6d57b348a3b4838bc6890a41ae2ec3902e378
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sanitizer: heap-use-after-free in checkinputs_test
5 participants