Replace global g_cs_orphans lock with local #26295

ajtowns · 2022-10-11T09:09:32Z

Moves extra transactions to be under the m_msgproc_mutex lock rather than g_cs_orphans and refactors orphan handling so that the lock can be internal to the TxOrphange class.

Previously vExtraTxnForCompact and vExtraTxnForCompactIt were protected by g_cs_orphans; protect them by g_msgproc_mutex instead, as they are only used during message processing.

ajtowns · 2022-10-11T09:10:14Z

Resurrects #21527

src/net_processing.cpp

dergoegge · 2022-10-11T11:15:57Z

Concept ACK

-BEGIN VERIFY SCRIPT- sed -i -e 's/static RecursiveMutex/mutable Mutex/' src/txorphanage.h sed -i -e '/RecursiveMutex/d' src/txorphanage.cpp sed -i -e 's/g_cs_orphans/m_mutex/g' $(git grep -l g_cs_orphans src/) -END VERIFY SCRIPT-

DrahtBot · 2022-10-11T17:23:29Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	dergoegge, glozow
Concept ACK	hebasto, chinggg

Conflicts

Reviewers, this pull request conflicts with the following ones:

#26151 (refactor: Guard TxRequestTracker by its own lock instead of cs_main by dergoegge)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

hebasto · 2022-10-12T11:37:26Z

Concept ACK.

hebasto

89e2e0d

... protect them by g_msgproc_mutex instead, as they
are only used during message processing.

It does not look like a justification. An access to the extra txs ring buffer should be synced with its own mutex. It could be locked only within PeerManagerImpl::AddToCompactExtraTransactions and PeerManagerImpl::ProcessMessage functions. The fact that "they
are only used during message processing" will guarantee no performance loss.

6f8e442 "net_processing: Localise orphan_work_set handling to ProcessOrphanTx" looks behavior changing. Do we have tests which cover it?

Tbh, I'd be happy if 4 or 5 first commits could be split into a separated PR.

hebasto · 2022-10-12T12:58:38Z

src/net_processing.cpp

@@ -924,14 +930,14 @@ class PeerManagerImpl final : public PeerManager
    /** Storage for orphan information */
    TxOrphanage m_orphanage;

-    void AddToCompactExtraTransactions(const CTransactionRef& tx) EXCLUSIVE_LOCKS_REQUIRED(g_cs_orphans);
+    void AddToCompactExtraTransactions(const CTransactionRef& tx) EXCLUSIVE_LOCKS_REQUIRED(g_msgproc_mutex);


89e2e0d

Add AssertLockHeld(g_msgproc_mutex); into implementation code?

glozow

Big Concept ACK

ajtowns · 2022-10-17T05:19:38Z

It does not look like a justification. An access to the extra txs ring buffer should be synced with its own mutex

No, it shouldn't have its own mutex: it's always better to have fewer locks than more, as that uses fewer resources and means there are fewer things to keep in your head when following the logic. The only time when more locks are better is when that enables more parallelism, and improves performance -- that is, a justification is needed when introducing new locks.

hebasto · 2022-10-17T11:55:25Z

It does not look like a justification. An access to the extra txs ring buffer should be synced with its own mutex

No, it shouldn't have its own mutex... The only time when more locks are better is when that enables more parallelism, and improves performance -- that is, a justification is needed when introducing new locks.

This implies that in future design changes the vExtraTxnForCompact cannot be processed by any other thread except for the message processing one. Do we really want such a restriction for an ideal networking code?

it's always better to have fewer locks than more, as that uses fewer resources and means there are fewer things to keep in your head when following the logic.

I cannot agree with "it's always better".

Making it use "fewer resources" is a kind of optimization, and "premature optimization is the root of all evil”.

it's always better to have fewer locks than more, as ... there are fewer things to keep in your head when following the logic.

Perhaps, we are talking about personal biases, but, when following the logic, it's better to keep things structured, to have small classes which implement data structures with internal locking and thread-safe interfaces.

ajtowns · 2022-10-18T02:47:11Z

This implies that in future design changes the vExtraTxnForCompact cannot be processed by any other thread except for the message processing one.

What are you talking about? If we can put extra transactions under a different lock now, we certainly can in the future. This isn't an API that we have to support indefinitely.

dergoegge

Code review ACK 7082ce3

All comments are non-blocking (could also be done in a follow up)

dergoegge · 2022-10-18T16:03:25Z

src/txorphanage.h

     */
-    std::pair<CTransactionRef, NodeId> GetTx(const uint256& txid) const EXCLUSIVE_LOCKS_REQUIRED(g_cs_orphans);
+    CTransactionRef GetTxToReconsider(NodeId peer, NodeId& originator, bool& more) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);


Suggested change

CTransactionRef GetTxToReconsider(NodeId peer, NodeId& originator, bool& more) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);

std::optional<TxToReconsider> GetTxToReconsider(NodeId peer) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);

With TxToReconsider being:

struct TxToReconsider { const CTransactionRef tx; const NodeId originator; const bool more; };

That would seem a little cleaner to me because originator and more only have meaning if the returned transaction is not nullptr, so bundling the three would make sense imo.

Was going to comment the same thing, this seems like a better interface given that we only want originator/more if there is a tx to work on, and generally avoiding in-out params is cleaner. (non-blocking)

There was some other followup ideas in the previous PR, particularly #21527 (comment) -- the idea there is that if Alice sends orphan tx Y which has a missing parent X, instead of immediately processing Y when X arrives (which might have arrived from Bob), just put it in Alice's workset, and delay processing of the orphan until it's Alice's turn. Then you'd have peer == originator and not need it to be returned as a result. At least last time I looked at this (the 202104-whohandlesorphans branch), it ended up simplifying the function to just CTransactionRef GetTxToReconsider(NodeId peer) (using nullptr as the "there's nothing to consider" indicator).

Opened this as a draft at #26551

src/net_processing.cpp

dergoegge · 2022-10-18T16:13:55Z

src/txorphanage.cpp

-    AssertLockHeld(g_cs_orphans);
+    LOCK(m_mutex);
+
+    // Get this peer's work set, emplacing an empty set it didn't exist


Suggested change

// Get this peer's work set, emplacing an empty set it didn't exist

// Get this peer's work set, emplacing an empty set if it didn't exist

Comment updated in #26551

dergoegge · 2022-10-18T16:22:54Z

src/net_processing.cpp

 *                                  orphan will be reconsidered on each call of this function. This set
 *                                  may be added to if accepting an orphan causes its children to be
 *                                  reconsidered.
 */
-void PeerManagerImpl::ProcessOrphanTx(std::set<uint256>& orphan_work_set)
+void PeerManagerImpl::ProcessOrphanTx(Peer& peer)


After the work set is moved to the orphanage, ProcessOrphanTx really only needs the id of the peer not the entire Peer.

Maybe pass NodeId to ProcessOrphanTx instead of Peer in 9910ed7 and have it look up the needed Peer.

If you already have a Peer, I don't think there's really any reason not to pass in.

chinggg · 2022-10-30T13:01:33Z

Concept ACK

src/txorphanage.h

glozow

ACK 7082ce3 via code review and some basic testing. I think putting txorphanage in charge of handling peer work sets is the right direction.

glozow · 2022-11-17T15:43:38Z

src/net_processing.cpp

+     * Reconsider orphan transactions after a parent has been accepted to the mempool.
+     *
+     * @peer[in]  peer     The peer whose orphan transactions we will reconsider. Generally only one
+     *                     orphan will be reconsidered on each call of this function. This set


nit in 0027174:

It's a bit confusing what "This set" refers to.

Suggested change

* orphan will be reconsidered on each call of this function. This set

* orphan will be reconsidered on each call of this function. The peer's orphan work set

Comment updated in #26551

glozow · 2022-11-17T17:24:37Z

src/net_processing.cpp

@@ -2885,7 +2885,6 @@ bool PeerManagerImpl::ProcessOrphanTx(Peer& peer)
 {
    AssertLockHeld(g_msgproc_mutex);
    AssertLockHeld(cs_main);
-    AssertLockHeld(g_cs_orphans);


in 733d85f Note to self, my understanding here is:

The two threads that may access orphanage are (1) msg proc and (2) scheduler for calling m_orphanage.EraseForBlock() in BlockConnected().

Previously, we held g_cs_orphans through the entirety of ProcessOrphanTx, and now we are grabbing the lock for each call here (AddChildrenToWorkSet(*porphanTx, peer.m_id), EraseTx(orphanHash)).

And this is fine. The "problem" would be if we called BlockConnected and e.g. evicted an orphan that was just confirmed. It's fine since EraseTx() handles if a tx is already gone, and AddChildrenToworkSet() will wait on g_cs_orphans for EraseForBlock() to be done updating m_outpoint_to_orphan_it.

glozow · 2022-11-17T17:41:06Z

src/txorphanage.h

     */
-    std::pair<CTransactionRef, NodeId> GetTx(const uint256& txid) const EXCLUSIVE_LOCKS_REQUIRED(g_cs_orphans);
+    CTransactionRef GetTxToReconsider(NodeId peer, NodeId& originator, bool& more) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);


Was going to comment the same thing, this seems like a better interface given that we only want originator/more if there is a tx to work on, and generally avoiding in-out params is cleaner. (non-blocking)

glozow · 2022-11-17T17:54:12Z

src/net_processing.cpp


+    while (CTransactionRef porphanTx = m_orphanage.GetTxToReconsider(peer.m_id, from_peer, more)) {


Since you're touching this, I think it's quite strange that we do this in a while(!work_set.empty()) loop when we're trying to make sure that we only ever process 1 orphan at a time. Perhaps a remnant of pre-#15644 code? My suggestion is to take this out of the while loop and replace the breaks with returns... but non-blocking of course, can do in another PR.

There's three "loops" I guess:

ThreadMessageHandler repeatedly calls ProcessMessages calls ProcessMessage calls ProcessOrphanTx; if ProcessOrphanTx returns true, ProcessMessage exits early so the next run of the ThreadMessageHandler loop for this peer immediately goes back to ProcessOrphanTx

while (GetTxToReconsider) here loops until it actually does some work (ProcessTransaction returns VALID or something other than TX_MISSING_INPUTS, returning true if the while loop could have continued

GetTxToReconsider loops over the peer's workset, returning the first entry in the workset that still exists as an orphan

Removing the while (GetTxToReconsider) loop would mean that we'd immediately switch to the next peer anytime we came across an orphan that needed two parents, rather than just one, despite that not taking much time to determine (it's part of MemPoolAccept::PreChecks). I don't think that would be wrong, but the way it is seems better to me.

Removing the internal loop for GetTxToReconsider would mean you'd need to return a new result: "there's more work to do, but the first bit of work was a no-op", and you'd be re-querying m_peer_work_set.find(peer) repeatedly.

c58c249 net_processing: indicate more work to do when orphans are ready to reconsider (Anthony Towns) ecb0a3e net_processing: Don't process tx after processing orphans (Anthony Towns) c583775 net_processing: only process orphans before messages (Anthony Towns) be23046 txorphange: Drop redundant originator arg from GetTxToReconsider (Anthony Towns) a4fe099 txorphanage: index workset by originating peer (Anthony Towns) Pull request description: We currently process orphans by assigning them to the peer that provided a missing parent; instead assign them to the peer that provided the orphan in the first place. This prevents a peer from being able to marginally delay another peer's transactions and also simplifies the internal API slightly. Because we're now associating orphan processing with the peer that provided the orphan originally, we no longer process orphans immediately after receiving the parent, but defer until a future call to `ProcessMessage`. Based on #26295 ACKs for top commit: naumenkogs: utACK c58c249 glozow: ACK c58c249 mzumsande: Code Review ACK c58c249 Tree-SHA512: 3186c346f21e60440266a2a80a9d23d7b96071414e14b2b3bfe50457c04c18b1eab109c3d8c2a7726a6b10a2eda1f0512510a52c102da112820a26f5d96f12de

c58c249 net_processing: indicate more work to do when orphans are ready to reconsider (Anthony Towns) ecb0a3e net_processing: Don't process tx after processing orphans (Anthony Towns) c583775 net_processing: only process orphans before messages (Anthony Towns) be23046 txorphange: Drop redundant originator arg from GetTxToReconsider (Anthony Towns) a4fe099 txorphanage: index workset by originating peer (Anthony Towns) Pull request description: We currently process orphans by assigning them to the peer that provided a missing parent; instead assign them to the peer that provided the orphan in the first place. This prevents a peer from being able to marginally delay another peer's transactions and also simplifies the internal API slightly. Because we're now associating orphan processing with the peer that provided the orphan originally, we no longer process orphans immediately after receiving the parent, but defer until a future call to `ProcessMessage`. Based on bitcoin#26295 ACKs for top commit: naumenkogs: utACK c58c249 glozow: ACK c58c249 mzumsande: Code Review ACK c58c249 Tree-SHA512: 3186c346f21e60440266a2a80a9d23d7b96071414e14b2b3bfe50457c04c18b1eab109c3d8c2a7726a6b10a2eda1f0512510a52c102da112820a26f5d96f12de

ajtowns added 7 commits October 7, 2022 14:39

Remove unnecessary includes of txorphange.h

ff8d44d

net_processing: move extra transactions to msgproc mutex

89e2e0d

Previously vExtraTxnForCompact and vExtraTxnForCompactIt were protected by g_cs_orphans; protect them by g_msgproc_mutex instead, as they are only used during message processing.

net_processing: Pass a Peer& to ProcessOrphanTx

9910ed7

net_processing: move ProcessOrphanTx docs to declaration

0027174

net_processing: Localise orphan_work_set handling to ProcessOrphanTx

6f8e442

txorphange: move orphan workset to txorphanage

3614819

txorphanage: make m_peer_work_set private

a936f41

maflcko reviewed Oct 11, 2022

View reviewed changes

src/net_processing.cpp Outdated Show resolved Hide resolved

ajtowns added 2 commits October 11, 2022 23:35

Move all g_cs_orphans locking to txorphanage

733d85f

scripted-diff: rename and de-globalise g_cs_orphans

7082ce3

-BEGIN VERIFY SCRIPT- sed -i -e 's/static RecursiveMutex/mutable Mutex/' src/txorphanage.h sed -i -e '/RecursiveMutex/d' src/txorphanage.cpp sed -i -e 's/g_cs_orphans/m_mutex/g' $(git grep -l g_cs_orphans src/) -END VERIFY SCRIPT-

ajtowns force-pushed the 202210-cs_orphans branch from 5d59755 to 7082ce3 Compare October 11, 2022 13:45

This was referenced Oct 11, 2022

refactor: Make m_mempool optional in PeerManager #26247

Closed

refactor: Guard TxRequestTracker by its own lock instead of cs_main #26151

Closed

refactor: Split util/system into exception, shell, and fs-specific files #25152

Closed

hebasto reviewed Oct 12, 2022

View reviewed changes

glozow reviewed Oct 12, 2022

View reviewed changes

dergoegge reviewed Oct 18, 2022

View reviewed changes

chinggg reviewed Oct 30, 2022

View reviewed changes

src/txorphanage.h Show resolved Hide resolved

glozow reviewed Nov 17, 2022

View reviewed changes

ajtowns mentioned this pull request Nov 21, 2022

p2p: Track orphans by who provided them #26551

Merged

glozow merged commit a79b720 into bitcoin:master Nov 28, 2022

sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Dec 1, 2022

Merge bitcoin#26295: Replace global g_cs_orphans lock with local

395a6c7

bitcoin locked and limited conversation to collaborators Nov 29, 2023

	CTransactionRef GetTxToReconsider(NodeId peer, NodeId& originator, bool& more) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
	std::optional<TxToReconsider> GetTxToReconsider(NodeId peer) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);

	// Get this peer's work set, emplacing an empty set it didn't exist
	// Get this peer's work set, emplacing an empty set if it didn't exist

	* orphan will be reconsidered on each call of this function. This set
	* orphan will be reconsidered on each call of this function. The peer's orphan work set


		while (CTransactionRef porphanTx = m_orphanage.GetTxToReconsider(peer.m_id, from_peer, more)) {

Replace global g_cs_orphans lock with local #26295

Replace global g_cs_orphans lock with local #26295

Uh oh!

Conversation

ajtowns commented Oct 11, 2022

Uh oh!

ajtowns commented Oct 11, 2022

Uh oh!

Uh oh!

dergoegge commented Oct 11, 2022

Uh oh!

DrahtBot commented Oct 11, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews

Conflicts

Uh oh!

hebasto commented Oct 12, 2022

Uh oh!

hebasto left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

glozow left a comment

Choose a reason for hiding this comment

Uh oh!

ajtowns commented Oct 17, 2022

Uh oh!

hebasto commented Oct 17, 2022

Uh oh!

ajtowns commented Oct 18, 2022

Uh oh!

dergoegge left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ajtowns Nov 18, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

chinggg commented Oct 30, 2022

Uh oh!

Uh oh!

glozow left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

DrahtBot commented Oct 11, 2022 •

edited

Loading

hebasto left a comment •

edited

Loading

ajtowns Nov 18, 2022 •

edited

Loading