
Conversation

theStack
Contributor

This PR is related to #19303 and gets rid of the RecursiveMutex cs_vProcessMsg. Both critical sections only directly access the guarded elements, i.e. it is not possible that within one section another one is entered, so we can use a regular Mutex:

bitcoin/src/net.cpp

Lines 1597 to 1602 in e3ce019

{
    LOCK(pnode->cs_vProcessMsg);
    pnode->vProcessMsg.splice(pnode->vProcessMsg.end(), pnode->vRecvMsg, pnode->vRecvMsg.begin(), it);
    pnode->nProcessQueueSize += nSizeAdded;
    pnode->fPauseRecv = pnode->nProcessQueueSize > nReceiveFloodSize;
}

bitcoin/src/net_processing.cpp

Lines 4142 to 4150 in e3ce019

{
    LOCK(pfrom->cs_vProcessMsg);
    if (pfrom->vProcessMsg.empty()) return false;
    // Just take one message
    msgs.splice(msgs.begin(), pfrom->vProcessMsg, pfrom->vProcessMsg.begin());
    pfrom->nProcessQueueSize -= msgs.front().m_raw_message_size;
    pfrom->fPauseRecv = pfrom->nProcessQueueSize > m_connman.GetReceiveFloodSize();
    fMoreWork = !pfrom->vProcessMsg.empty();
}

Also, the mutex is renamed to follow the (unwritten) naming convention of using the _mutex suffix rather than the cs_ prefix.
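For orientation, the resulting declarations in CNode would look roughly like the following -- this is only a sketch of the renamed members, not the full diff:

// net.h (sketch): the guarded members after the rename; surrounding members omitted
Mutex m_process_msgs_mutex;
std::list<CNetMessage> m_process_msgs GUARDED_BY(m_process_msgs_mutex);
size_t nProcessQueueSize{0};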

-BEGIN VERIFY SCRIPT-
sed -i 's/cs_vProcessMsg/m_process_msgs_mutex/g' $(git grep -l cs_vProcessMsg)
sed -i 's/vProcessMsg/m_process_msgs/g' $(git grep -l vProcessMsg)
-END VERIFY SCRIPT-
In each of the critical sections, only the m_process_msgs member is
accessed, without any chance that within one section another one is
entered. Hence, we can use an ordinary Mutex instead of RecursiveMutex.
Contributor

@shaavan shaavan left a comment


Concept ACK

I was thinking, should we use AssertLockNotHeld(m_process_msgs_mutex) before locking the mutex, and add the thread safety annotation LOCKS_EXCLUDED(m_process_msgs_mutex) to the declarations of the functions where this mutex is used?

Doing so will prevent any risk of errors due to recursive locking of this mutex.
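Roughly, the pattern I have in mind looks like this (the function here is only a placeholder to illustrate it, not something from the PR):

// Placeholder function showing the combination of the compile-time
// annotation and the runtime assertion.
void Example(CNode& node) LOCKS_EXCLUDED(node.m_process_msgs_mutex);

void Example(CNode& node)
{
    AssertLockNotHeld(node.m_process_msgs_mutex); // catches recursive locking at runtime
    LOCK(node.m_process_msgs_mutex);              // callers are checked statically via LOCKS_EXCLUDED
    // ... access node.m_process_msgs here ...
}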

Contributor

@w0xlt w0xlt left a comment


crACK 9a355e1

I agree with @shaavan.

It might be nice to add AssertLockNotHeld(mutex) macros combined with LOCKS_EXCLUDED(mutex) thread safety annotations. This avoids any risk of recursive locking.

@theStack
Contributor Author

@shaavan @w0xlt: Thanks for your feedback, I agree that adding thread safety assertions and annotations is a good idea. Added a commit doing that -- note though that the LOCKS_EXCLUDED annotation is only possible for one method (PeerManagerImpl::ProcessMessages), as for the other (CConnman::SocketHandlerConnected) the elements iterated in the loop are contained in a list rather than passed as a direct parameter that can be annotated.
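To illustrate the difference (signatures abbreviated, so treat these as sketches rather than the exact declarations in the tree):

// Annotatable: the node is a direct parameter, so its mutex can be named
// in the declaration's thread safety annotation.
bool ProcessMessages(CNode* pfrom, std::atomic<bool>& interrupt)
    LOCKS_EXCLUDED(pfrom->m_process_msgs_mutex);

// Not annotatable the same way: the nodes are elements of a container, so
// there is no parameter whose mutex could be named in the annotation; only
// the runtime assertion inside the loop is possible.
void SocketHandlerConnected(const std::vector<CNode*>& nodes /* , ... */)
{
    for (CNode* pnode : nodes) {
        AssertLockNotHeld(pnode->m_process_msgs_mutex);
        LOCK(pnode->m_process_msgs_mutex);
        // ... move completed messages onto pnode->m_process_msgs ...
    }
}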

@DrahtBot
Contributor

DrahtBot commented Jan 23, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #25109 (Strengthen AssertLockNotHeld assertions by ajtowns)
  • #24931 (Strengthen thread safety assertions by ajtowns)
  • #24474 (Additional thread safety annotations for CNode/Peer by ajtowns)
  • #21527 (net_processing: lock clean up by ajtowns)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Contributor

@shaavan shaavan left a comment


as for the other (CConnman::SocketHandlerConnected) the elements iterated in the loop are contained in a list rather than passed as a direct parameter that can be annotated.

Sounds interesting. Let me take another look at this before I can be 100% sure. In the meantime, I had one observation that I would like to discuss.

@@ -22,8 +22,8 @@ void ConnmanTestMsg::NodeReceiveMsgBytes(CNode& node, Span<const uint8_t> msg_by
         nSizeAdded += it->m_raw_message_size;
     }
     {
-        LOCK(node.cs_vProcessMsg);
-        node.vProcessMsg.splice(node.vProcessMsg.end(), node.vRecvMsg, node.vRecvMsg.begin(), it);
+        LOCK(node.m_process_msgs_mutex);
Contributor


Can we add AssertLockNotHeld above this line?

Suggested change
-        LOCK(node.m_process_msgs_mutex);
+        AssertLockNotHeld(node.m_process_msgs_mutex);
+        LOCK(node.m_process_msgs_mutex);

Contributor Author


Good catch, done! Also added the thread safety annotation to the corresponding method ConnmanTestMsg::NodeReceiveMsgBytes.
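i.e. roughly the following (trailing parameters and qualifiers omitted, so this is only a sketch of the declaration):

// test helper: the node is a direct parameter, so the same annotation applies
void NodeReceiveMsgBytes(CNode& node, Span<const uint8_t> msg_bytes /* , ... */)
    LOCKS_EXCLUDED(node.m_process_msgs_mutex);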

@theStack theStack force-pushed the 202201-refactor_replace_RecursiveMutex_cs_vProcess branch from 9add3f9 to b819ae4 on January 24, 2022 13:44
Co-authored-by: Shashwat <svangani239@gmail.com>
@theStack theStack force-pushed the 202201-refactor_replace_RecursiveMutex_cs_vProcess branch from b819ae4 to 9c0b268 on January 24, 2022 13:45
Member

@hebasto hebasto left a comment


Concept ACK.

Following best practices, mutexes should synchronize access to internal class data to prevent violation of class invariants.

The current implementation has a significant design drawback -- external synchronization, i.e., acquiring CNode::m_process_msgs_mutex from outside of the CNode instance.

Therefore, I'm suggesting to consider encapsulating access to the private members m_process_msgs and nProcessQueueSize with internal synchronization using m_process_msgs_mutex.

One new CNode member function could implement the following:

bitcoin/src/net.cpp

Lines 1592 to 1604 in e3de7cb

size_t nSizeAdded = 0;
auto it(pnode->vRecvMsg.begin());
for (; it != pnode->vRecvMsg.end(); ++it) {
    // vRecvMsg contains only completed CNetMessage
    // the single possible partially deserialized message are held by TransportDeserializer
    nSizeAdded += it->m_raw_message_size;
}
{
    LOCK(pnode->cs_vProcessMsg);
    pnode->vProcessMsg.splice(pnode->vProcessMsg.end(), pnode->vRecvMsg, pnode->vRecvMsg.begin(), it);
    pnode->nProcessQueueSize += nSizeAdded;
    pnode->fPauseRecv = pnode->nProcessQueueSize > nReceiveFloodSize;
}
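For example, something along these lines (name and signature are purely illustrative):

// Hypothetical CNode member: enqueues already-completed messages and updates
// the bookkeeping under the internal mutex, so callers never touch
// m_process_msgs or nProcessQueueSize directly.
void CNode::PushProcessMsgs(std::list<CNetMessage>&& msgs, size_t size_added, size_t recv_flood_size)
{
    AssertLockNotHeld(m_process_msgs_mutex);
    LOCK(m_process_msgs_mutex);
    m_process_msgs.splice(m_process_msgs.end(), msgs);
    nProcessQueueSize += size_added;
    fPauseRecv = nProcessQueueSize > recv_flood_size;
}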

Another one could implement the following:

bitcoin/src/net_processing.cpp

Lines 4165 to 4173 in e3de7cb

{
    LOCK(pfrom->cs_vProcessMsg);
    if (pfrom->vProcessMsg.empty()) return false;
    // Just take one message
    msgs.splice(msgs.begin(), pfrom->vProcessMsg, pfrom->vProcessMsg.begin());
    pfrom->nProcessQueueSize -= msgs.front().m_raw_message_size;
    pfrom->fPauseRecv = pfrom->nProcessQueueSize > m_connman.GetReceiveFloodSize();
    fMoreWork = !pfrom->vProcessMsg.empty();
}
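And for the second call site, something like the following (again only a sketch, name hypothetical):

// Hypothetical counterpart: pops one message under the internal mutex and
// reports whether more messages remain queued for processing.
std::optional<std::pair<CNetMessage, bool>> CNode::PollMessage(size_t recv_flood_size)
{
    AssertLockNotHeld(m_process_msgs_mutex);
    LOCK(m_process_msgs_mutex);
    if (m_process_msgs.empty()) return std::nullopt;

    CNetMessage msg{std::move(m_process_msgs.front())};
    m_process_msgs.pop_front();
    nProcessQueueSize -= msg.m_raw_message_size;
    fPauseRecv = nProcessQueueSize > recv_flood_size;
    return std::make_pair(std::move(msg), !m_process_msgs.empty());
}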

-    RecursiveMutex cs_vProcessMsg;
-    std::list<CNetMessage> vProcessMsg GUARDED_BY(cs_vProcessMsg);
+    Mutex m_process_msgs_mutex;
+    std::list<CNetMessage> m_process_msgs GUARDED_BY(m_process_msgs_mutex);
     size_t nProcessQueueSize{0};
Member


Reading the code makes me strongly believe that the value of nProcessQueueSize must correspond to the m_process_msgs content. To preserve this invariant of the CNode class I'm suggesting:

Suggested change
-    size_t nProcessQueueSize{0};
+    size_t nProcessQueueSize GUARDED_BY(m_process_msgs_mutex){0};

@DrahtBot
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".

@DrahtBot
Contributor

DrahtBot commented Aug 1, 2022

There hasn't been much activity lately and the patch still needs rebase. What is the status here?
  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

@achow101
Member

Are you still working on this?

@achow101
Member

Closing this as it has not had any activity in a while. If you are interested in continuing work on this, please leave a comment so that it can be reopened.

@achow101 achow101 closed this Nov 10, 2022
@bitcoin bitcoin locked and limited conversation to collaborators Nov 10, 2023