-
Notifications
You must be signed in to change notification settings - Fork 37.7k
WIP: switch to libevent for node socket handling #11227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bc185e7
to
62d4d49
Compare
I believe that the proxy_test failure is a false-positive. Looks like the issue is that disconnection happens so quickly now after an error that the testnode fails. |
62d4d49
to
7274a18
Compare
src/net.cpp
Outdated
@@ -454,7 +454,7 @@ void CConnman::DumpBanlist() | |||
|
|||
void CNode::CloseSocketDisconnect() | |||
{ | |||
fDisconnect = true; | |||
Disconnect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this could be a scripted diff, s/fDisconnect = true/Disconnect()/g
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably so, will give it a shot.
src/net.cpp
Outdated
@@ -914,12 +914,7 @@ size_t CConnman::SocketSendData(CNode *pnode) const | |||
while (it != pnode->vSendMsg.end()) { | |||
const auto &data = *it; | |||
assert(data.size() > pnode->nSendOffset); | |||
int nBytes = 0; | |||
{ | |||
if (pnode->hSocket == INVALID_SOCKET) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is no longer needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe all of these are gone by the end of the change set, the removal may make more sense in an earlier commit, though.
CloseSocket(hSocket); | ||
return; | ||
const char* method = event_base_get_method(m_event_base); | ||
if (!method || (strncmp(method, "select", 6) == 0)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, I'd have hoped that this restriction wouldn't be needed anymore with libevent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's only needed if select ends up being used, which should be very unlikely on most platforms. I'll add a comment here, and print something to the log so that it's clear which back-end is being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could just bail out if select is being used. But not sure what platforms would be affected.
src/net.cpp
Outdated
@@ -2397,6 +2370,9 @@ bool CConnman::Start(CScheduler& scheduler, const Options& connOptions) | |||
LOCK(m_cs_interrupt_event); | |||
m_interrupt_event = event_new(m_event_base, -1, 0, [](evutil_socket_t, short, void* data) {static_cast<CConnman*>(data)->InterruptEvents();}, (this)); | |||
} | |||
m_event_timeout = timeval{60, 0}; | |||
const timeval* common_timeout = event_base_init_common_timeout(m_event_base, &m_event_timeout); | |||
memcpy(&m_event_timeout, common_timeout, sizeof(m_event_timeout)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation on event_base_init_common_timeout
says "To do this, call this function with the common duration. It will return a pointer to a different, opaque timeout value. (Don't depend on its actual contents!)".
I'm not sure whether that means you need to use the exact pointer returned, or just a copy of its value (as this code does).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good eye! I wasn't sure how to store this properly either. This usage is taken from the official documentation. I'll add a comment about its sanity and a link to the docs.
src/net.cpp
Outdated
@@ -456,7 +471,14 @@ void CConnman::DumpBanlist() | |||
|
|||
void CNode::CloseSocketDisconnect() | |||
{ | |||
Disconnect(); | |||
if (m_write_event) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer it if you could have some RAII wrappers around the libevent structures before adding all this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, see PR description :)
Concept ACK, woohoo! |
src/net.cpp
Outdated
|
||
} | ||
WakeMessageHandler(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be moved into the above if
block? (It used to be inside the if (notify)
statement)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thank you.
src/net.cpp
Outdated
@@ -1150,7 +1150,7 @@ void CConnman::AcceptConnection(const ListenSocket& hListenSocket) { | |||
} | |||
} | |||
|
|||
bool whitelisted = hListenSocket.whitelisted || IsWhitelistedRange(addr); | |||
whitelisted = whitelisted || IsWhitelistedRange(addr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
||=
I don't think such an operator exists in C++.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, you're right. Should have checked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can say for certain that the operator doesn't exist because I tried it first :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hehe I also remember it doesn't exist - I looked it up a long time ago and IIRC the rationale is that's because |=
is the same when using only booleans. In any case this is ok as it is.
7274a18
to
8df8587
Compare
8df8587
to
da4312d
Compare
… closing the socket immediately Messing with the socket from a non-network thread adds unnecessary complexity.
For now, this just sets fDisconnect like before. But a function will be necessary later.
-BEGIN VERIFY SCRIPT- sed -i 's/fDisconnect = true/Disconnect()/' src/net.cpp src/net_processing.cpp -END VERIFY SCRIPT-
…igger it for now this just disables fPauseRecv, but it will need to be a function later.
The socket lock only existed because it was possible to trigger an immediate disconnect via threads other than the socket handler. The optimistic send was the last offender.
Subsequent changes will turn AcceptConnection into a callback where the socket is one of the supplied params.
For now this just handles read/write/error/timeout conditions for each node serially, but OnEvents will soon be invoked as a callback when socket status changes.
By only allowing the socket to be closed and the handle to be invalidated in one place, it becomes much easier to reason about the status at any given time.
This ensures that outgoing connections don't race with event teardown at shutdown.
da4312d
to
77cee16
Compare
Needs rebase. |
3830b6e net: use CreateSocket for binds (Cory Fields) df3bcf8 net: pass socket closing responsibility up to caller for outgoing connections (Cory Fields) 9e3b2f5 net: Move IsSelectableSocket check into socket creation (Cory Fields) 1729c29 net: split socket creation out of connection (Cory Fields) Pull request description: Requirement for #11227. We'll need to create sockets and perform the actual connect in separate steps, so break them up. #11227 adds an RAII wrapper around connection attempts, as a belt-and-suspenders in case a CloseSocket is missed. Tree-SHA512: de675bb718cc56d68893c303b8057ca062c7431eaa17ae7c4829caed119fa3f15b404d8f52aca22a6bca6e73a26fb79e898b335d090ab015bf6456cf417fc694
There hasn't been much activity lately and the patch still needs rebase, so I am closing this for now. Please let me know when you want to continue working on this, so the pull request can be re-opened. |
3830b6e net: use CreateSocket for binds (Cory Fields) df3bcf8 net: pass socket closing responsibility up to caller for outgoing connections (Cory Fields) 9e3b2f5 net: Move IsSelectableSocket check into socket creation (Cory Fields) 1729c29 net: split socket creation out of connection (Cory Fields) Pull request description: Requirement for bitcoin#11227. We'll need to create sockets and perform the actual connect in separate steps, so break them up. bitcoin#11227 adds an RAII wrapper around connection attempts, as a belt-and-suspenders in case a CloseSocket is missed. Tree-SHA512: de675bb718cc56d68893c303b8057ca062c7431eaa17ae7c4829caed119fa3f15b404d8f52aca22a6bca6e73a26fb79e898b335d090ab015bf6456cf417fc694
3830b6e net: use CreateSocket for binds (Cory Fields) df3bcf8 net: pass socket closing responsibility up to caller for outgoing connections (Cory Fields) 9e3b2f5 net: Move IsSelectableSocket check into socket creation (Cory Fields) 1729c29 net: split socket creation out of connection (Cory Fields) Pull request description: Requirement for bitcoin#11227. We'll need to create sockets and perform the actual connect in separate steps, so break them up. bitcoin#11227 adds an RAII wrapper around connection attempts, as a belt-and-suspenders in case a CloseSocket is missed. Tree-SHA512: de675bb718cc56d68893c303b8057ca062c7431eaa17ae7c4829caed119fa3f15b404d8f52aca22a6bca6e73a26fb79e898b335d090ab015bf6456cf417fc694
Removing "Up for grabs", given we are now in the process of trying to remove libevent. |
Not yet ready for review. This can be considered a staging area. Chunks of this will be PR'd until only the actual libevent switch-over is ready, at which point this PR should be ready for review.
These changes remove our old select() loop for socket handling in favor of libevent, which uses epoll/kqueue/etc. as a back-end. In addition to being faster and more efficient, this allows us to drop some annoying restrictions, namely that select can only handle 1024 sockets in practice on most systems.
Note that this does not yet make the switch to libevent for outgoing connections, that work is happening in parallel, and will be easier to merge after this.
Also, for any reviewers, several of these commits would individually introduce some regressions or slow-downs, but they've been split up in order to clarify why some of the changes are being made.
Depends on:
Todo: