Skip to content

Conversation

theuni
Copy link
Member

@theuni theuni commented Sep 3, 2017

Not yet ready for review. This can be considered a staging area. Chunks of this will be PR'd until only the actual libevent switch-over is ready, at which point this PR should be ready for review.

These changes remove our old select() loop for socket handling in favor of libevent, which uses epoll/kqueue/etc. as a back-end. In addition to being faster and more efficient, this allows us to drop some annoying restrictions, namely that select can only handle 1024 sockets in practice on most systems.

Note that this does not yet make the switch to libevent for outgoing connections, that work is happening in parallel, and will be easier to merge after this.

Also, for any reviewers, several of these commits would individually introduce some regressions or slow-downs, but they've been split up in order to clarify why some of the changes are being made.

Depends on:

Todo:

  • Add a ton of documentation
  • RAII the libevent structures
  • Add some tests where possible

@theuni
Copy link
Member Author

theuni commented Sep 4, 2017

I believe that the proxy_test failure is a false-positive.

Looks like the issue is that disconnection happens so quickly now after an error that the testnode fails.

src/net.cpp Outdated
@@ -454,7 +454,7 @@ void CConnman::DumpBanlist()

void CNode::CloseSocketDisconnect()
{
fDisconnect = true;
Disconnect();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this could be a scripted diff, s/fDisconnect = true/Disconnect()/g.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably so, will give it a shot.

src/net.cpp Outdated
@@ -914,12 +914,7 @@ size_t CConnman::SocketSendData(CNode *pnode) const
while (it != pnode->vSendMsg.end()) {
const auto &data = *it;
assert(data.size() > pnode->nSendOffset);
int nBytes = 0;
{
if (pnode->hSocket == INVALID_SOCKET)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is no longer needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe all of these are gone by the end of the change set, the removal may make more sense in an earlier commit, though.

CloseSocket(hSocket);
return;
const char* method = event_base_get_method(m_event_base);
if (!method || (strncmp(method, "select", 6) == 0)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, I'd have hoped that this restriction wouldn't be needed anymore with libevent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only needed if select ends up being used, which should be very unlikely on most platforms. I'll add a comment here, and print something to the log so that it's clear which back-end is being used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could just bail out if select is being used. But not sure what platforms would be affected.

src/net.cpp Outdated
@@ -2397,6 +2370,9 @@ bool CConnman::Start(CScheduler& scheduler, const Options& connOptions)
LOCK(m_cs_interrupt_event);
m_interrupt_event = event_new(m_event_base, -1, 0, [](evutil_socket_t, short, void* data) {static_cast<CConnman*>(data)->InterruptEvents();}, (this));
}
m_event_timeout = timeval{60, 0};
const timeval* common_timeout = event_base_init_common_timeout(m_event_base, &m_event_timeout);
memcpy(&m_event_timeout, common_timeout, sizeof(m_event_timeout));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation on event_base_init_common_timeout says "To do this, call this function with the common duration. It will return a pointer to a different, opaque timeout value. (Don't depend on its actual contents!)".

I'm not sure whether that means you need to use the exact pointer returned, or just a copy of its value (as this code does).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good eye! I wasn't sure how to store this properly either. This usage is taken from the official documentation. I'll add a comment about its sanity and a link to the docs.

src/net.cpp Outdated
@@ -456,7 +471,14 @@ void CConnman::DumpBanlist()

void CNode::CloseSocketDisconnect()
{
Disconnect();
if (m_write_event) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer it if you could have some RAII wrappers around the libevent structures before adding all this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, see PR description :)

@laanwj
Copy link
Member

laanwj commented Sep 5, 2017

Concept ACK, woohoo!

src/net.cpp Outdated

}
WakeMessageHandler();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be moved into the above if block? (It used to be inside the if (notify) statement)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you.

src/net.cpp Outdated
@@ -1150,7 +1150,7 @@ void CConnman::AcceptConnection(const ListenSocket& hListenSocket) {
}
}

bool whitelisted = hListenSocket.whitelisted || IsWhitelistedRange(addr);
whitelisted = whitelisted || IsWhitelistedRange(addr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

||=

I don't think such an operator exists in C++.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, you're right. Should have checked.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can say for certain that the operator doesn't exist because I tried it first :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hehe I also remember it doesn't exist - I looked it up a long time ago and IIRC the rationale is that's because |= is the same when using only booleans. In any case this is ok as it is.

… closing the socket immediately

Messing with the socket from a non-network thread adds unnecessary complexity.
For now, this just sets fDisconnect like before. But a function will be
necessary later.
-BEGIN VERIFY SCRIPT-
sed -i 's/fDisconnect = true/Disconnect()/' src/net.cpp src/net_processing.cpp
-END VERIFY SCRIPT-
…igger it

for now this just disables fPauseRecv, but it will need to be a function later.
The socket lock only existed because it was possible to trigger an immediate
disconnect via threads other than the socket handler.

The optimistic send was the last offender.
Subsequent changes will turn AcceptConnection into a callback where the socket
is one of the supplied params.
For now this just handles read/write/error/timeout conditions for each node
serially, but OnEvents will soon be invoked as a callback when socket status
changes.
By only allowing the socket to be closed and the handle to be invalidated in
one place, it becomes much easier to reason about the status at any given time.
This ensures that outgoing connections don't race with event teardown at
shutdown.
@theuni theuni changed the title RFC: switch to libevent for node socket handling WIP: switch to libevent for node socket handling Sep 20, 2017
@promag
Copy link
Contributor

promag commented Oct 18, 2017

Needs rebase.

laanwj added a commit that referenced this pull request Dec 13, 2017
3830b6e net: use CreateSocket for binds (Cory Fields)
df3bcf8 net: pass socket closing responsibility up to caller for outgoing connections (Cory Fields)
9e3b2f5 net: Move IsSelectableSocket check into socket creation (Cory Fields)
1729c29 net: split socket creation out of connection (Cory Fields)

Pull request description:

  Requirement for #11227.

  We'll need to create sockets and perform the actual connect in separate steps, so break them up.

  #11227 adds an RAII wrapper around connection attempts, as a belt-and-suspenders in case a CloseSocket is missed.

Tree-SHA512: de675bb718cc56d68893c303b8057ca062c7431eaa17ae7c4829caed119fa3f15b404d8f52aca22a6bca6e73a26fb79e898b335d090ab015bf6456cf417fc694
@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 8, 2018

There hasn't been much activity lately and the patch still needs rebase, so I am closing this for now. Please let me know when you want to continue working on this, so the pull request can be re-opened.

@DrahtBot DrahtBot closed this Nov 8, 2018
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jan 25, 2020
3830b6e net: use CreateSocket for binds (Cory Fields)
df3bcf8 net: pass socket closing responsibility up to caller for outgoing connections (Cory Fields)
9e3b2f5 net: Move IsSelectableSocket check into socket creation (Cory Fields)
1729c29 net: split socket creation out of connection (Cory Fields)

Pull request description:

  Requirement for bitcoin#11227.

  We'll need to create sockets and perform the actual connect in separate steps, so break them up.

  bitcoin#11227 adds an RAII wrapper around connection attempts, as a belt-and-suspenders in case a CloseSocket is missed.

Tree-SHA512: de675bb718cc56d68893c303b8057ca062c7431eaa17ae7c4829caed119fa3f15b404d8f52aca22a6bca6e73a26fb79e898b335d090ab015bf6456cf417fc694
gades pushed a commit to cosanta/cosanta-core that referenced this pull request Jun 25, 2021
3830b6e net: use CreateSocket for binds (Cory Fields)
df3bcf8 net: pass socket closing responsibility up to caller for outgoing connections (Cory Fields)
9e3b2f5 net: Move IsSelectableSocket check into socket creation (Cory Fields)
1729c29 net: split socket creation out of connection (Cory Fields)

Pull request description:

  Requirement for bitcoin#11227.

  We'll need to create sockets and perform the actual connect in separate steps, so break them up.

  bitcoin#11227 adds an RAII wrapper around connection attempts, as a belt-and-suspenders in case a CloseSocket is missed.

Tree-SHA512: de675bb718cc56d68893c303b8057ca062c7431eaa17ae7c4829caed119fa3f15b404d8f52aca22a6bca6e73a26fb79e898b335d090ab015bf6456cf417fc694
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Dec 16, 2021
@fanquake
Copy link
Member

Removing "Up for grabs", given we are now in the process of trying to remove libevent.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants