kwvg
Collaborator

@kwvg kwvg commented May 4, 2024

Motivation

CConnman contains a lot of platform-specific implementation logic, both inherited from upstream and added by Dash (support for edge-triggered socket events modes such as epoll on Linux and kqueue on FreeBSD/Darwin).

Bitcoin has since moved to strip down CConnman by moving peer-related logic to the Peer struct in net_processing (portions of which are backported in #5982 and friends, tracking efforts from bitcoin#19398) and moving socket-related logic to Sock (portions of which are aimed to be backported in #6004, tracking efforts from bitcoin#21878).

Given this direction and the difference in how edge-triggered events modes operate (using interest lists and events instead of iterating over each socket) compared to level-triggered modes (which are inherited from upstream), it is reasonable to isolate Dash-specific code into its own entities and minimize what CConnman knows about their internal workings.

One visible benefit of this approach shows up when comparing develop (as of this writing, d44b0d5) with this pull request in how wakeup pipes logic interacts with {epoll, kqueue} logic.

This is what construction looks like:

dash/src/net.cpp

Lines 3358 to 3397 in d44b0d5

#ifdef USE_WAKEUP_PIPE
    if (pipe(wakeupPipe) != 0) {
        wakeupPipe[0] = wakeupPipe[1] = -1;
        LogPrint(BCLog::NET, "pipe() for wakeupPipe failed\n");
    } else {
        int fFlags = fcntl(wakeupPipe[0], F_GETFL, 0);
        if (fcntl(wakeupPipe[0], F_SETFL, fFlags | O_NONBLOCK) == -1) {
            LogPrint(BCLog::NET, "fcntl for O_NONBLOCK on wakeupPipe failed\n");
        }
        fFlags = fcntl(wakeupPipe[1], F_GETFL, 0);
        if (fcntl(wakeupPipe[1], F_SETFL, fFlags | O_NONBLOCK) == -1) {
            LogPrint(BCLog::NET, "fcntl for O_NONBLOCK on wakeupPipe failed\n");
        }
#ifdef USE_KQUEUE
        if (socketEventsMode == SOCKETEVENTS_KQUEUE) {
            struct kevent event;
            EV_SET(&event, wakeupPipe[0], EVFILT_READ, EV_ADD, 0, 0, nullptr);
            int r = kevent(kqueuefd, &event, 1, nullptr, 0, nullptr);
            if (r != 0) {
                LogPrint(BCLog::NET, "%s -- kevent(%d, %d, %d, ...) failed. error: %s\n", __func__,
                         kqueuefd, EV_ADD, wakeupPipe[0], NetworkErrorString(WSAGetLastError()));
                return false;
            }
        }
#endif
#ifdef USE_EPOLL
        if (socketEventsMode == SOCKETEVENTS_EPOLL) {
            epoll_event event;
            event.events = EPOLLIN;
            event.data.fd = wakeupPipe[0];
            int r = epoll_ctl(epollfd, EPOLL_CTL_ADD, wakeupPipe[0], &event);
            if (r != 0) {
                LogPrint(BCLog::NET, "%s -- epoll_ctl(%d, %d, %d, ...) failed. error: %s\n", __func__,
                         epollfd, EPOLL_CTL_ADD, wakeupPipe[0], NetworkErrorString(WSAGetLastError()));
                return false;
            }
        }
#endif
    }
#endif

But if we separate wakeup pipes logic (which works on any platform with POSIX APIs, and therefore excludes Windows) from {epoll, kqueue} logic (calling the latter EdgeTriggeredEvents), construction looks different:

dash/src/util/wpipe.cpp

Lines 12 to 38 in 907a351

WakeupPipe::WakeupPipe(EdgeTriggeredEvents* edge_trig_events)
    : m_edge_trig_events{edge_trig_events}
{
#ifdef USE_WAKEUP_PIPE
    if (pipe(m_pipe.data()) != 0) {
        LogPrintf("Unable to initialize WakeupPipe, pipe() for m_pipe failed with error %s\n",
                  NetworkErrorString(WSAGetLastError()));
        return;
    }
    for (size_t idx = 0; idx < m_pipe.size(); idx++) {
        int flags = fcntl(m_pipe[idx], F_GETFL, 0);
        if (fcntl(m_pipe[idx], F_SETFL, flags | O_NONBLOCK) == -1) {
            LogPrintf("Unable to initialize WakeupPipe, fcntl for O_NONBLOCK on m_pipe[%d] failed with error %s\n", idx,
                      NetworkErrorString(WSAGetLastError()));
            return;
        }
    }
    if (edge_trig_events && !edge_trig_events->RegisterPipe(m_pipe[0])) {
        LogPrintf("Unable to initialize WakeupPipe, EdgeTriggeredEvents::RegisterPipe() failed for m_pipe[0] = %d\n",
                  m_pipe[0]);
        return;
    }
    m_valid = true;
#else
    LogPrintf("Attempting to initialize WakeupPipe without support compiled in!\n");
#endif /* USE_WAKEUP_PIPE */
}

Now wakeup pipes logic doesn't need to know which socket events mode is being used, nor is the implementation of (de)registering the pipe its concern; that is now EdgeTriggeredEvents' problem.
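
To make the separation concrete, here is a minimal, self-contained sketch of the same shape. It is not the code in this pull request: FakeEdgeTriggeredEvents, SketchWakeupPipe, UnregisterPipe() and ReadFd() are hypothetical names used for illustration, and the fake registrar only records descriptors instead of calling epoll_ctl() or kevent(). The sketch also closes the pipe when construction fails, which is a choice made here rather than something taken from the snippet above.

```cpp
#include <array>
#include <cassert>
#include <set>

#include <fcntl.h>
#include <unistd.h>

// Hypothetical stand-in for EdgeTriggeredEvents. It only remembers which
// descriptors were registered; a real implementation would talk to the kernel.
class FakeEdgeTriggeredEvents
{
public:
    bool RegisterPipe(int fd) { return fd >= 0 && m_registered.insert(fd).second; }
    bool UnregisterPipe(int fd) { return m_registered.erase(fd) == 1; }
    bool IsRegistered(int fd) const { return m_registered.count(fd) == 1; }

private:
    std::set<int> m_registered;
};

// Hypothetical wakeup pipe. It never inspects the socket events mode; it only
// asks the registrar (if one was supplied) to watch its read end.
class SketchWakeupPipe
{
public:
    explicit SketchWakeupPipe(FakeEdgeTriggeredEvents* events) : m_events{events}
    {
        if (pipe(m_pipe.data()) != 0) {
            m_pipe.fill(-1);
            return;
        }
        for (int fd : m_pipe) {
            const int flags = fcntl(fd, F_GETFL, 0);
            if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
                Close();
                return;
            }
        }
        if (m_events && !m_events->RegisterPipe(m_pipe[0])) {
            Close();
            return;
        }
        m_valid = true;
    }

    ~SketchWakeupPipe()
    {
        // Deregister before closing so the registrar never holds a dead descriptor.
        if (m_valid && m_events) m_events->UnregisterPipe(m_pipe[0]);
        Close();
    }

    SketchWakeupPipe(const SketchWakeupPipe&) = delete;
    SketchWakeupPipe& operator=(const SketchWakeupPipe&) = delete;

    bool IsValid() const { return m_valid; }
    int ReadFd() const { return m_pipe[0]; }

private:
    void Close()
    {
        for (int& fd : m_pipe) {
            if (fd != -1) {
                close(fd);
                fd = -1;
            }
        }
    }

    FakeEdgeTriggeredEvents* m_events;
    std::array<int, 2> m_pipe{{-1, -1}};
    bool m_valid{false};
};
```

Passing nullptr for the registrar models a level-triggered mode, where the read end is instead added to the set of descriptors polled on each iteration.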

Additional Information

  • This pull request will need testing on macOS (FreeBSD isn't a tier-one target) to ensure there is no breakage in kqueue-specific logic.

Breaking Changes

  • Dependency for backport: merge bitcoin#24356 (replace CConnman::SocketEvents() with mockable Sock::WaitMany()), implement Sock::WaitMany{Epoll,KQueue}() #6018

  • More logging has been introduced and existing log messages have been made more exhaustive. Any log parsing that relies on a particular message template will have to be updated.

  • If EdgeTriggeredEvents or WakeupPipe fails to initialize, or is incorrectly initialized and not destroyed immediately, any further attempt to call any of its functions will result in an assert-induced crash. Earlier behavior may have allowed for silent failure, but separating this logic from CConnman means the newly created instances must only exist if the conditions needed for them to initialize correctly are present.

    This is to ensure that CConnman doesn't have to concern itself with the internal workings of either entity.
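
The contract described in the last bullet can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: SketchEvents, MakeEvents() and the platform_supported flag are hypothetical, and the sketch assumes assertions stay enabled in the build. The owner keeps an instance only if it reports itself valid, and every member function asserts validity instead of failing silently.

```cpp
#include <cassert>
#include <memory>
#include <set>

// Hypothetical entity that may fail to initialize (for example, because the
// platform lacks the required API).
class SketchEvents
{
public:
    explicit SketchEvents(bool platform_supported) : m_valid{platform_supported} {}

    bool IsValid() const { return m_valid; }

    bool Register(int fd)
    {
        // Calling into a broken instance is treated as a programming error.
        assert(m_valid);
        return m_fds.insert(fd).second;
    }

private:
    bool m_valid;
    std::set<int> m_fds;
};

// The owner destroys an invalid instance immediately, so later code only has
// to check for presence and never for internal state.
std::unique_ptr<SketchEvents> MakeEvents(bool platform_supported)
{
    auto events = std::make_unique<SketchEvents>(platform_supported);
    if (!events->IsValid()) return nullptr;
    return events;
}
```

With this shape, the caller's only decision is whether the pointer is set; it never needs to know why initialization failed.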

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests (note: N/A)
  • I have made corresponding changes to the documentation (note: N/A)
  • I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

@MrDefacto

I can do FreeBSD testing. I know that there are a couple hundred masternodes running on FreeBSD. I don't think it's unimportant.

@kwvg kwvg force-pushed the refac_net branch 2 times, most recently from a30ab1a to 907a351 Compare May 5, 2024 08:04
@UdjinM6

UdjinM6 commented May 6, 2024

works with no issues on mac (kqueue mode) it seems, a couple of suggestions: e0deb9e and maybe 4e7f80e

@kwvg
Collaborator Author

kwvg commented May 7, 2024

@MrDefacto Messaged you on Keybase!

@kwvg
Collaborator Author

kwvg commented May 9, 2024

a couple of suggestions: e0deb9e and maybe 4e7f80e

With regard to e0deb9e, I was originally planning to move SocketEventsMode to util/sock.h in net_processing_6 (see kwvg@ed1409d), but getting rid of EdgeEventsMode and the need to sync enums was a good enough reason to cherry-pick it. Thanks for the suggestion!

As for 4e7f80e, I'm leaning in the opposite direction and attempting to limit CConnman's knowledge of the macro (this PR still has a few uses of USE_{EPOLL, KQUEUE, POLL} that are done away with in the successor PR). The latest push (d5f0838) removes a few more uses of USE_WAKEUP_PIPE, and I'm hoping to eventually contain it to util/wpipe.cpp entirely.


Also, a condition inversion in WakeupPipe::Drain() has been resolved in the latest push (for some reason the infinite loop it introduced didn't manifest in this PR but did in net_processing_6).
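
For context on why an inverted condition here can spin forever, below is a minimal sketch of draining a non-blocking pipe. It is not the body of WakeupPipe::Drain(); DrainPipe() is a hypothetical helper. The loop has to stop when read() reports that nothing is left (EAGAIN/EWOULDBLOCK, or 0 when the write end is closed), so flipping that check keeps the loop alive on an empty pipe.

```cpp
#include <cassert>
#include <cerrno>

#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: reads from a non-blocking descriptor until it is empty.
// Returns the number of bytes discarded, or -1 on an unexpected error.
long DrainPipe(int read_fd)
{
    char buf[128];
    long total = 0;
    while (true) {
        const ssize_t n = read(read_fd, buf, sizeof(buf));
        if (n > 0) {
            total += n; // more data may be pending, keep reading
            continue;
        }
        if (n == 0) break;            // write end closed, nothing more will arrive
        if (errno == EINTR) continue; // interrupted by a signal, retry
        if (errno == EAGAIN || errno == EWOULDBLOCK) break; // pipe is empty
        return -1;
    }
    return total;
}
```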

This pull request has conflicts, please rebase.

@kwvg kwvg force-pushed the refac_net branch 2 times, most recently from d204699 to 2c978cc Compare May 10, 2024 18:52
@kwvg kwvg marked this pull request as ready for review May 10, 2024 21:53
@kwvg kwvg requested review from knst, UdjinM6 and PastaPastaPasta May 10, 2024 21:53
@kwvg kwvg marked this pull request as draft May 10, 2024 21:55
@kwvg kwvg requested review from knst, UdjinM6 and PastaPastaPasta and removed request for knst, UdjinM6 and PastaPastaPasta May 10, 2024 21:56
@kwvg kwvg marked this pull request as ready for review May 12, 2024 10:49
@UdjinM6 UdjinM6 added this to the 21 milestone May 12, 2024
UdjinM6 previously approved these changes May 13, 2024
@UdjinM6 UdjinM6 left a comment

LGTM, light ACK (on mac)

return;
#endif /* USE_KQUEUE */
} else {
assert(false);
Member

why not use something like __builtin_unreachable?

Collaborator Author

__builtin_unreachable is a compiler hint that lets the path be optimized out, but it is within the realm of possibility that the path might be taken due to erroneous code. As it is not an assertion that the code path isn't taken, assert seemed more suitable for the job.
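
A small sketch of the distinction, assuming a GCC/Clang toolchain and a build that keeps assertions enabled. SketchMode and ModeName() are hypothetical and not taken from this PR.

```cpp
#include <cassert>
#include <cstring>

enum class SketchMode { Select, Poll, EPoll, KQueue };

const char* ModeName(SketchMode mode)
{
    if (mode == SketchMode::Select) return "select";
    if (mode == SketchMode::Poll) return "poll";
    if (mode == SketchMode::EPoll) return "epoll";
    if (mode == SketchMode::KQueue) return "kqueue";
    // No valid enumerator reaches this point. If buggy code passes an
    // out-of-range value anyway, assert(false) aborts with a diagnostic.
    // __builtin_unreachable() would instead make reaching this point
    // undefined behaviour, since it promises the compiler that the path
    // is never taken rather than checking it.
    assert(false);
    return "unknown";
}
```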

#if defined(USE_KQUEUE) || defined(USE_EPOLL)
close(m_fd);
#else
assert(false);
Member

why not use something like __builtin_unreachable?

Member

@PastaPastaPasta PastaPastaPasta left a comment

utACK bd8b5d4

@PastaPastaPasta PastaPastaPasta closed this pull request by merging all changes into dashpay:develop in 3b0323a May 14, 2024
@UdjinM6 UdjinM6 mentioned this pull request May 17, 2024
PastaPastaPasta added a commit that referenced this pull request May 18, 2024
113b3fe fix: actually use `-socketevents` (UdjinM6)

Pull request description:

  ## Issue being fixed or feature implemented
  #6007 follow-up

  ## What was done?

  ## How Has This Been Tested?
  check `socketevents` in `getnetworkinfo` response

  ## Breaking Changes
  n/a

  ## Checklist:
  - [x] I have performed a self-review of my own code
  - [ ] I have commented my code, particularly in hard-to-understand areas
  - [ ] I have added or updated relevant unit/integration/functional/e2e tests
  - [ ] I have made corresponding changes to the documentation
  - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  kwvg:
    ACK 113b3fe
  PastaPastaPasta:
    utACK 113b3fe

Tree-SHA512: 50dcbdfe1f34e42e71078b585cfed2cd6b07f5f08c8296c7205367043e42e676c2eca47fa5193fdb9071eef202b01ba6e44ae2e3affb59a4e94196ecb6eb4350
PastaPastaPasta added a commit that referenced this pull request Jul 9, 2025
…Events()` with mockable `Sock::WaitMany()`), implement `Sock::WaitMany{Epoll,KQueue}()`

c611fb0 fix: set `g_socket_events_mode` before starting `CConnman` (UdjinM6)
c6e0e96 chore: remove scaffolding (remove default args, make explicit choice) (Kittywhiskers Van Gogh)
aca5ec9 chore: remove scaffolding (SEM must be correct, no graceful fallback) (Kittywhiskers Van Gogh)
08a42c1 refactor: move `DEFAULT_SOCKETEVENTS` to `util/sock.h` (Kittywhiskers Van Gogh)
e4cc5ac net: implement `ToggleWakeupPipe` in all WaitMany variants (Kittywhiskers Van Gogh)
f01a871 net: add early bail out condition for empty `events_per_sock` for LTMs (Kittywhiskers Van Gogh)
5ae6f2a fix: merge `kqueue` events manually as they are not bitwise OR'ed (Kittywhiskers Van Gogh)
0d92d40 net: implement `WaitMany` variants for {`epoll`, `kqueue`} (Kittywhiskers Van Gogh)
0a8b8a6 merge bitcoin#24356: replace CConnman::SocketEvents() with mockable Sock::WaitMany() (Kittywhiskers Van Gogh)
4a7114f refactor: clean up `CConnman::SocketWait{Epoll,Kqueue}()` logic (Kittywhiskers Van Gogh)
a33f88f net: reintroduce `IsSelectableSocket()` and make it SEM-aware (Kittywhiskers Van Gogh)
41eaed2 net: allow selection of `Wait()` API by specifying `SocketEventsMode` (Kittywhiskers Van Gogh)
ca1ec0b net: split out `poll` and `select` variants from `Sock::Wait()` (Kittywhiskers Van Gogh)
7cffc0b fix: drain before winding down `WakeupPipe` object to avoid `SIGPIPE` (Kittywhiskers Van Gogh)
b69c1a1 fix: avoid dangling pipes during failed `WakeupPipe` initialization (Kittywhiskers Van Gogh)

Pull request description:

  ## Additional information

  * Dependent on #6004
  * Dependent on #6007
  * Dependent on #6027
  * Deviations from upstream

      | Bitcoin                                                      | Dash                                                         | Reason                                                       |
      | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
      | `EventsPerSock` is an unordered map of `shared_ptr`s of `Sock` **wrappers** and `Events` | `EventsPerSock` is an unordered map of **raw** socket file descriptors (`SOCKET`) and `Events` | Dash implements `WakeupPipes`, which is constructed and destroyed using an entity outside `Sock`'s control. We need to be able to insert the read pipe raw socket into the equivalent of the `recv` socket set and query for it later on.<br /><br />It would be _technically_ possible, though cumbersome, to wrap the read pipe raw socket in a `Sock` and override the destructor if it weren't for the support of edge-triggered modes, which have an event-socket relationship, as opposed to level-triggered modes, which have a socket-event relationship. |
      | Sockets passed in an `EventsPerSock` map will **always** return with event data for every corresponding entry. | Sockets passed in an `EventsPerSock` map **may** return with event data for their corresponding entries. | The behaviour defined for Bitcoin will also be present in Dash **if** the socket events mode (SEM) is `poll` or `select`. Otherwise, it will behave as described.<br /><br />This is due to the inversion of the socket-event relationship in edge-triggered modes (`epoll` and `kqueue`), as alluded to earlier. As edge-triggered modes return events and their corresponding socket (sockets registered through `EdgeTriggeredEvents::RegisterEntity()` and friends), the `EventsPerSock` map will **have its contents completely discarded** and substituted with the results of {`epoll`, `kqueue`}. |
      | You **must** have a `Sock` entity to call `Sock::WaitMany()` | You can directly access `Sock::WaitMany()`'s underlying logic through calling `Sock::WaitManyInternal()` (and access any *specific* event mode's implementation) **without** a `Sock` entity. | This change has been made as Bitcoin's behaviour was to call `WaitMany` by seeking to the first element to access it. This was possible because the unordered map consisted of `Sock` entities. As that isn't the case for Dash and `WaitMany` doesn't truly rely on instance-specific member values of a particular `Sock` instance (the values it relies on should remain constant throughout program runtime), it can be safely made a `static` function and that was exactly what was done.<br /><br />It has been named `WaitManyInternal()` as one of `Sock`'s purposes is mockability and `WaitMany()` (simply a passthrough to `WaitManyInternal()`) has been defined as a `virtual` function. |
      | `Sock`'s usage of platform-specific APIs is *decided* exclusively at **compile-time**. | `Sock`'s usage of platform-specific APIs is determined by what is *supported* at compile-time and *decided* at **runtime** (mostly). | `Sock::Wait()` (which is transformed into `Sock::WaitMany()` in this pull request) supported only `poll` and `select` and behaved as described for Bitcoin.<br /><br />The described behaviour for Dash was only applicable for `CConnman::SocketEvents()`. But, as `SocketEvents()` is being replaced wholesale with `WaitMany()`, `WaitMany()` needed to be adapted to mirror `SocketEvents()` behaviour.<br /><br />This has resulted in changes that now require knowledge of the expected runtime SEM and file descriptor (if using an edge-triggered mode). |
      | `Sock::Wait()` and `Sock::WaitMany()` behave **identically** | `Sock::Wait()` will respect the SEM selection argument **if** it is level-triggered **but** will fallback to `poll` or `select` (determined at compile-time) **if** the SEM selection is edge-triggered. | Due to the event-socket relationship of edge-triggered modes, they are unsuitable for querying the state of a *particular* socket.<br /><br />Because of that and **a)** the unlikelihood of the socket probed being registered with `EdgeTriggeredEvents::RegisterEntity()` and **b)** the overhead involved in fetching a list, filtering out for the particular socket we care about and flagging the result, it is more practical to use an LT-SEM instead. |
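
As a rough illustration of the third row, the sketch below shows a `virtual` `WaitMany()` that simply forwards to a `static` `WaitManyInternal()`, operating on a map keyed by raw file descriptors. It is written under assumptions: `SketchSock` is a hypothetical class, only a `poll`-based path is shown, and the SEM selection and edge-triggered file descriptor that the table says Dash's version needs are left out.

```cpp
#include <cassert>
#include <chrono>
#include <unordered_map>
#include <vector>

#include <poll.h>
#include <unistd.h>

using Event = unsigned char;
constexpr Event RECV = 0x01;
constexpr Event SEND = 0x02;
constexpr Event ERR = 0x04;

struct Events {
    explicit Events(Event req) : requested{req} {}
    Event requested;
    Event occurred{0};
};

// Keyed by raw descriptor rather than by a Sock wrapper.
using EventsPerSock = std::unordered_map<int, Events>;

class SketchSock
{
public:
    virtual ~SketchSock() = default;

    // Mockable entry point: a test double can override this.
    virtual bool WaitMany(std::chrono::milliseconds timeout, EventsPerSock& events_per_sock) const
    {
        return WaitManyInternal(timeout, events_per_sock);
    }

    // Needs no instance state, so it can be called without a Sock object.
    static bool WaitManyInternal(std::chrono::milliseconds timeout, EventsPerSock& events_per_sock)
    {
        std::vector<pollfd> pfds;
        pfds.reserve(events_per_sock.size());
        for (const auto& entry : events_per_sock) {
            pollfd pfd{};
            pfd.fd = entry.first;
            if (entry.second.requested & RECV) pfd.events |= POLLIN;
            if (entry.second.requested & SEND) pfd.events |= POLLOUT;
            pfds.push_back(pfd);
        }
        if (poll(pfds.data(), pfds.size(), static_cast<int>(timeout.count())) < 0) {
            return false;
        }
        for (const pollfd& pfd : pfds) {
            Events& events = events_per_sock.at(pfd.fd);
            events.occurred = 0;
            if (pfd.revents & POLLIN) events.occurred |= RECV;
            if (pfd.revents & POLLOUT) events.occurred |= SEND;
            if (pfd.revents & (POLLERR | POLLHUP)) events.occurred |= ERR;
        }
        return true;
    }
};
```

Because the level-triggered path writes a result for every entry it was given, it matches the "always returns event data" behaviour in the second row; an edge-triggered variant would instead rebuild the map from whatever the kernel reports.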

  ## How Has This Been Tested?

  The `select`, `poll` and `epoll` SEMs were tested for correctness using a Debian 12 (`bookworm`) Docker container with additional logging to ensure the correct syscalls were being made. The `kqueue` SEM was tested for correctness using a GhostBSD 23.10.1 (based on FreeBSD 13.2-STABLE) virtual machine with similar additional logging.

  ## Breaking Changes

  None expected. Behaviour should remain unchanged.

  ## Checklist
    _Go over all the following points, and put an `x` in all the boxes that apply._
  - [x] I have performed a self-review of my own code
  - [x] I have commented my code, particularly in hard-to-understand areas
  - [x] I have added or updated relevant unit/integration/functional/e2e tests
  - [x] I have made corresponding changes to the documentation **(note: N/A)**
  - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  UdjinM6:
    re-utACK c611fb0
  PastaPastaPasta:
    utACK c611fb0

Tree-SHA512: 5daf093eafca94f4a3aad0ed4ee8b3d153c270b45294ef15c6b95bd83209a9bbc2212f88d1fe43c370b3e744e529c654c9530d7c0d7a0398bc0c3967fb362e5a