Add "prepare" and "check" watchers #793

mergeconflict · 2019-03-26T17:38:42Z

Adds two new callbacks: "prepare" watchers, which fire immediately before we poll for I/O, and "check" watchers, which fire immediately after we finish polling and before we process events. This allows other event loops to be embedded into libevent's, and enables certain performance monitoring.

Closes: #710

mergeconflict

Some notes and questions inline for reviewers. Thanks in advance!

CMakeLists.txt

evwatcher.c

include/event2/event_watcher.h

mergeconflict · 2019-03-26T17:51:47Z

cc @htuch

coveralls · 2019-03-26T18:04:32Z

Coverage increased (+0.06%) to 81.326% when pulling 2f184f8 on mergeconflict:prepare_and_check_watchers into 47d348a on libevent:master.

evwatcher.c

azat

So this looks good, but requires some modifications:

comments
windows build is broken
code and declaration mixes
the test fails on ci

test-prepare-check: /home/travis/build/libevent/libevent/test/test-prepare-check.c:73: prepare_callback_1: Assertion `info->timeout->tv_sec == 1' failed.

can you relax this check? i.e. allow small difference like 50ms or so (take a look at test_timeval_diff_eq)

also maybe it is better to move this into regress_watchers.c to use tinytest asserts there (to know the expected/actual values)

Also it will be great to have some good example (under sample/), like the histograms that had been mentioned in the #710 (and then you refer to it from the patches to the book)

.gitignore

event-internal.h

event.c

evwatcher.c

include/event2/event_watcher.h

test/test-prepare-check.c

mergeconflict · 2019-03-27T19:21:45Z

@azat I still haven't added the example under sample/ you asked for -- I'll work on that next...

event.c

azat

Comments after a brief look in the review (I will take a closer look before merge)

test/regress_watcher.c

event-internal.h

include/event2/event_watcher.h

azat · 2019-03-28T14:46:39Z

Thinking a bit about overhead... I'm wondering if we should gate this new feature behind a compilation flag, like:

The overhead should be neglible (one more lock/unlock, BTW we can get rid off the impact by checking if there are any watchers and if not we can skip lock/foreack/unlock, but again I don't think that even this simple condition worth it) since event_base_loop() should not be called tons of times (by those who cares about performance). Also introducing yet another compilation option increases coverage matrix (and it is already huge enough).

azat · 2019-03-28T14:52:51Z

Hmmm. I don't feel strongly about this, but let me argue a little bit: I was thinking that the `const` provides enough safety -- if somebody casts away the `const`, it's their own fault. And maybe it's not actually a bad thing for users to modify the timeout? I don't know :)

I'm not sure that we should allow changing this time, and even if we should, then explicitly via some API, like event_prepare_set_{loop/base}_timeout() not via hackish casts (personally I, don't care about architectures other then x86/amd64, but this const-cast-away can break something for them). But maybe I'm too suspicious about this...

int event_prepare_get_timeout(const struct event_prepare_cb_info *info, struct timeval *timeout) { if (info->timeout) { *timeout = *(info->timeout); return 1; } return 0; }

Looks good.

azat

So everything looks good!

Can you do the final bits then:

squash all of them into one
write description for the commit
add entry to the whatsnews-2.2.txt

Thanks for rapid reaction!

event.c

mergeconflict · 2019-03-29T02:11:36Z

Ok... I've added an example of how we might use prepare and check watchers to collect timing statistics. There's a fun implementation of constant-space histograms in there... it might be nice to turn that into a proper library at some point. @htuch, @mattklein123 - maybe you'd be interested in having a look at the example as well.

I haven't finished addressing all the style comments yet. I'll make another pass tomorrow.

mergeconflict · 2019-03-29T15:33:51Z

Hi @azat, thanks for all your feedback! I think I've addressed all the style issues, I've added a "what's new" entry, and I've updated the issue description.

I just want to double-check -- you want me to git rebase -i locally to squash all the commits, using this issue description as the commit message, and then force-push the branch -- is that right?

azat · 2019-03-29T15:46:32Z

I think I've addressed all the style issues, I've added a "what's new" entry, and I've updated the issue description.

Thanks! Will take a closer look today or at least tomorrow

I just want to double-check -- you want me to git rebase -i locally to squash all the commits, using this issue description as the commit message, and then force-push the branch -- is that right?

Yep git rebase locally, squash all the commits, update commit description and git push --force

azat

Please take a look at the comments, thanks!

event.c

sample/watcher-timing.c

evwatcher.c

sample/watcher-timing.c

evwatcher.c

sample/watcher-timing.c

azat · 2019-04-01T15:14:59Z

+#pragma GCC diagnostic ignored "-Wfloat-equal" I get the following warning without it: sample/watcher-timing.c:43:11: warning: comparing floating point with == or != is unsafe [-Wfloat-equal] return a == b; ~ ^ ~

Got it, thanks! (yet another difference between autotools and cmake, will fix this later).

mergeconflict · 2019-04-01T17:15:05Z

@azat updated and rebased. The lock/unlock was a good catch. Thanks again for the thorough review!

azat

Thanks for the quick reaction!
There some minor issues with the sample but other then this everything looks pretty good!

sample/watch-timing.c

mergeconflict · 2019-04-02T19:24:37Z

Hi @azat, everything's updated. Let me know if there's anything else!

azat · 2019-04-02T19:34:13Z

@mergeconflict everything looks good!

And while thinking about this one more time, I think that user-passed argument will not be superfluous, i.e. evwatch_prepare_new(struct event_base *base, evwatch_prepare_cb callback, void *arg), can you add this? (and I think that this is the last bit, sorry for not noticing this before)

mergeconflict · 2019-04-02T19:41:43Z

And while thinking about this one more time, I think that user-passed argument will not be superfluous, i.e. evwatch_prepare_new(struct event_base *base, evwatch_prepare_cb callback, void *arg), can you add this? (and I think that this is the last bit, sorry for not noticing this before)

Sure, no problem. I was also actually thinking it might be handy. I'll put the *arg as an additional argument in the callbacks, after the evwatch_*_cb_info to stay consistent with how the evbuffer callbacks are designed.

azat · 2019-04-02T20:00:33Z

So everything looks great, will apply this once the CI will be passed (or almost passed)
Thanks for the quick reaction!

azat · 2019-04-02T20:11:20Z

One more thing pops up evwatch_check_cb_info is empty, and while this is accepted by gcc it is not strictly valid and VS do not pass this:

c:\projects\libevent\event-internal.h(221): error C2016: C requires that a struct or union has at least one member [C:\projects\libevent\build-cmake\event_core_shared.vcxproj]

Maybe this can be suppressed with some kind of pragma, but I think that better option will be to add one field there instead, like the time the dispatch took

azat · 2019-04-02T20:24:49Z

And commit message should be formatted appropriate, i.e.:

subject
<   blank line   >
body

since otherwise git will think that the whole commit message is the subject (without empty line)

(first link from google https://chris.beams.io/posts/git-commit/#seven-rules)

mergeconflict · 2019-04-02T21:01:24Z

Maybe this can be suppressed with some kind of pragma, but I think that better option will be to add one field there instead, like the time the dispatch took

Interesting, that's actually very helpful for us -- it would mean that we don't have to save the prepare and check timestamps externally. I'm thinking to rename the prepare timeout expected_timeout and call the check timeout actual_timeout, so that it's obvious to users that you can compare the actual and expected values. What do you think?

azat · 2019-04-02T21:20:53Z

Interesting, that's actually very helpful for us -- it would mean that we don't have to save the prepare and check timestamps externally. I'm thinking to rename the prepare timeout expected_timeout and call the check timeout actual_timeout, so that it's obvious to users that you can compare the actual and expected values. What do you think?

I like the idea in general, but don't like such long fields. I guess it is fine here

mergeconflict · 2019-04-02T21:57:00Z

Ok... I've implemented timing, but I haven't squashed the commits yet, since I'm not sure if you'll be ok with it. Specifically, to get accurate timing, I've had to add two evutil_gettimeofday calls before and after evsel->dispatch, which might be unacceptable overhead in some cases. I tried using the cached time, but it was too inaccurate (like 3000us compared to about 1000us expected).

Of course I'd be very happy to keep this implementation, since it supports my use case directly, but I understand if it's not good for others. If that's the case, then we'll solve the Win32 empty struct issue by simply adding a dummy unused field instead.

mergeconflict · 2019-04-02T22:12:24Z

Hmm... I just noticed I have a link error on Windows:

LINK : fatal error LNK1104: cannot open file 'm.lib' [C:\projects\libevent\build-cmake\watch-timing.vcxproj]

Trying a fix...

azat · 2019-04-03T05:10:17Z

to get accurate timing, I've had to add two evutil_gettimeofday calls before and after evsel->dispatch, which might be unacceptable overhead in some cases

Well if we are talking about modern linux (but maybe I missed something, since this is 5min experience) then I don't think that cached version is worth it, and just out of curiosity I wrote simple test and this assumption is correct (actually non cached version is a little bit faster)

But we have lots of other supported architectures, so I let's use cached version here and this will be addressed later

(because since we do have this event_base_gettimeofday_cached() now we should use it, if we do not want it, it should be removed completely in one pass)

mergeconflict · 2019-04-03T15:41:24Z

But we have lots of other supported architectures, so I let's use cached version here and this will be addressed later

The issue with using the cached time is that it's too inaccurate, at least for the main use case we care about in Envoy. The goal of measuring the actual vs. expected polling duration is that if you compare the two, the actual duration should be very close to the expected duration - within one kernel timeslice usually. If the delay is larger, it's an indication that your kernel is swamped. Using the cached time will give the misleading impression that something's wrong...

My suggestion is, let's leave this out for now. In the future, if we get rid of the cached time (since your experimental results seem good), then we can put the accurate timing here. Does that sound ok?

azat · 2019-04-03T16:22:25Z

My suggestion is, let's leave this out for now. In the future, if we get rid of the cached time (since your experimental results seem good), then we can put the accurate timing here. Does that sound ok?

Agree

Adds two new callbacks: "prepare" watchers, which fire immediately before we poll for I/O, and "check" watchers, which fire immediately after we finish polling and before we process events. This allows other event loops to be embedded into libevent's, and enables certain performance monitoring. Closes: libevent#710

mergeconflict · 2019-04-03T16:47:36Z

Ok, cool. I've updated:

Revert the timing thing
Fix the empty struct issue by adding a dummy unused field
Fix the git commit message
Squashed and rebased against the latest changes in master

azat · 2019-04-03T17:41:44Z

Applied, thanks for the effort (endless rebases and so forth)!

mergeconflict · 2019-04-03T18:35:20Z

Thanks @azat!

Add per-thread dispatcher statistics for loop duration and poll delay, based on new "prepare" and "check" watchers added in libevent (libevent/libevent#793). See discussion in #4952. Risk Level: medium Testing: Added unit test, all existing tests pass, and running locally yields sane results. Docs Changes: Added a new page on "performance" discussing event loop. Release Notes: Added an entry noting new stats. Signed-off-by: Dan Rosen <mergeconflict@google.com>

@htuch

@htuch discovered a race condition in my libevent watcher implementation in the process of enabling TSAN for dependencies (#6610). Update libevent to pull in the fix (libevent/libevent#793). Risk Level: low Testing: bazel test //test/server:worker_impl_test -c dbg --config=clang-tsan --runs_per_test=1000 (with @htuch's patch applied). Signed-off-by: Dan Rosen <mergeconflict@google.com>

The previous attempt (https://crrev.com/c/2844573) to fix deadlocks when the browser is using Wayland EGL was not successful. While it did fix the deadlock, it brough other problems (see crbug.com/1280424 and crbug.com/1287664). It was decided to revisit the implementation and discuss this problem with Wayland developers (see the "Wayland fd watching" thread in the Wayland dev ML at https://bit.ly/3nBagQW). The presented solution still fixes the deadlocks, and also presents a better and faster integration as it avoids constant context switches. TL;DR: Prepare/read/dispatch dance[1] must be integrated in the libglib and libevent in such a way that they are called in the correct order whenever they enter into a new loop cycle. That is, Libglib provides several callbacks via GSourceFuncs (such as prepare/check/dispatch/finalize), which are all called in a defined order. Failing to do correct calls might result in deadlocks if there are other clients like Wayland EGL who also want to read (please note Wayland EGL is only used when there are either no drm render nodes or no libgbm support or Wayland does not support necessary protocols, and --in-process-gpu flag is passed). Libevent also provides watcher callbacks (see [2]), but given Chromium still has been using old version of libevent, it is impossible to do correct integration now (see crbug.com/1288181). Instead, I have made it use old integration that we have been using for several years before adding a polling thread. Please note that LaCros uses Libevent and Linux uses Libglib by default. As a bonus, this CL also fixes 1172305 [1] https://bit.ly/3tCjobF [2] libevent/libevent#793 Bug: 1280424, 1287664, 1172305 Change-Id: I79d061efe64b89c14c8c898e4e61dbf8821efa4c Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3386963 Reviewed-by: Robert Kroeger <rjkroege@chromium.org> Commit-Queue: Maksim Sisov <msisov@igalia.com> Cr-Commit-Position: refs/heads/main@{#961827}

mergeconflict commented Mar 26, 2019

View reviewed changes

evwatcher.c Outdated Show resolved Hide resolved

mergeconflict commented Mar 26, 2019

View reviewed changes

evwatcher.c Outdated Show resolved Hide resolved

mergeconflict mentioned this pull request Mar 26, 2019

Loop duration and wakeup delay statistics #710

Closed

azat requested changes Mar 26, 2019

View reviewed changes

mergeconflict commented Mar 27, 2019

View reviewed changes

event.c Show resolved Hide resolved

azat requested changes Mar 27, 2019

View reviewed changes

azat reviewed Mar 28, 2019

View reviewed changes

event.c Outdated Show resolved Hide resolved

azat added the type:feature label Mar 28, 2019

mergeconflict force-pushed the prepare_and_check_watchers branch 3 times, most recently from 7b1755b to 0a4e5e0 Compare March 29, 2019 16:38

azat requested changes Mar 30, 2019

View reviewed changes

mergeconflict force-pushed the prepare_and_check_watchers branch from 0a4e5e0 to ec1b31b Compare April 1, 2019 17:13

azat requested changes Apr 1, 2019

View reviewed changes

sample/watch-timing.c Show resolved Hide resolved

sample/watch-timing.c Show resolved Hide resolved

sample/watch-timing.c Outdated Show resolved Hide resolved

sample/watch-timing.c Show resolved Hide resolved

mergeconflict force-pushed the prepare_and_check_watchers branch from ec1b31b to 71a7898 Compare April 1, 2019 21:01

mergeconflict force-pushed the prepare_and_check_watchers branch from 71a7898 to 956251d Compare April 2, 2019 19:56

mergeconflict force-pushed the prepare_and_check_watchers branch from 8d8a8f1 to 2f184f8 Compare April 3, 2019 16:45

azat merged commit 2f184f8 into libevent:master Apr 3, 2019

mergeconflict deleted the prepare_and_check_watchers branch April 3, 2019 18:35

mergeconflict mentioned this pull request Apr 8, 2019

event: add stats for loop duration and poll delay envoyproxy/envoy#6517

Merged

mergeconflict mentioned this pull request Apr 18, 2019

event: update libevent dependency to fix race condition envoyproxy/envoy#6637

Merged

htuch mentioned this pull request May 15, 2020

libevent replacement with libev envoyproxy/envoy#4952

Closed

Add "prepare" and "check" watchers #793

Add "prepare" and "check" watchers #793

Uh oh!

Conversation

mergeconflict commented Mar 26, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mergeconflict left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mergeconflict commented Mar 26, 2019

Uh oh!

coveralls commented Mar 26, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

azat left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mergeconflict commented Mar 27, 2019

Uh oh!

Uh oh!

azat left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

azat commented Mar 28, 2019 via email

Uh oh!

azat commented Mar 28, 2019 via email

Uh oh!

azat left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

mergeconflict commented Mar 29, 2019

Uh oh!

mergeconflict commented Mar 29, 2019

Uh oh!

azat commented Mar 29, 2019 via email

Uh oh!

azat left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

azat commented Apr 1, 2019 via email

Uh oh!

mergeconflict commented Apr 1, 2019

Uh oh!

azat left a comment

Choose a reason for hiding this comment

mergeconflict commented Mar 26, 2019 •

edited

Loading

coveralls commented Mar 26, 2019 •

edited

Loading

azat commented Apr 2, 2019 •

edited

Loading

azat commented Apr 3, 2019 •

edited

Loading