event: add stats for loop duration and poll delay #6517
Conversation
Signed-off-by: Dan Rosen <mergeconflict@google.com>
Some questions and comments for reviewers below.
/review @htuch @mattklein123
test/mocks/event/mocks.cc
Outdated
@@ -16,6 +16,7 @@ namespace Envoy {
namespace Event {

MockDispatcher::MockDispatcher() {
  ON_CALL(*this, initializeStats(_)).WillByDefault(Return());
This is needed for a handful of existing tests.
Are you going to write some new tests? I don't think we need to do anything super fancy with mock time etc., but if we could just run a dispatcher and then check some invariants, e.g. that the histograms are non-empty, the delays are >= expected, I think that would be sufficient.
I've written one in DispatcherImplTest to verify that we create the expected stats. I haven't written anything to assert that the stats are populated, since I assumed that would be too flaky, but lemme think about that a bit more...
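For reference, a minimal sketch of what that kind of unit test can look like (the scope handling, stat name suffix, and iteration helpers here are illustrative, not necessarily what the PR ends up with):

```cpp
// Hypothetical shape of the DispatcherImplTest stats check. Assumes a fixture
// exposing dispatcher_ and that initializeStats() takes a parent scope, per
// the diff above; the asserted stat name suffix is illustrative.
TEST_F(DispatcherImplTest, InitializeStats) {
  Stats::IsolatedStoreImpl store;
  dispatcher_->initializeStats(store);

  // Verify that the per-dispatcher histograms were created.
  bool found_poll_delay = false;
  for (const auto& histogram : store.histograms()) {
    if (absl::EndsWith(histogram->name(), "poll_delay_us")) {
      found_poll_delay = true;
    }
  }
  EXPECT_TRUE(found_poll_delay);
}
```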
Very concise and clear, I like it. Excited to see this appearing in Envoy!
/wait
@@ -41,6 +41,14 @@ DispatcherImpl::DispatcherImpl(Buffer::WatermarkFactoryPtr&& factory, Api::Api&

DispatcherImpl::~DispatcherImpl() {}

void DispatcherImpl::initializeStats(Stats::Scope& parent_scope) {
  std::string thread_id = api_.threadFactory().currentThreadId()->debugString();
Prefer not to use a debug string interface, because this is subject to change, i.e. there are no long term guarantees on debug representations.
The ThreadId class doesn't really offer anything better currently. I could do a bit of additional spelunking and add an int64_t id() const method there, but I also wonder if the underlying platform-specific thread IDs are hidden from the interface on purpose.
It is hidden on purpose. How do you think we can provide a stable ID given that debug is not guaranteed to be that?
Could we just assign workers an ordinal ID, e.g. [0, 1, 2, ..., N-1] for N workers, and use that? The advantage is that this will be consistent across platforms.
A trivial way would be to just assign sequential IDs ("worker 1", "worker 2"...), which seems fine to me. But then you can't correlate anything going awry in these stats with anything you might see in the logs, which are tagged with actual thread IDs...
What if we have threads log their known name when they start up? That way, you can do the join, albeit with some additional work?
Ok, per @htuch, I now log the thread ID for debugging (in run, below). Note that I can't do it here, because initializeStats isn't necessarily called from the same thread as the dispatcher's event loop.
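The logging ends up looking roughly like this (a sketch only; `run_tid_`, the log line, and the elided loop body are illustrative, not the actual implementation):

```cpp
void DispatcherImpl::run(RunType type) {
  // Log the loop thread's ID so operators can correlate per-dispatcher stats
  // with log lines emitted from this thread.
  run_tid_ = api_.threadFactory().currentThreadId();
  ENVOY_LOG(debug, "running dispatcher on thread {}", run_tid_->debugString());
  // ... existing event loop body unchanged ...
}
```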
Signed-off-by: Dan Rosen <mergeconflict@google.com>
/retest
🔨 rebuilding
Signed-off-by: Dan Rosen <mergeconflict@google.com>
include/envoy/event/dispatcher.h
Outdated
* Start writing stats once thread-local storage is ready to receive them (see
* ThreadLocalStoreImpl::initializeThreading).
* @param parent_scope the parent scope for the new per-dispatcher stats scope created here.
* @param name the name for the new stats scope, identifying this dispatcher.
What happens to stats for dispatchers that are allocated and run without calling this? I think this might be OK, but just specify it here. Auto-name them?
What I've put in the docs, at least for now, is that we only record event loop stats for the main thread and workers, not any other auxiliary threads. I assume other threads might be transient, and it would just be noisy to include stats for them automatically. What do you think?
I think what you have is fine for now.
I think that's OK, but can we (a) document that here and (b) have a stat for the number of anonymous untracked threads with dispatchers?
You can also survey source/.../*.cc for references to allocateDispatcher(). There's at least one you should probably add, in server/guarddog_impl.cc
Very cool. Flushing out some misc comments.
/wait-any
// feeling saucy. Disregard negative delays in stats, since they don't indicate anything
// particularly useful.
if (delay.tv_sec >= 0) {
  self->stats_->poll_delay_us_.recordValue(delay.tv_sec * 1000000 + delay.tv_usec);
Any reason to not just use `std::chrono` stuff in this file to avoid this math?
The honest answer is, I didn't know it existed :) But it looks like converting from timeval to the std::chrono stuff involves doing that math anyway. I'll factor it out into a helper method, though, just to keep it isolated and eliminate the duplication.
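For example, the helper can be as small as this (a sketch; the function name is made up):

```cpp
namespace {
// Hypothetical helper: keep the timeval-to-microseconds math in one place.
int64_t timevalToMicroseconds(const timeval& tv) {
  return static_cast<int64_t>(tv.tv_sec) * 1000000 + tv.tv_usec;
}
} // namespace

// Usage at the recording site:
// self->stats_->poll_delay_us_.recordValue(timevalToMicroseconds(delay));
```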
Sorry, I think what I'm proposing is to not use any C-style time stuff at all here, and just use a TimeSource interface which can give you monotonic time; then you should be able to use std::chrono::microseconds etc.?
I could get rid of the evutil_gettimeofday calls, but I can't replace the evwatch_prepare_get_timeout call -- that gives me the timeout that libevent has internally decided to pass into epoll/select/whatever.
@@ -137,6 +138,18 @@ void ThreadLocalStoreImpl::initializeThreading(Event::Dispatcher& main_thread_di
  tls_->set([](Event::Dispatcher&) -> ThreadLocal::ThreadLocalObjectSharedPtr {
    return std::make_shared<TlsCache>();
  });

  // Initialize stats for each thread's dispatcher. This can't be done in the InitializeCb passed
Thoughts on whether this is the right place to do this? Should we instead be initializing worker stats in the server or listener manager when we create the workers? It feels like that might be the right place vs. baking this into the store itself? WDYT?
Good question. The way my initial draft implementation worked was, the DispatcherStats object for each DispatcherImpl was initialized in its constructor, rather than allocated on the heap in initializeStats(). I still needed to have a separate initializeStats() method, though, to indicate when it's safe to start actually writing to them, because the workers are allocated before TLS::initializeThreading() is called (meaning there's no slot to write to yet).

The main thing I didn't like about that approach was just that it required plumbing the parent Stats::Scope into DispatcherImpl's ctor, which ended up touching a lot of stuff (e.g. ValidationDispatcher, Api::allocateDispatcher(), etc.).

Having said that, it still might be the right way to go, if we think it's useful for all dispatchers (not just the main and worker threads) to have these stats.
Sorry, I think what I'm proposing is what you are doing, but not initializing dispatcher stats inside the thread-local store, which seems not quite right to me. I would probably actually do this at the server level, after the workers are created and initializeThreading is called. Then you could iterate through all workers and initialize dispatcher stats on each worker. I think this would make it a lot more clear that we are only doing this on workers, and then in the future we could do it on other dispatchers if we want to.
Yeah, I think I understand what you're suggesting. The workers are all managed by the listener manager, so I'll have to do a little plumbing there, but that shouldn't be bad.
Ok, plumbing is added in this revision. Stats are now initialized for each worker in ListenerManagerImpl::startWorkers, which seems more appropriate.
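Conceptually the plumbing amounts to something like this (a sketch; the initializeStats() pass-through on the worker, the `scope_` member, and the ordinal naming are assumptions, not necessarily the final code):

```cpp
void ListenerManagerImpl::startWorkers(GuardDog& guard_dog) {
  uint32_t i = 0;
  for (const auto& worker : workers_) {
    // Give each worker's dispatcher an ordinal-named stats scope
    // ("worker_0", "worker_1", ...), consistent across platforms.
    worker->initializeStats(*scope_, fmt::format("worker_{}", i++));
    // ... existing per-worker startup (adding listeners, starting the
    // worker under the guard dog) unchanged ...
  }
}
```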
I thought the challenge here was that we get the C time struct from libevent now, so it has to be converted to C++ time somewhere. We could centralize that, though.
If that's the case then never mind; I thought we were just getting event callbacks and calling gettimeofday ourselves, in which case we could do whatever we want.
Signed-off-by: Dan Rosen <mergeconflict@google.com>
Looks good modulo Matt's comments and a couple I have. Very clean.
/wait
// Delay can be negative, meaning polling completed early. This happens in normal operation,
// either because I/O was ready before we hit the timeout, or just because the kernel was
// feeling saucy. Disregard negative delays in stats, since they don't indicate anything
Should we discard or record a 0 delay when negative? I'm thinking the latter could be good to ensure we don't get anomalies like the following. Imagine there is a slight negative delay normally, and you see this on a million iterations. But, scattered amongst these, we see 10 iterations where there is an elevated delay. These are really only a tiny fraction and don't indicate a huge problem, but they will appear as 100% with elevated delay in the histogram if we discard here.
Recording the early wakeups as 0 delay would give you a number that means something different from what we're looking for. It might be worth recording as well (although I'm not sure what name I'd give it), but the thing we care about most here is specifically the distribution of positive delays.

One thing I tried to be very clear about in the docs is that delays are normal; they don't indicate a problem if they're within some expected range. For example, if you see the poll_delay_us number floating around a millisecond (or whatever your time slice is), with no long tail, then you're happy. It doesn't mean that every iteration of the event loop has a millisecond delay, it just means that when there is a delay, it's reasonably bounded.

If we included early delays as zeroes, it would skew the histogram such that 0 appears to be the correct ideal delay, and the 1ms delay appears to be problematic tail latency.
What I'm struggling with here is the difference between -1ms, 0ms and 1ms delays. To me, they seem basically the same thing, the kernel woke you up roughly when you expected. Or, is there something special about -1ms vs 0ms?
Ah, yeah. The main difference is that libevent treats a just-slightly-too-early wakeup the same as a way-too-early wakeup: it doesn't process the timeout event and just goes back to sleep. It sometimes does this a few iterations in a row, sleeping for 0us on each, before finally sleeping for a bit longer.
One way to avoid the difference is to have a configurable minimum threshold. For example, don't record any delays less than 100us, because they're not interesting. But I don't know if I'd actually recommend implementing that, at least not until we've observed how this behaves in production.
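If we ever did implement it, it would amount to roughly this (purely illustrative; the constant and its value are made up, and it would presumably be configurable):

```cpp
// Hypothetical minimum-threshold variant: drop tiny delays entirely rather
// than letting early/near-zero wakeups dominate the histogram.
constexpr int64_t kMinRecordedPollDelayUs = 100; // made-up floor
const int64_t delay_us = delay.tv_sec * 1000000 + delay.tv_usec;
if (delay_us >= kMinRecordedPollDelayUs) {
  self->stats_->poll_delay_us_.recordValue(delay_us);
}
```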
@@ -8,12 +8,17 @@
#include "common/stats/isolated_store_impl.h"
Where did we end up with the "should new tests be written?" thread?
We ended up with me dragging my heels. I have the one new unit test here, but I guess it could be worthwhile to add an integration test somewhere, asserting that the stats we expect to populate are actually there...
Yeah, this could easily be added to something like server_test or one of the integration tests. Verify that we have the server and worker histogram stats available. You could also consider posting an event to one of the dispatchers that sleeps for 100ms (real time, not simulated, since that's what libevent uses) and verifying that the epoll duration is at least 100ms.
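i.e., roughly this shape (a sketch; the test helpers, the scope argument to initializeStats, and the stat name in the final assertion are assumptions):

```cpp
#include <chrono>
#include <thread>

// Hypothetical integration-style check: block the event loop for 100ms of
// real time, then verify the loop-duration histogram saw a sample that large.
TEST(DispatcherStatsIntegrationTest, LoopDurationCoversBlockingWork) {
  Api::ApiPtr api = Api::createApiForTest();
  Event::DispatcherPtr dispatcher = api->allocateDispatcher();
  Stats::IsolatedStoreImpl store;
  dispatcher->initializeStats(store, "test");

  dispatcher->post([&dispatcher]() {
    // Real time, not simulated, since libevent measures against the wall clock.
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    dispatcher->exit();
  });
  dispatcher->run(Event::Dispatcher::RunType::Block);

  // Then assert that the "test.dispatcher.loop_duration_us" histogram (name
  // assumed) recorded at least one sample >= 100000us.
}
```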
Well, my first attempt at adding something to server_test was a non-starter: see #6582 for details. I'll try adding to an integration test instead; I may have better luck with that.
Yeah, this is fine, but please do track this, as I think we have some functional validation missing.
Signed-off-by: Dan Rosen <mergeconflict@google.com>
Sweet!
Thanks, this is going to be very useful to operators.
This reverts commit 77748b2. Signed-off-by: Matt Klein <mklein@lyft.com>
Description: Add per-thread dispatcher statistics for loop duration and poll delay, based on new "prepare" and "check" watchers added in libevent (libevent/libevent#793). See discussion in #4952.
Risk Level: medium
Testing: Added unit test, all existing tests pass, and running locally yields sane results.
Docs Changes: Added a new page on "performance" discussing the event loop.
Release Notes: Added an entry noting new stats.
Signed-off-by: Dan Rosen mergeconflict@google.com