-
Notifications
You must be signed in to change notification settings - Fork 5.1k
tracing: Fixes issue with small LightStep reports. #3989
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
cc @jmacd |
@rnburn Please fix tests and ping once ready for a review. |
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
@ccaraman - should be ready for review now. |
[this] { return runtime_.snapshot().getInteger("tracing.lightstep.min_flush_spans", 5U); }}; | ||
|
||
tls_options.max_buffered_spans = std::function<size_t()>{[this] { | ||
return runtime_.snapshot().getInteger("tracing.lightstep.min_flush_spans", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will there be any issues experienced by users when switching to this higher value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We (LightStep) are currently experiencing issues resulting from excessively small Envoy reports. Envoy will consume more memory buffering requests, but have less overhead sending small RPCs to LightStep satellites.
There isn't an ideal setting here, but the current default is an order of magnitude or two out of line with what we'd like.
// creates separate tracers for each thread. | ||
// | ||
// See https://github.com/lightstep/lightstep-tracer-cpp/issues/106 | ||
const size_t LightStepDriver::default_min_flush_spans = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the concurrency value passed in through the command line be used first then default to the hardware concurrency if not specified?
envoy/source/server/options_impl.cc
Line 53 in c92a301
TCLAP::ValueArg<uint32_t> concurrency("", "concurrency", "# of worker threads to run", false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rnburn I gave this more thought and now believe we should just fill in a constant here. Envoy's architecture is that every CPU is an independent worker, so as long as the amount of memory available scales with the number of processors, there's no reason to limit the buffering of LightStep spans according to processor count.
Will you adjust this to use a fixed constant of 200? That avoids the question Constance asks here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update to use 200
[this] { return runtime_.snapshot().getInteger("tracing.lightstep.min_flush_spans", 5U); }}; | ||
|
||
tls_options.max_buffered_spans = std::function<size_t()>{[this] { | ||
return runtime_.snapshot().getInteger("tracing.lightstep.min_flush_spans", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We (LightStep) are currently experiencing issues resulting from excessively small Envoy reports. Envoy will consume more memory buffering requests, but have less overhead sending small RPCs to LightStep satellites.
There isn't an ideal setting here, but the current default is an order of magnitude or two out of line with what we'd like.
// creates separate tracers for each thread. | ||
// | ||
// See https://github.com/lightstep/lightstep-tracer-cpp/issues/106 | ||
const size_t LightStepDriver::default_min_flush_spans = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rnburn I gave this more thought and now believe we should just fill in a constant here. Envoy's architecture is that every CPU is an independent worker, so as long as the amount of memory available scales with the number of processors, there's no reason to limit the buffering of LightStep spans according to processor count.
Will you adjust this to use a fixed constant of 200? That avoids the question Constance asks here.
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 one small nit
@@ -60,6 +60,8 @@ class LightStepDriver : public Common::Ot::OpenTracingDriver { | |||
Runtime::Loader& runtime() { return runtime_; } | |||
LightstepTracerStats& tracerStats() { return tracer_stats_; } | |||
|
|||
static const size_t default_min_flush_spans; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: static const vars are all upper case
@htuch final pass please - LGTM - just one nit |
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, two nits.
@@ -60,6 +60,8 @@ class LightStepDriver : public Common::Ot::OpenTracingDriver { | |||
Runtime::Loader& runtime() { return runtime_; } | |||
LightstepTracerStats& tracerStats() { return tracer_stats_; } | |||
|
|||
static const size_t DEFAULT_MIN_FLUSH_SPANS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Prefer DefaultMinFlushSpans
for Envoy recommended constant style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed.
@@ -36,6 +36,13 @@ void LightStepLogger::operator()(lightstep::LogLevel level, | |||
LightStepDriver::LightStepTransporter::LightStepTransporter(LightStepDriver& driver) | |||
: driver_(driver) {} | |||
|
|||
// If the default min_flush_spans value is too small, the larger number of reports can overwhelm | |||
// LightStep's sattelites. Hence, we need to choose a number that's large enough; though, it's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: typo sattelites
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
Description:
Changes to use a larger default size for LightStep reports sent to satellites. Fixes lightstep/lightstep-tracer-cpp#106
Risk Level: Low