Skip to content

Conversation

rnburn
Copy link
Contributor

@rnburn rnburn commented Jul 30, 2018

Description:
Changes to use a larger default size for LightStep reports sent to satellites. Fixes lightstep/lightstep-tracer-cpp#106

Risk Level: Low

rnburn added 2 commits July 27, 2018 19:14
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
@rnburn
Copy link
Contributor Author

rnburn commented Jul 30, 2018

cc @jmacd

@ccaraman
Copy link
Contributor

@rnburn Please fix tests and ping once ready for a review.

rnburn added 4 commits July 30, 2018 11:40
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
@rnburn
Copy link
Contributor Author

rnburn commented Jul 30, 2018

@ccaraman - should be ready for review now.

[this] { return runtime_.snapshot().getInteger("tracing.lightstep.min_flush_spans", 5U); }};

tls_options.max_buffered_spans = std::function<size_t()>{[this] {
return runtime_.snapshot().getInteger("tracing.lightstep.min_flush_spans",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will there be any issues experienced by users when switching to this higher value?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We (LightStep) are currently experiencing issues resulting from excessively small Envoy reports. Envoy will consume more memory buffering requests, but have less overhead sending small RPCs to LightStep satellites.

There isn't an ideal setting here, but the current default is an order of magnitude or two out of line with what we'd like.

// creates separate tracers for each thread.
//
// See https://github.com/lightstep/lightstep-tracer-cpp/issues/106
const size_t LightStepDriver::default_min_flush_spans =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the concurrency value passed in through the command line be used first then default to the hardware concurrency if not specified?

TCLAP::ValueArg<uint32_t> concurrency("", "concurrency", "# of worker threads to run", false,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rnburn I gave this more thought and now believe we should just fill in a constant here. Envoy's architecture is that every CPU is an independent worker, so as long as the amount of memory available scales with the number of processors, there's no reason to limit the buffering of LightStep spans according to processor count.

Will you adjust this to use a fixed constant of 200? That avoids the question Constance asks here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update to use 200

[this] { return runtime_.snapshot().getInteger("tracing.lightstep.min_flush_spans", 5U); }};

tls_options.max_buffered_spans = std::function<size_t()>{[this] {
return runtime_.snapshot().getInteger("tracing.lightstep.min_flush_spans",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We (LightStep) are currently experiencing issues resulting from excessively small Envoy reports. Envoy will consume more memory buffering requests, but have less overhead sending small RPCs to LightStep satellites.

There isn't an ideal setting here, but the current default is an order of magnitude or two out of line with what we'd like.

// creates separate tracers for each thread.
//
// See https://github.com/lightstep/lightstep-tracer-cpp/issues/106
const size_t LightStepDriver::default_min_flush_spans =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rnburn I gave this more thought and now believe we should just fill in a constant here. Envoy's architecture is that every CPU is an independent worker, so as long as the amount of memory available scales with the number of processors, there's no reason to limit the buffering of LightStep spans according to processor count.

Will you adjust this to use a fixed constant of 200? That avoids the question Constance asks here.

Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
ccaraman
ccaraman previously approved these changes Aug 2, 2018
Copy link
Contributor

@ccaraman ccaraman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 one small nit

@@ -60,6 +60,8 @@ class LightStepDriver : public Common::Ot::OpenTracingDriver {
Runtime::Loader& runtime() { return runtime_; }
LightstepTracerStats& tracerStats() { return tracer_stats_; }

static const size_t default_min_flush_spans;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: static const vars are all upper case

@ccaraman
Copy link
Contributor

ccaraman commented Aug 2, 2018

@htuch final pass please - LGTM - just one nit

Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
mattklein123
mattklein123 previously approved these changes Aug 2, 2018
Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, two nits.

@@ -60,6 +60,8 @@ class LightStepDriver : public Common::Ot::OpenTracingDriver {
Runtime::Loader& runtime() { return runtime_; }
LightstepTracerStats& tracerStats() { return tracer_stats_; }

static const size_t DEFAULT_MIN_FLUSH_SPANS;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Prefer DefaultMinFlushSpans for Envoy recommended constant style.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed.

@@ -36,6 +36,13 @@ void LightStepLogger::operator()(lightstep::LogLevel level,
LightStepDriver::LightStepTransporter::LightStepTransporter(LightStepDriver& driver)
: driver_(driver) {}

// If the default min_flush_spans value is too small, the larger number of reports can overwhelm
// LightStep's sattelites. Hence, we need to choose a number that's large enough; though, it's
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: typo sattelites

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Signed-off-by: Ryan Burn <ryan.burn@gmail.com>
@htuch htuch merged commit 7b03f2e into envoyproxy:master Aug 5, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants