-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Description
This issue assumes that the slog proposal will be accepted.
Background
Adding dynamically scoped information to log output greatly aids debugging: it allows logs to be grouped by server request, and gives readers valuable clues about what was happening along the control flow path that led to the log event. One important piece of information is the trace ID, which is added to the context by tracing systems like OpenTelemetry to construct tracing graphs, a valuable performance and debugging tool. If a log event contains a trace ID, it can be used to annotate the trace in the graph, making it easier to see what was going on in the trace. We’ll focus on the trace ID in this issue for concreteness, but whatever solution emerges from this discussion should work for any value in a context.Context
.
In OpenTelemetry, trace IDs are contained in spans. A trace is started with a call like
ctx, span := trace.Start(ctx, name, ...)
That call starts a new span, makes it a child of the currently active span (if any), and returns a context containing the span as well as the span itself. It is typically followed by
defer span.End()
which ends the trace. In a server, a typical trace will be an entire request, but nested traces can be established anywhere along the call chain.
The problem
The goal of this issue is to find a good way to convey the context to a slog.Logger, and ultimately to a slog.Handler, so the trace ID and other context values can appear in the output. We’ll assume the programmer has already arranged to pass the context itself around, so that all logging occurs inside a function with a context:
func f(ctx context.Context, ...) {
How can we encourage the programmer to include the context in their logging statements?
The key word here is encourage. The slog
package already has a way to include the context:
logger.WithContext(ctx).Info(msg)
but programmers are likely to prefer the shorter
logger.Info(msg)
We also need to consider the related problem of how the logger itself is passed around. Although slog
has a global logger, that logger will not have attributes that may have been added along the call chain with lines like
logger = logger.With("key", "value")
So programs that care about dynamically scoped data will have to pass the logger down the stack along with the context.
Loggers in contexts
slog
's original solution to this problem was to include the logger itself in the context. NewContext
would add the logger to the context:
ctx = slog.NewContext(ctx, logger)
FromContext
would extract it:
logger := slog.FromContext(ctx)
Callers would still have to write
logger.WithContext(ctx).Info(msg)
but we also included a convenience function, Ctx
, that combined FromContext
and WithContext
:
slog.Ctx(ctx).Info(msg)
Before we reject this solution and look elsewhere, let’s see what’s good about it. It simplifies the code, because instead of passing around both a context and logger, only the context needs to be passed. And it encourages programmers to include the context in their log output, because slog.Ctx(ctx)
is a more compact way to obtain a logger than slog.FromContext(ctx)
.
The author of a tracing system could make it even more likely that log output would contain trace IDs, by modifying the Start
method that adds a span to a context. That code could also add the trace ID to the logger in the context:
func Start(ctx context.Context, ...) (context.Context, Span) {
...
logger := slog.FromContext(ctx)
logger = logger.With("traceID", traceID)
return slog.NewContext(ctx, logger), span
}
Now the trace information would be logged as long as the programmer retrieved the logger from the context.
Controversy
Some comments on the proposal issue were in favor of including a logger in a context: @v3n, @rliebz, @darkfeline, @neild @dottedmag, @tigrannajaryan 1, 2, 3, @Aneurysm9.
But others were strongly against it: @prochac 1, 2, 3, @mminklet 1, 2, 3, @rabbit, @jellevandenhooff, @ChrisHines.
The main objection is that putting a logger in a context creates a hidden dependency. When you call a function that takes a context, you don’t know whether the context is for logging, cancellation, or both. Context arguments are added to functions just to allow logging. The objectors understand that adding loggers to contexts might make sense for some applications, but oppose adding it to the standard library for fear of creating a precedent that will lead to widespread abuse of contexts.
Because of these concerns, we removed NewContext
, FromContext
and Ctx
from the proposal. So to obtain a logger that may contain dynamically scoped information, a function must take it as an argument somehow, either directly or as a part of another argument, like a receiver field.
Alternatives
If we don’t add loggers to contexts, then how can we encourage programmers to pass contexts to loggers?
We could do nothing to the API, but educate users to write
logger = logger.WithContext(ctx)
at the top of their functions. This might not seem much different than remembering to write logger := slog.FromContext(ctx)
, but note that previously, FromContext
was the only reasonable way to get a logger with dynamic content. Now we assume a logger is already available to the function, so nothing pushes the programmer towards adding the context to it.
Another popular idea is to require each log output method to take a context:
func (*Logger) Info(ctx context.Context, message string, ...)
...
We don’t want to do that, because it goes too far in the other direction, forcing programmers to pass in a context (or at least consider doing so) for every log message.
A related idea is to add context-aware functions to the ones already present:
func (*Logger) Info(message string, ...) // already in the proposal
func (*Logger) InfoCtx(ctx context.Context, message string, ...)
...
We also don’t want to do that, because it greatly increases the size of the API. It’s also not much different from the existing WithContext
method, especially if we shortened the name to Ctx
:
logger.InfoCtx(ctx, "message")
logger.Ctx(ctx).Info("message")
Another variant of this idea is an adapter type, like @jellevandenhooff’s slogctx package. You would then write
clogger := slogctx.NewLogger(logger)
...
clogger.Info(ctx, "message")
All of these solutions require work in the function doing the logging. What if we could move that work up the call stack? The caller could add the context to the logger whenever the context changed:
// Add a span to the context.
ctx, span := tracer.Start(ctx, ...)
defer span.End()
// Add the context to a logger.
logger = logger.WithContext(ctx)
// Pass the logger down.
f(ctx, logger)
Now f
can use the logger directly and still get trace IDs in the output. Since context changes occur relatively infrequently, this convention should reduce the total amount of programmer toil. More importantly, it puts the responsibility in the hands of the programmer who understands the importance of tracing, and how trace IDs are propagated through contexts.
But we can do even better by using an idea first suggested on the proposal issue by @rabbit. If we are willing to mutate a part of the context, we can avoid having to repeatedly incorporate the context into the logger. I sketched an implementation for OpenTelemetry that associates a list of spans with a context key. Starting a new span appends it to the list, and ending a span removes it from the list. Once a logger contains a context with the list, it can be used repeatedly. That means that logger.WithContext
needs to be called only for the top-level span. Starting a nested span can be done without updating the logger, since doing so mutates (a part of) the context that the logger already contains.
There is one difference in behavior between the purely functional implementation of adding spans to contexts and the mutating one: if a context is discarded (as by function return) without calling span.End
, the functional implementation will drop the span while the mutating one will not. I’m not sure if that’s a bug or a feature.
[Edit: mutating the list of spans in the context does the wrong thing if spans don't follow a stack discipline. It is not recommended for general use.]
Do any of these alternatives adequately ensure that most log events contain trace information? Are there other ideas I’ve missed? Is there anything we can add to or change about slog
to help?