Separating logging from tracing #607

@thanethomson

At present, our logs are problematic: they provide little value to operators and, at the same time, little value to our own team. They appear to target a mix of audiences: primarily core developers, and secondarily operators.

One option is to use different log levels to output different kinds of information (e.g. operator-focused logs at info level and above, but developer-focused logs at debug level).

My alternative proposal here is to totally separate the logging/tracing output for each of these audiences:

  1. Let logs exclusively target operators, providing them with actionable insights into what's going on in the system, especially if there are failures that require operator intervention to recover from.
  2. Introduce traces: a human- and machine-readable output format that is turned off by default and that, when turned on, provides far more fine-grained information than the logs, in order to:
    1. Allow core developers to more easily troubleshoot systems under test or even systems in production.
    2. Potentially help facilitate model-based testing via our E2E tests, since these traces would be machine-readable.
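
To make the split above concrete, here is a minimal sketch of the two output channels side by side. It is an illustration only, not a proposed API: `Tracer`, `make_operator_logger`, `on_proposal_timeout`, and the event and field names are all hypothetical, and Python is used purely for brevity.

```python
import json
import logging


class Tracer:
    """Hypothetical trace emitter: one JSON object per line, off by default."""

    def __init__(self, sink=None, enabled=False):
        self.sink = sink
        # Tracing stays disabled unless it is switched on AND has a sink.
        self.enabled = enabled and sink is not None

    def emit(self, component, event, **fields):
        if not self.enabled:
            return  # default path: no formatting, no I/O
        record = {"component": component, "event": event, **fields}
        self.sink.write(json.dumps(record, sort_keys=True) + "\n")


def make_operator_logger(stream):
    """Operator-facing logger: info level and above, short readable lines."""
    logger = logging.getLogger("operator")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def on_proposal_timeout(logger, tracer, height, round_):
    # Log line: tells the operator what happened and what to check.
    logger.warning(
        "no proposal received at height %d; check connectivity to the proposer",
        height,
    )
    # Trace record: fine-grained detail for developers, only if tracing is on.
    tracer.emit("consensus", "timeout_propose", height=height, round=round_)
```

With a default `Tracer()`, the call produces only the operator log line. Passing `Tracer(sink=some_file, enabled=True)` additionally writes one JSON record per event, which keeps the detailed output out of the operator's view unless it is asked for.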

The specific trace format that @josef-widder recommended looking into here is the Informal Trace Format, ITF (cc @konnov, @shonfeder). Traces could be exposed either by way of special trace JSON files, or via some form of streaming RPC/gRPC endpoint. We already have an ITF parser written in Rust here: https://github.com/informalsystems/itf-rs
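
As a rough idea of what a trace JSON file could contain, the sketch below turns a list of state snapshots into an ITF-shaped document. It reflects my understanding of ITF (a top-level object with `#meta`, `vars`, and `states`, with integers encoded as `{"#bigint": "..."}` and sets as `{"#set": [...]}`), so the details should be checked against the ITF specification. The helper names and the state variables (`height`, `step`, `locked`) are hypothetical.

```python
import json


def itf_value(value):
    """Encode a Python value using ITF conventions (assumed, see lead-in)."""
    if isinstance(value, bool):  # must come before int: bool subclasses int
        return value
    if isinstance(value, int):
        return {"#bigint": str(value)}
    if isinstance(value, str):
        return value
    if isinstance(value, (set, frozenset)):
        # Sorted for deterministic output; elements must be mutually comparable.
        return {"#set": [itf_value(v) for v in sorted(value)]}
    if isinstance(value, list):
        return [itf_value(v) for v in value]
    if isinstance(value, dict):  # treated as a record with string field names
        return {str(k): itf_value(v) for k, v in value.items()}
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def build_itf_trace(states, description):
    """Build an ITF-shaped dict from a non-empty list of state snapshots."""
    if not states:
        raise ValueError("a trace needs at least one state")
    var_names = sorted(states[0])
    itf_states = []
    for index, state in enumerate(states):
        if sorted(state) != var_names:
            raise ValueError(f"state {index} has a different set of variables")
        encoded = {"#meta": {"index": index}}
        encoded.update({name: itf_value(state[name]) for name in var_names})
        itf_states.append(encoded)
    return {
        "#meta": {"format": "ITF", "description": description},
        "vars": var_names,
        "states": itf_states,
    }


if __name__ == "__main__":
    snapshots = [
        {"height": 1, "step": "propose", "locked": False},
        {"height": 1, "step": "prevote", "locked": False},
    ]
    print(json.dumps(build_itf_trace(snapshots, "hypothetical consensus trace"), indent=2))
```

The same document could be written to a file or streamed over an RPC endpoint; the encoding step is independent of the transport.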

If this seems like a reasonable way to help us understand the system better under different conditions, then the first deliverable here would be an ADR describing the solution in more detail, as well as a PoC implementation (perhaps starting with traces for the consensus and/or mempool reactors' operations).

Labels

P:consensus-engine-devs (Priority: Better support consensus engine developers)
P:operator-experience (Priority: Improve experience for operators)
enhancement (New feature or request)
logs (Anything relating to logging)
