Description
This is a project proposed as part of LFX Mentorship term #6470 ⬅ read this first.
Summary
Currently, Jaeger uses a v1 Storage API, which operates on a data model specific to Jaeger. Each storage backend implements this API, requiring transformations between Jaeger's proprietary model and the OpenTelemetry Protocol (OTLP) data model, which is now the industry standard.
As part of #5079, Jaeger has introduced the more efficient v2 Storage API, which natively supports the OpenTelemetry data model (OTLP) and allows batching of writes and streaming of results. This effort is part of a broader alignment with the OpenTelemetry Collector framework, tracked under #4843.
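To make "batching of writes and streaming of results" concrete, here is a minimal sketch of what a batch-oriented, streaming storage interface can look like. Every name in it (`Span`, `Batch`, `V2Writer`, `V2Reader`, `memStore`, `runExample`) is a hypothetical stand-in chosen for illustration; it is not the actual Jaeger v2 Storage API, and `Batch` only mimics the idea of an OTLP payload carrying many spans at once. Streaming is modeled with a plain callback so the sketch does not depend on a specific Go version.

```go
package main

import (
	"context"
	"fmt"
)

// Span is a hypothetical, simplified span (not Jaeger's or OTLP's real type).
type Span struct {
	TraceID string
	Name    string
}

// Batch is a hypothetical stand-in for an OTLP payload holding many spans.
type Batch struct {
	Spans []Span
}

// V2Writer sketches a batch-oriented writer: one call stores many spans.
type V2Writer interface {
	WriteTraces(ctx context.Context, batch Batch) error
}

// V2Reader sketches streaming reads: results are delivered chunk by chunk
// through yield, and the consumer returns false to stop early.
type V2Reader interface {
	FindTraces(ctx context.Context, yield func(Batch) bool) error
}

// memStore is a toy in-memory backend implementing both interfaces.
type memStore struct {
	byTrace map[string][]Span
	order   []string // trace IDs in first-seen order
	writes  int      // number of WriteTraces calls, for illustration
}

func newMemStore() *memStore {
	return &memStore{byTrace: map[string][]Span{}}
}

func (m *memStore) WriteTraces(_ context.Context, batch Batch) error {
	m.writes++
	for _, s := range batch.Spans {
		if _, seen := m.byTrace[s.TraceID]; !seen {
			m.order = append(m.order, s.TraceID)
		}
		m.byTrace[s.TraceID] = append(m.byTrace[s.TraceID], s)
	}
	return nil
}

func (m *memStore) FindTraces(ctx context.Context, yield func(Batch) bool) error {
	for _, id := range m.order {
		if err := ctx.Err(); err != nil {
			return err
		}
		// One chunk per trace; stop as soon as the consumer asks to.
		if !yield(Batch{Spans: m.byTrace[id]}) {
			return nil
		}
	}
	return nil
}

// runExample writes three spans in a single batch, then streams them back.
// It returns the number of write calls and the trace IDs in streamed order.
func runExample() (int, []string, error) {
	ctx := context.Background()
	store := newMemStore()
	var w V2Writer = store
	var r V2Reader = store

	err := w.WriteTraces(ctx, Batch{Spans: []Span{
		{TraceID: "t1", Name: "a"},
		{TraceID: "t2", Name: "b"},
		{TraceID: "t1", Name: "c"},
	}})
	if err != nil {
		return 0, nil, err
	}

	var ids []string
	err = r.FindTraces(ctx, func(b Batch) bool {
		ids = append(ids, b.Spans[0].TraceID)
		return true // keep streaming
	})
	return store.writes, ids, err
}

func main() {
	writes, ids, err := runExample()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("write calls:", writes, "streamed traces:", ids)
}
```

In this sketch the three spans are stored with one write call, and the reader hands traces to the consumer one chunk at a time instead of building the full result set first.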
Objective
Upgrade Jaeger storage backends to natively implement the v2 Storage API. The candidate backends are:
- Memory
- Elasticsearch / OpenSearch
- Badger
- Cassandra
- gRPC / Remote
The chosen storage backend should be upgraded to fully implement the v2 Storage API in place. For a rough idea of how to upgrade from the v1 model to the OTLP data model, take a look at the PRs in the following issues that do a similar upgrade for other components of Jaeger:
- Transform v1 sanitizers into processors or something else #5545
- Implement adjusters to operate on OTLP data format #6344
Desired Outcomes
Upgrade Memory and Elasticsearch backends
We prioritize these two backends because they are the most frequently used with Jaeger, and upgrading them paves the way for upgrading the other backends.
Testing
- The storage implementations should be fully covered by unit tests.
- Integration tests already exist for the storage backends, so all of them should pass without modification.
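One way to picture tests that pass unchanged across backends is a check written only against an interface, so any implementation can be plugged in. The sketch below is an assumption-laden illustration: `Store`, `memStore`, `ErrTraceNotFound`, and `checkRoundTrip` are hypothetical names, not Jaeger's real interfaces or test helpers.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Store is a hypothetical minimal storage interface (not Jaeger's API).
type Store interface {
	WriteSpans(traceID string, names ...string) error
	GetSpanNames(traceID string) ([]string, error)
}

// ErrTraceNotFound is a hypothetical sentinel error for missing traces.
var ErrTraceNotFound = errors.New("trace not found")

// memStore is a toy in-memory implementation of Store.
type memStore struct {
	data map[string][]string
}

func newMemStore() *memStore {
	return &memStore{data: map[string][]string{}}
}

func (m *memStore) WriteSpans(traceID string, names ...string) error {
	m.data[traceID] = append(m.data[traceID], names...)
	return nil
}

func (m *memStore) GetSpanNames(traceID string) ([]string, error) {
	names, ok := m.data[traceID]
	if !ok {
		return nil, ErrTraceNotFound
	}
	// Return a sorted copy so callers cannot mutate internal state.
	out := append([]string(nil), names...)
	sort.Strings(out)
	return out, nil
}

// checkRoundTrip is backend-agnostic: it depends only on the Store interface,
// so the same check can run against every implementation without changes.
func checkRoundTrip(s Store) error {
	if err := s.WriteSpans("t1", "b", "a"); err != nil {
		return err
	}
	got, err := s.GetSpanNames("t1")
	if err != nil {
		return err
	}
	if len(got) != 2 || got[0] != "a" || got[1] != "b" {
		return fmt.Errorf("unexpected spans: %v", got)
	}
	if _, err := s.GetSpanNames("missing"); !errors.Is(err, ErrTraceNotFound) {
		return fmt.Errorf("expected ErrTraceNotFound, got %v", err)
	}
	return nil
}

func main() {
	if err := checkRoundTrip(newMemStore()); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

Because `checkRoundTrip` never refers to `memStore` directly, replacing the implementation behind `Store` leaves the check itself untouched.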
Bonus: Upgrade Other Backends
If time permits, upgrade Badger and Cassandra storage backends.
Risks / Open Questions
- The v2 Storage API does not distinguish between primary and archive storage, but v1 does. The ultimate plan is to remove archive storage from the v1 implementation as well; that work is tracked in Phase out the distinction between primary and archive storage #6065. Until that issue is resolved, we need to decide how to upgrade the backends that implement archive storage in v1. If possible, the simplest option may be to ignore the archive part of the storage during the upgrade.
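As a rough illustration of what ignoring the archive part could mean in practice, the sketch below wraps a v1-style factory that exposes both primary and archive writers and surfaces only the primary one through a v2-style factory. All types here (`V1Factory`, `V2Factory`, `SpanWriter`, `TraceWriter`, `primaryOnly`, `fakeV1`) are hypothetical and simplified; they do not reproduce Jaeger's real factory interfaces, and the approach is one possible interim reading of the open question, not a decided design.

```go
package main

import (
	"errors"
	"fmt"
)

// SpanWriter is a hypothetical v1-style writer: one span per call.
type SpanWriter interface {
	WriteSpan(name string) error
}

// V1Factory is a hypothetical v1-style factory with primary and archive storage.
type V1Factory interface {
	CreateSpanWriter() (SpanWriter, error)
	CreateArchiveSpanWriter() (SpanWriter, error)
}

// TraceWriter is a hypothetical v2-style writer: many spans per call.
type TraceWriter interface {
	WriteTraces(names []string) error
}

// V2Factory is a hypothetical v2-style factory with no archive concept.
type V2Factory interface {
	CreateTraceWriter() (TraceWriter, error)
}

// primaryOnly adapts a V1Factory to V2Factory and never touches the archive.
type primaryOnly struct {
	v1 V1Factory
}

func (p primaryOnly) CreateTraceWriter() (TraceWriter, error) {
	w, err := p.v1.CreateSpanWriter()
	if err != nil {
		return nil, err
	}
	return loopWriter{inner: w}, nil
}

// loopWriter forwards a batch to a v1 writer one span at a time.
type loopWriter struct {
	inner SpanWriter
}

func (l loopWriter) WriteTraces(names []string) error {
	for _, n := range names {
		if err := l.inner.WriteSpan(n); err != nil {
			return err
		}
	}
	return nil
}

// recorder is a fake primary writer that remembers what it was given.
type recorder struct {
	spans []string
}

func (r *recorder) WriteSpan(name string) error {
	r.spans = append(r.spans, name)
	return nil
}

// fakeV1 is a fake v1 factory that counts archive requests.
type fakeV1 struct {
	primary      *recorder
	archiveCalls int
}

func (f *fakeV1) CreateSpanWriter() (SpanWriter, error) {
	return f.primary, nil
}

func (f *fakeV1) CreateArchiveSpanWriter() (SpanWriter, error) {
	f.archiveCalls++
	return nil, errors.New("archive storage not configured")
}

// runExample writes two spans through the adapter and reports how many
// reached primary storage and how often the archive was requested.
func runExample() (written int, archiveCalls int, err error) {
	v1 := &fakeV1{primary: &recorder{}}
	var factory V2Factory = primaryOnly{v1: v1}

	w, err := factory.CreateTraceWriter()
	if err != nil {
		return 0, 0, err
	}
	if err := w.WriteTraces([]string{"a", "b"}); err != nil {
		return 0, 0, err
	}
	return len(v1.primary.spans), v1.archiveCalls, nil
}

func main() {
	written, archiveCalls, err := runExample()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("spans in primary:", written, "archive requests:", archiveCalls)
}
```

The adapter reaches the archive path zero times, so the v2-facing code stays free of the primary/archive distinction while the v1 side keeps it until #6065 is resolved.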