
@mapno (Contributor) commented Jun 27, 2025

What this PR does:

The bufferer is a Kafka-based ingestion service that consumes trace data from Kafka topics instead of receiving it via gRPC like the traditional ingester does. It reads serialized traces from Kafka and processes them through the same data pipeline as the ingester (live traces → head blocks → WAL → complete blocks).

                      ┌─────────────────────────────┐
                      │        Bufferer             │
                      │                             │
                      │                             │
                      │ • Manages instances map     │
                      │ • Shared WAL coordination   │
                      │ • Flush queue management    │
                      └─────────────┬───────────────┘
                                    │
                      ┌─────────────┴───────────────┐
                      │                             │
                      ▼                             ▼
          ┌─────────────────────┐       ┌─────────────────────┐
          │  PartitionReader    │       │   instances map     │
          │                     │       │   [tenantID]        │
          │ • Kafka consumer    │       │                     │
          │ • Watermark commits │       │ ┌─────────────────┐ │
           │ • Routes to consume │       │ │   instance A    │ │
          │   function          │       │ │   (tenant A)    │ │
          └─────────────────────┘       │ │                 │ │
                                        │ │ • Live traces   │ │
                                        │ │ • Head block    │ │
                                        │ │ • WAL blocks    │ │
                                        │ │ • Complete      │ │
                                        │ │   blocks        │ │
                                        │ └─────────────────┘ │
                                        │                     │
                                        │ ┌─────────────────┐ │
                                        │ │   instance B    │ │
                                        │ │   (tenant B)    │ │
                                        │ │                 │ │
                                        │ │ • Live traces   │ │
                                        │ │ • Head block    │ │
                                        │ │ • WAL blocks    │ │
                                        │ │ • Complete      │ │
                                        │ │   blocks        │ │
                                        │ └─────────────────┘ │
                                        └─────────────────────┘
                                                   │
                                                   ▼
                                        ┌─────────────────────┐
                                        │     Shared WAL      │
                                        │                     │
                                        │ • Single WAL for    │
                                        │   all instances     │
                                        │ • Coordinated       │
                                        │   block cutting     │
                                        └─────────────────────┘
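
Roughly, the consume path looks like the sketch below. This is illustrative only and uses hypothetical names (record, newInstance, pushBytes); it is not the exact code in this PR, just the shape of the flow: the PartitionReader hands batches of Kafka records to a consume callback, which routes each serialized trace to its tenant's instance.

```go
package bufferer

import "context"

// Illustrative types only; the real definitions in this PR differ.
type record struct {
	tenantID string
	traceID  []byte
	value    []byte // serialized trace bytes from Kafka
}

type wal struct{}

type instance struct{ tenantID string }

func newInstance(tenantID string, w *wal) *instance { return &instance{tenantID: tenantID} }

// pushBytes appends a serialized trace to the instance's live traces.
// Periodic cuts later move data through live traces -> head block -> WAL
// blocks -> complete blocks, mirroring the ingester pipeline. Stubbed here.
func (i *instance) pushBytes(ctx context.Context, traceID, trace []byte) error { return nil }

type Bufferer struct {
	instances map[string]*instance // one instance per tenant
	wal       *wal                 // single WAL shared by all instances
}

// consume is the callback the PartitionReader invokes with each batch of
// Kafka records; it routes every serialized trace to its tenant's instance,
// creating the instance lazily on the first record for that tenant.
func (b *Bufferer) consume(ctx context.Context, records []record) error {
	for _, rec := range records {
		inst, ok := b.instances[rec.tenantID]
		if !ok {
			inst = newInstance(rec.tenantID, b.wal)
			b.instances[rec.tenantID] = inst
		}
		if err := inst.pushBytes(ctx, rec.traceID, rec.value); err != nil {
			return err
		}
	}
	return nil
}
```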

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@mapno force-pushed the rhythm-bufferer branch from 3511f27 to 5ec0217 on July 7, 2025 16:10
@mapno force-pushed the rhythm-bufferer branch from 8f2c0bb to dc5a73e on July 8, 2025 13:26
@mapno force-pushed the rhythm-bufferer branch from dc5a73e to 33fb555 on July 8, 2025 13:41

@mattdurham (Contributor) commented:

In general, does it make sense to move responsibility for timing and actions from the Bufferer to the individual instances? This would make the instances more self-contained and potentially avoid a good chunk of the locking. It would be harder to do global cuts, but we could do tenant-specific cuts instead? Or trigger them via a channel with a timer in the Bufferer.
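
Something like this rough sketch, with hypothetical names and building on the illustrative per-tenant instance type from the description above; it is not code from this PR:

```go
package bufferer

import (
	"context"
	"time"
)

// instance is the same illustrative per-tenant type as in the earlier sketch.
type instance struct{ tenantID string }

// cutIdleTraces would move live traces into the head block; all=true forces
// everything to be cut regardless of idle time. Stubbed here.
func (i *instance) cutIdleTraces(all bool) {}

// runCutLoop sketches the suggestion: each instance owns its own timer, so
// cuts are tenant-specific and the Bufferer holds less centrally locked state.
// A coordinated "global" cut is still possible -- the Bufferer signals
// forceCut for every instance from a single timer.
func (i *instance) runCutLoop(ctx context.Context, period time.Duration, forceCut <-chan struct{}) {
	ticker := time.NewTicker(period)
	defer ticker.Stop()

	for {
		select {
		case <-ticker.C:
			i.cutIdleTraces(false) // this tenant's own schedule
		case <-forceCut:
			i.cutIdleTraces(true) // Bufferer-triggered cut across all tenants
		case <-ctx.Done():
			return
		}
	}
}
```

Per-instance loops trade a timer per tenant for dropping the shared ticker and the lock contention around it.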

@mattdurham (Contributor) commented:

From our chat, would calling the Bufferer "ReadCache" make more sense?

@mattdurham (Contributor) commented:

Can we document the failure modes and what happens in each? Do we need 100% data fidelity, or is 99.9% acceptable, given that the data will be correctly handled by long-term storage regardless?

@mattdurham (Contributor) commented:

Assuming failures do happen, is it better to return incorrect data or no data?

@mapno mentioned this pull request Jul 21, 2025

@mapno (Contributor, Author) commented Jul 21, 2025

Development has been moved to a dev branch. See #5430

@mapno closed this Jul 21, 2025