Skip to content

Conversation

mahadzaryab1
Copy link
Collaborator

@mahadzaryab1 mahadzaryab1 commented Jun 7, 2025

Which problem is this PR solving?

Description of the changes

  • This PR implements the GetTraces function for ClickHouse storage.
  • This PR also introduces a schema that will later be used to auto-create the required tables upon instantiation of the ClickHouse factory.

How was this change tested?

Started the ClickHouse server on my local machine using

./clickhouse server

Initialized the tables using

/Users/mzaryab/clickhouse client --multiquery < schema.sql

Seeded the database using test.sql

/Users/mzaryab/clickhouse client --multiquery < test.sql

Wrote a Go unit test to run GetTraces and got the following output

Span: POST /api/order, Start: 2025-06-08 00:58:05 +0000 UTC, Duration: 2.5s
Event: checkout, Timestamp: 2025-06-08 00:58:03 +0000 UTC
Event: payment, Timestamp: 2025-06-08 00:58:04 +0000 UTC
Link: TraceID: 00000000000000000000000000000001, SpanID: 0000000000000001, TraceState: state1
Link: TraceID: 00000000000000000000000000000003, SpanID: 0000000000000003, TraceState: state1

Checklist

Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
@mahadzaryab1 mahadzaryab1 added the changelog:exprimental Change to an experimental part of the code label Jun 7, 2025
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Copy link

codecov bot commented Jun 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.18%. Comparing base (bafa507) to head (cbf127c).
Report is 16 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7207      +/-   ##
==========================================
+ Coverage   96.17%   96.18%   +0.01%     
==========================================
  Files         366      369       +3     
  Lines       21947    22184     +237     
==========================================
+ Hits        21107    21338     +231     
- Misses        627      632       +5     
- Partials      213      214       +1     
Flag Coverage Δ
badger_v1 9.85% <ø> (+<0.01%) ⬆️
badger_v2 1.89% <ø> (-0.01%) ⬇️
cassandra-4.x-v1-manual 14.81% <ø> (-0.01%) ⬇️
cassandra-4.x-v2-auto 1.88% <ø> (-0.01%) ⬇️
cassandra-4.x-v2-manual 1.88% <ø> (-0.01%) ⬇️
cassandra-5.x-v1-manual 14.81% <ø> (-0.01%) ⬇️
cassandra-5.x-v2-auto 1.88% <ø> (-0.01%) ⬇️
cassandra-5.x-v2-manual 1.88% <ø> (-0.01%) ⬇️
elasticsearch-6.x-v1 20.77% <ø> (+0.12%) ⬆️
elasticsearch-7.x-v1 20.85% <ø> (+0.12%) ⬆️
elasticsearch-8.x-v1 21.03% <ø> (+0.12%) ⬆️
elasticsearch-8.x-v2 1.89% <ø> (-0.01%) ⬇️
grpc_v1 11.38% <ø> (-0.01%) ⬇️
grpc_v2 1.89% <ø> (-0.01%) ⬇️
kafka-3.x-v1 10.20% <ø> (-0.01%) ⬇️
kafka-3.x-v2 1.89% <ø> (-0.01%) ⬇️
memory_v2 1.89% <ø> (-0.01%) ⬇️
opensearch-1.x-v1 20.90% <ø> (+0.12%) ⬆️
opensearch-2.x-v1 20.90% <ø> (+0.12%) ⬆️
opensearch-2.x-v2 1.89% <ø> (-0.01%) ⬇️
query 1.89% <ø> (-0.01%) ⬇️
tailsampling-processor 0.52% <ø> (+0.01%) ⬆️
unittests 95.02% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
@mahadzaryab1 mahadzaryab1 changed the title [WIP] Clickhouse POC [WIP] [clickhouse] Implement GetTraces for ClickHouse Storage Jun 8, 2025
scope_version String
)
ENGINE = MergeTree
PRIMARY KEY(trace_id);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add any other primary keys here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trace_id cannot be primary key, since we have many spans with the same ID.

In Cassandra we used (trace_id, span_id, span_hash) as PK, since we never actually retrieve by PK, only by partial PK scan like (trace_id, *). Adding a hash allows storing partial spans (even though we don't fully support that).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro Primary keys in ClickHouse work differently and are not required to be unique per row. The primary key instead acts like an index to reduce the amount of data that is scanned (more info at https://clickhouse.com/docs/migrations/postgresql/data-modeling-techniques#how-are-clickhouse-primary-keys-different).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the guidance that ClickHouse gives on choosing an effective primary key (ref: https://clickhouse.com/docs/best-practices/choosing-a-primary-key)

When selecting an ordering key, prioritize columns frequently used in query filters (i.e. the WHERE clause), especially those that exclude large numbers of rows.
Columns highly correlated with other data in the table are also beneficial, as contiguous storage improves compression ratios and memory efficiency during GROUP BY and ORDER BY operations.

Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
PRIMARY KEY(name);

CREATE MATERIALIZED VIEW IF NOT EXISTS services_mv
TO services
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not following, what does TO services mean? Can't materializes view exist by itself, derived from spans table?

Copy link
Collaborator Author

@mahadzaryab1 mahadzaryab1 Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro Materialized views require a target table, which in this case is services. It is derived from the spans table and stored in the services table. See the following from ClickHouse documentation (ref: https://clickhouse.com/docs/best-practices/use-materialized-views)

Incremental materialized views are updated in real-time. As new data is inserted into the source table, ClickHouse automatically applies the materialized view's query to the new data block and writes the results to a separate target table. Over time, ClickHouse merges these partial results to produce a complete, up-to-date view.

) ENGINE = AggregatingMergeTree
PRIMARY KEY(name, span_kind);

CREATE MATERIALIZED VIEW IF NOT EXISTS operations_mv
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't say anything about this approach without us doing stress testing, to see how it performs under heavy insert load. We can use OCI hardware for testing.

status_message String,
duration Int64,

events Nested (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without attributes, which are the core data element, it's hard to evaluate how good this schema is.

CREATE TABLE IF NOT EXISTS spans (
id String,
trace_id String,
trace_state String,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the tracestate header? can we add comments? why is it so high at the top?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should add the link as comment

Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
@mahadzaryab1 mahadzaryab1 changed the title [WIP] [clickhouse] Implement GetTraces for ClickHouse Storage [clickhouse] Implement GetTraces for ClickHouse Storage Jun 19, 2025
@mahadzaryab1 mahadzaryab1 marked this pull request as ready for review June 19, 2025 14:24
@mahadzaryab1 mahadzaryab1 requested a review from a team as a code owner June 19, 2025 14:24
@mahadzaryab1 mahadzaryab1 requested a review from albertteoh June 19, 2025 14:24
@mahadzaryab1 mahadzaryab1 requested a review from yurishkuro June 19, 2025 14:32
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Comment on lines +95 to +97
if len(parentSpanId) != 0 {
span.SetParentSpanID(pcommon.SpanID(parentSpanId))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for empty parent span ID currently only verifies string length. Consider also checking if it's a zero span ID (all zeros), as a span with a zero parent ID should be treated as having no parent. This would make the validation more robust against different representations of "no parent".

Suggested change
if len(parentSpanId) != 0 {
span.SetParentSpanID(pcommon.SpanID(parentSpanId))
}
if len(parentSpanId) != 0 && !isZeroSpanID(parentSpanId) {
span.SetParentSpanID(pcommon.SpanID(parentSpanId))
}
// isZeroSpanID checks if a span ID consists of all zeros
func isZeroSpanID(id []byte) bool {
for _, b := range id {
if b != 0 {
return false
}
}
return true
}

Spotted by Diamond

Is this helpful? React 👍 or 👎 to let us know.

@mahadzaryab1 mahadzaryab1 added this pull request to the merge queue Jun 23, 2025
Merged via the queue into jaegertracing:main with commit b86d1fd Jun 23, 2025
60 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/storage changelog:exprimental Change to an experimental part of the code v2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants