pipiland2612
Contributor

@pipiland2612 pipiland2612 commented Jun 29, 2025

Which problem is this PR solving?

🛑 Breaking change

  • The ES v2 implementation will, by default, materialize (elevate) the span.kind and span.status (or error) tags to top-level fields in the ES document.
  • If these tags were previously stored as nested (the default behavior), they will still be queryable.
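
As a rough illustration of what "materializing" a tag means here, the following is a minimal sketch, not the code from this PR. The KeyValue type, the materializedKeys set, and the splitTags function are all hypothetical names; only the tag names and the "@" dot replacement (visible in field names such as tag.span@kind) come from this conversation.

```go
package main

import (
	"fmt"
	"strings"
)

// KeyValue is a hypothetical, simplified stand-in for a span tag.
type KeyValue struct {
	Key   string
	Value string
}

// materializedKeys lists the tags this sketch elevates. The tag names come
// from the PR description; the variable itself is an assumption.
var materializedKeys = map[string]bool{"span.kind": true, "error": true}

// splitTags separates tags into an object-style map (materialized fields)
// and a nested-style list (everything else). Dots in materialized keys are
// replaced with "@" because Elasticsearch interprets dots in field names as
// object path separators.
func splitTags(tags []KeyValue) (map[string]string, []KeyValue) {
	fields := make(map[string]string)
	var nested []KeyValue
	for _, kv := range tags {
		if materializedKeys[kv.Key] {
			fields[strings.ReplaceAll(kv.Key, ".", "@")] = kv.Value
			continue
		}
		nested = append(nested, kv)
	}
	return fields, nested
}

func main() {
	fields, nested := splitTags([]KeyValue{
		{Key: "span.kind", Value: "server"},
		{Key: "http.method", Value: "GET"},
	})
	fmt.Println("materialized:", fields)
	fmt.Println("nested:", nested)
}
```

In this sketch span.kind ends up as the field span@kind in the object-style map, while http.method stays in the nested list.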

How was this change tested?

  • make lint test

Checklist

Signed-off-by: pipiland2612 <nguyen.t.dang.minh@gmail.com>
@pipiland2612 pipiland2612 requested a review from a team as a code owner June 29, 2025 17:36
@pipiland2612 pipiland2612 requested a review from joe-elliott June 29, 2025 17:36
@pipiland2612 pipiland2612 changed the title Materilize spankind and status [es] Materialize span.kind and span.status tags Jun 29, 2025

codecov bot commented Jun 29, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.18%. Comparing base (0202d3e) to head (fa92a52).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7272      +/-   ##
==========================================
- Coverage   96.19%   96.18%   -0.02%     
==========================================
  Files         372      372              
  Lines       22333    22346      +13     
==========================================
+ Hits        21483    21493      +10     
- Misses        636      638       +2     
- Partials      214      215       +1     
Flag Coverage Δ
badger_v1 9.81% <ø> (ø)
badger_v2 1.82% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v1-manual 12.97% <ø> (ø)
cassandra-4.x-v2-auto 1.81% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v2-manual 1.81% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v1-manual 12.97% <ø> (ø)
cassandra-5.x-v2-auto 1.81% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v2-manual 1.81% <0.00%> (-0.01%) ⬇️
elasticsearch-6.x-v1 20.69% <ø> (ø)
elasticsearch-7.x-v1 20.74% <ø> (ø)
elasticsearch-8.x-v1 20.92% <ø> (ø)
elasticsearch-8.x-v2 1.82% <0.00%> (-0.01%) ⬇️
grpc_v1 11.34% <ø> (ø)
grpc_v2 1.82% <0.00%> (-0.01%) ⬇️
kafka-3.x-v1 10.16% <ø> (ø)
kafka-3.x-v2 1.82% <0.00%> (-0.01%) ⬇️
memory_v2 1.82% <0.00%> (-0.01%) ⬇️
opensearch-1.x-v1 20.79% <ø> (ø)
opensearch-2.x-v1 20.79% <ø> (ø)
opensearch-2.x-v2 1.82% <0.00%> (-0.01%) ⬇️
query 1.82% <0.00%> (-0.01%) ⬇️
tailsampling-processor 0.50% <0.00%> (-0.01%) ⬇️
unittests 95.03% <100.00%> (-0.02%) ⬇️

@pipiland2612
Contributor Author

Hi @yurishkuro, the PR is ready for your review.

@yurishkuro yurishkuro added the changelog:breaking-change label Jun 29, 2025
@pipiland2612 pipiland2612 requested a review from yurishkuro June 29, 2025 21:46
pipiland2612 and others added 2 commits June 30, 2025 01:03
Member

@yurishkuro yurishkuro left a comment

Overall looks good, but in this form there is no way to make this change backwards compatible. My original proposal for the feature was:

  • ESv2 will always materialize the tags, regardless of the feature setting
  • But people who already have data should be able to use the feature to enable reading the old data as either materialized or nested tags

This way new users can go with the materialized-only implementation, but existing users can enable dual reading for a certain time period until the data written in the old format expires.

@pipiland2612 pipiland2612 requested a review from yurishkuro June 30, 2025 07:56
@pipiland2612
Contributor Author

Hi @yurishkuro, I have included the feature gate in reader.go and enabled dualLookup for backward compatibility in 6ec473b and 7e33d9e.

@pipiland2612
Contributor Author

pipiland2612 commented Jun 30, 2025

Hi @yurishkuro, I think the dualLookUp feature gate is not needed at all. Specifically, buildFindTraceIDsQuery calls buildTagQuery for every tag in the query:

func (s *SpanReader) buildFindTraceIDsQuery(traceQuery dbmodel.TraceQueryParameters) elastic.Query {
        ....
	for k, v := range traceQuery.Tags {
		tagQuery := s.buildTagQuery(k, v)
		boolQuery.Must(tagQuery)
	}
}

which is defined as

func (s *SpanReader) buildTagQuery(k string, v string) elastic.Query {
	objectTagListLen := len(objectTagFieldList)
	queries := make([]elastic.Query, len(nestedTagFieldList)+objectTagListLen)
	kd := s.dotReplacer.ReplaceDot(k)
	for i := range objectTagFieldList {
		queries[i] = s.buildObjectQuery(objectTagFieldList[i], kd, v)
	}
	for i := range nestedTagFieldList {
		queries[i+objectTagListLen] = s.buildNestedQuery(nestedTagFieldList[i], k, v)
	}

	// but configuration can change over time
	return elastic.NewBoolQuery().Should(queries...)
}

In this method you can see that it builds queries over both objectTagFieldList and nestedTagFieldList, which are defined as follows:

objectTagFieldList = []string{"tag", "process.tag"}
nestedTagFieldList = []string{"tags", "process.tags", "logs.fields"}

So this method always performs a dual lookup, which means we don't actually need the feature gate anymore?
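
To make the dual lookup concrete, here is a minimal standalone sketch that builds the same query shape with plain maps instead of the Elasticsearch client library. The two field lists are copied from above; the query alias and the objectQuery, nestedQuery, and tagQuery helpers are hypothetical names, and replacing dots with "@" is an assumption about how the dot replacer is configured.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// query is a hypothetical shorthand for a JSON object in the query DSL.
type query = map[string]any

var (
	objectTagFieldList = []string{"tag", "process.tag"}
	nestedTagFieldList = []string{"tags", "process.tags", "logs.fields"}
)

// objectQuery matches a tag stored as an object field, e.g. tag.span@kind.
func objectQuery(field, key, value string) query {
	return query{"bool": query{"must": query{
		"regexp": query{field + "." + key: query{"value": value}},
	}}}
}

// nestedQuery matches a tag stored as a {key, value} entry under a nested path.
func nestedQuery(path, key, value string) query {
	return query{"nested": query{
		"path": path,
		"query": query{"bool": query{"must": []query{
			{"match": query{path + ".key": query{"query": key}}},
			{"regexp": query{path + ".value": query{"value": value}}},
		}}},
	}}
}

// tagQuery ORs the object-style and nested-style lookups together, so a tag
// is found regardless of which format it was written in.
func tagQuery(key, value string) query {
	should := make([]query, 0, len(objectTagFieldList)+len(nestedTagFieldList))
	dotless := strings.ReplaceAll(key, ".", "@") // assumed dot replacement
	for _, field := range objectTagFieldList {
		should = append(should, objectQuery(field, dotless, value))
	}
	for _, path := range nestedTagFieldList {
		should = append(should, nestedQuery(path, key, value))
	}
	return query{"bool": query{"should": should}}
}

func main() {
	out, err := json.MarshalIndent(tagQuery("span.kind", "server"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

For a single tag this produces five should clauses: two object-style lookups and three nested lookups.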

@yurishkuro
Member

hm, yes it looks like we're always dual-querying.

@yurishkuro
Member

I thought we had a flag that enables logging of outgoing JSON in ES storage; that would be a good way to confirm the full query.

@pipiland2612
Contributor Author

pipiland2612 commented Jul 1, 2025

Hi @yurishkuro, I added this logging code here:

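	// Temporary debug logging: serialize the bool query to see exactly what is sent to Elasticsearch.
	// Errors from Source and Marshal are ignored here.
	src, _ := boolQuery.Source()
	queryJSON, _ := json.Marshal(src)
	s.logger.Debug("Elasticsearch bool query to find trace", zap.String("query", string(queryJSON)))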

and this is what I get. I think this confirms that we are always dual-querying, so there is no need for this feature anymore:

{
  "log_source": "jaeger-1",
  "timestamp": "2025-07-01T06:31:09.447Z",
  "level": "debug",
  "caller": "spanstore/reader.go:626",
  "message": "Elasticsearch bool query to find trace",
  "details": {
    "resource": {
      "service.instance.id": "7a00a55b-5f94-4d3c-8c3e-bd4884a775dd",
      "service.name": "jaeger",
      "service.version": "v2.7.0"
    },
    "otelcol.component.id": "jaeger_storage",
    "otelcol.component.kind": "extension",
    "query": {
      "bool": {
        "must": [
          { "range": { "startTimeMillis": { "from": 1751347869439, "include_lower": true, "include_upper": true, "to": 1751351469439 }}},
          { "match": { "process.serviceName": { "query": "frontend" }}},
          { "bool": { "should": [
            { "bool": { "must": { "regexp": { "tag.span@kind": { "value": "server" } } } } },
            { "bool": { "must": { "regexp": { "process.tag.span@kind": { "value": "server" } } } } },
            { "nested": { "path": "tags", "query": { "bool": { "must": [ { "match": { "tags.key": { "query": "span.kind" } } }, { "regexp": { "tags.value": { "value": "server" } } } ] } } } },
            { "nested": { "path": "process.tags", "query": { "bool": { "must": [ { "match": { "process.tags.key": { "query": "span.kind" } } }, { "regexp": { "process.tags.value": { "value": "server" } } } ] } } } },
            { "nested": { "path": "logs.fields", "query": { "bool": { "must": [ { "match": { "logs.fields.key": { "query": "span.kind" } } }, { "regexp": { "logs.fields.value": { "value": "server" } } } ] } } } }
          ]}},
          { "bool": { "should": [
            { "bool": { "must": { "regexp": { "tag.error": { "value": "null" } } } } },
            { "bool": { "must": { "regexp": { "process.tag.error": { "value": "null" } } } } },
            { "nested": { "path": "tags", "query": { "bool": { "must": [ { "match": { "tags.key": { "query": "error" } } }, { "regexp": { "tags.value": { "value": "null" } } } ] } } } },
            { "nested": { "path": "process.tags", "query": { "bool": { "must": [ { "match": { "process.tags.key": { "query": "error" } } }, { "regexp": { "process.tags.value": { "value": "null" } } } ] } } } },
            { "nested": { "path": "logs.fields", "query": { "bool": { "must": [ { "match": { "logs.fields.key": { "query": "error" } } }, { "regexp": { "logs.fields.value": { "value": "null" } } } ] } } } }
          ]}},
          { "bool": { "should": [
            { "bool": { "must": { "regexp": { "tag.true": { "value": "true" } } } } },
            { "bool": { "must": { "regexp": { "process.tag.true": { "value": "true" } } } } },
            { "nested": { "path": "tags", "query": { "bool": { "must": [ { "match": { "tags.key": { "query": "true" } } }, { "regexp": { "tags.value": { "value": "true" } } } ] } } } },
            { "nested": { "path": "process.tags", "query": { "bool": { "must": [ { "match": { "process.tags.key": { "query": "true" } } }, { "regexp": { "process.tags.value": { "value": "true" } } } ] } } } },
            { "nested": { "path": "logs.fields", "query": { "bool": { "must": [ { "match": { "logs.fields.key": { "query": "true" } } }, { "regexp": { "logs.fields.value": { "value": "true" } } } ] } } } }
          ]}}
        ]
      }
    }
  }
}

@pipiland2612 pipiland2612 force-pushed the Materilize_spankind_and_status branch from ee610d3 to 277b8d7 Compare July 1, 2025 06:37
@pipiland2612
Contributor Author

Based on this, I have removed the feature gate and kept only the ensureRequiredTag method in v2.

@yurishkuro yurishkuro added this pull request to the merge queue Jul 1, 2025
Merged via the queue into jaegertracing:main with commit eec6cd8 Jul 1, 2025
59 checks passed
@pipiland2612 pipiland2612 deleted the Materilize_spankind_and_status branch July 1, 2025 19:37
Labels
area/storage, changelog:breaking-change, storage/elasticsearch, v2
2 participants