Skip to content

Conversation

iypetrov
Copy link
Contributor

Which problem is this PR solving?

Description of the changes

  • A new sanitizer check has been introduced to replace negative durations with a default value of 1.
  • When a negative duration is corrected, additional metadata(warning, tag and) is added to the span to capture the adjustment + the event is logged.

How was this change tested?

  • A unit test has been added to verify that negative durations are correctly converted to 1, while positive durations remain unchanged.

Checklist

@iypetrov iypetrov requested a review from a team as a code owner May 14, 2025 14:43
@iypetrov iypetrov requested a review from mahadzaryab1 May 14, 2025 14:44
@iypetrov
Copy link
Contributor Author

this issue is quite old issue, so i was not able to implement the solution the exactly as it was suggested, but i tried to fulfil the requirements from the issue as well as from the comments

@yurishkuro yurishkuro requested a review from Copilot May 15, 2025 23:15
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a sanitizer to correct negative span durations by setting them to 1 and annotating the span with a warning and tag. It also registers this sanitizer in the span processor options and adds unit tests to verify both positive and negative duration cases.

  • Introduces NegativeDurationSpanSanitizer and wires it into BuildSpanProcessor
  • Appends warnings and tags for corrected negative durations
  • Adds TestNegativeDurationSpanSanitizer to cover zero, positive, and negative durations

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
cmd/collector/app/span_handler_builder.go Registered sanitizer option and implemented NegativeDurationSpanSanitizer method
cmd/collector/app/span_handler_builder_test.go Added unit tests for negative-duration sanitization behavior
Comments suppressed due to low confidence (1)

cmd/collector/app/span_handler_builder.go:94

  • [nitpick] Consider renaming this method to negativeDurationSpanSanitizer (lowercase) to keep it consistent with other unexported methods like metricsFactory and defaultSpanFilter.
func (b *SpanHandlerBuilder) NegativeDurationSpanSanitizer(span *model.Span) *model.Span {

@yurishkuro
Copy link
Member

We don't need to wire this directly into the processor, we have a composable framework for sanitizers.

@iypetrov iypetrov force-pushed the fix-esmapping-in-jaeger-span-json-the-duration-type-is-long-does-not-match-the-struct-field-type-uint64 branch from f48f8b7 to c71354f Compare May 16, 2025 05:27
@iypetrov
Copy link
Contributor Author

i am not sure if i get it correctly, but as i see there is already NewSpanDurationSanitizer in NewStandardSanitizers, isn't this resolving the problem?

@iypetrov iypetrov force-pushed the fix-esmapping-in-jaeger-span-json-the-duration-type-is-long-does-not-match-the-struct-field-type-uint64 branch from 74a1d73 to 731d9a8 Compare May 19, 2025 04:03
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to wire this directly into the processor, we have a composable framework for sanitizers.

@iypetrov iypetrov force-pushed the fix-esmapping-in-jaeger-span-json-the-duration-type-is-long-does-not-match-the-struct-field-type-uint64 branch 2 times, most recently from 9bf140d to ae7a2a1 Compare May 21, 2025 15:28
@yurishkuro
Copy link
Member

curiously, we had cmd/collector/app/sanitizer/zipkin/span_sanitizer.go which does this, but only for Zipkin format. I think the sanitizer is useful, but it does not address the referenced issue directly. It may help if the underlying cause for large numbers was negative duration, because the root cause of the issue seems to be the unsigned vs signed int representation in Jaeger's db struct vs ES data model. Arguably we have a bug in the ES code that should not be trying to use unsigned int and should take care of overflow before writing to ES.

So let's rename the PR. And please add another sanitizer for v2 data model ./cmd/jaeger/internal/sanitizer/

@iypetrov iypetrov force-pushed the fix-esmapping-in-jaeger-span-json-the-duration-type-is-long-does-not-match-the-struct-field-type-uint64 branch from b6c9468 to d5c6eb0 Compare May 22, 2025 13:11
@iypetrov
Copy link
Contributor Author

will add the sanitizer for v2 data model tomorrow

@iypetrov iypetrov force-pushed the fix-esmapping-in-jaeger-span-json-the-duration-type-is-long-does-not-match-the-struct-field-type-uint64 branch 2 times, most recently from d2ac8dd to e1a5398 Compare May 23, 2025 07:21
@iypetrov iypetrov changed the title [Fix] es mapping in jaeger-span.json the duration type is long, does … Add sanitizers for negative time duration May 23, 2025
@yurishkuro yurishkuro changed the title Add sanitizers for negative time duration Add sanitizers for negative span duration May 23, 2025
@iypetrov iypetrov force-pushed the fix-esmapping-in-jaeger-span-json-the-duration-type-is-long-does-not-match-the-struct-field-type-uint64 branch 2 times, most recently from 14efa52 to 590ba4c Compare May 26, 2025 20:36
@iypetrov iypetrov requested a review from yurishkuro May 26, 2025 20:36
@iypetrov iypetrov force-pushed the fix-esmapping-in-jaeger-span-json-the-duration-type-is-long-does-not-match-the-struct-field-type-uint64 branch from 590ba4c to e814f4d Compare May 26, 2025 20:37
Copy link

codecov bot commented May 26, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.20%. Comparing base (137dfcb) to head (b1df1bb).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7122      +/-   ##
==========================================
- Coverage   96.20%   96.20%   -0.01%     
==========================================
  Files         363      365       +2     
  Lines       21892    21940      +48     
==========================================
+ Hits        21061    21107      +46     
- Misses        621      623       +2     
  Partials      210      210              
Flag Coverage Δ
badger_v1 9.85% <0.00%> (-0.02%) ⬇️
badger_v2 1.90% <ø> (ø)
cassandra-4.x-v1-manual 14.82% <0.00%> (-0.03%) ⬇️
cassandra-4.x-v2-auto 1.89% <ø> (ø)
cassandra-4.x-v2-manual 1.89% <ø> (ø)
cassandra-5.x-v1-manual 14.82% <0.00%> (-0.03%) ⬇️
cassandra-5.x-v2-auto 1.89% <ø> (ø)
cassandra-5.x-v2-manual 1.89% <ø> (ø)
elasticsearch-6.x-v1 20.65% <0.00%> (-0.04%) ⬇️
elasticsearch-7.x-v1 20.73% <0.00%> (-0.04%) ⬇️
elasticsearch-8.x-v1 20.90% <0.00%> (-0.04%) ⬇️
elasticsearch-8.x-v2 1.90% <ø> (ø)
grpc_v1 11.39% <0.00%> (-0.02%) ⬇️
grpc_v2 1.90% <ø> (ø)
kafka-3.x-v1 10.20% <37.50%> (+0.06%) ⬆️
kafka-3.x-v2 1.90% <ø> (ø)
memory_v2 1.90% <ø> (ø)
opensearch-1.x-v1 20.78% <0.00%> (-0.04%) ⬇️
opensearch-2.x-v1 20.78% <0.00%> (-0.04%) ⬇️
opensearch-2.x-v2 1.90% <ø> (ø)
query 1.90% <ø> (ø)
tailsampling-processor 0.51% <ø> (ø)
unittests 95.02% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

iypetrov added 4 commits May 27, 2025 11:09
…not match the struct field type uint64 . (jaegertracing#2517)

Signed-off-by: Ilia Petrov <ilia.yavorov.petrov@gmail.com>
…r the one with new sanitizer

Signed-off-by: Ilia Petrov <ilia.yavorov.petrov@gmail.com>
Signed-off-by: Ilia Petrov <ilia.yavorov.petrov@gmail.com>
Signed-off-by: Ilia Petrov <ilia.yavorov.petrov@gmail.com>
iypetrov added 5 commits May 27, 2025 11:09
Signed-off-by: Ilia Petrov <ilia.yavorov.petrov@gmail.com>
Signed-off-by: Ilia Petrov <ilia.yavorov.petrov@gmail.com>
Signed-off-by: Ilia Petrov <ilia.yavorov.petrov@gmail.com>
Signed-off-by: Ilia Petrov <ilia.yavorov.petrov@gmail.com>
Signed-off-by: Ilia Petrov <ilia.yavorov.petrov@gmail.com>
@iypetrov iypetrov force-pushed the fix-esmapping-in-jaeger-span-json-the-duration-type-is-long-does-not-match-the-struct-field-type-uint64 branch from e40937a to b1df1bb Compare May 27, 2025 08:09
@yurishkuro yurishkuro enabled auto-merge May 27, 2025 13:04
@yurishkuro yurishkuro added this pull request to the merge queue May 27, 2025
Merged via the queue into jaegertracing:main with commit a7c26c5 May 27, 2025
59 of 60 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

es mapping in "jaeger-span.json" the duration type is long, does not match the struct field type uint64 .
2 participants