Fix traceql exemplar distribution #5129

samuelarogbonlo · 2025-05-12T09:30:04Z

What this PR does:
Adds a new package, exemplardist, that fixes an issue where TraceQL metrics exemplars cluster on one side of the visualization instead of being spread evenly across the time range. The package provides a bucketing algorithm that distributes exemplars evenly across the time range while keeping them representative of the underlying data.

Which issue(s) this PR fixes:
Fixes #4856

Checklist

Tests updated - Added comprehensive tests for the exemplar distribution algorithm
Documentation added - Added README.md in the package with explanation and usage examples
CHANGELOG.md updated - Not updated since this is a standalone utility package

Additional Notes

This PR only adds the exemplardist package without modifying the core Tempo code. Direct integration into the core codebase was attempted but caused test failures due to the complex interactions with existing code.

The package provides a clean API that can be used at various integration points:

In API handlers that process TraceQL metrics results
In middleware that processes responses
In the frontend rendering code

This solution uses a bucketing algorithm that:

Divides the time range into equal buckets (number of buckets = max exemplars)
Assigns exemplars to the appropriate bucket based on timestamp
Selects one exemplar from each bucket
Fills any empty buckets with exemplars from dense areas to maintain even distribution
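
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the exemplardist package: the `Exemplar` type, the `Distribute` function, and its parameters are hypothetical names. It also assumes that "selects one exemplar" means the first exemplar seen in a bucket, and that "dense areas" means the buckets holding the most exemplars.

```go
package main

import "sort"

// Exemplar is a hypothetical, simplified exemplar (not Tempo's type).
type Exemplar struct {
	TimestampMs int64
	Value       float64
}

// Distribute returns at most maxExemplars exemplars spread over [startMs, endMs).
func Distribute(in []Exemplar, startMs, endMs int64, maxExemplars int) []Exemplar {
	if maxExemplars <= 0 || endMs <= startMs {
		return nil
	}
	span := endMs - startMs

	// Steps 1 and 2: one equal-width bucket per requested exemplar,
	// with each exemplar assigned by timestamp.
	buckets := make([][]Exemplar, maxExemplars)
	for _, e := range in {
		if e.TimestampMs < startMs || e.TimestampMs >= endMs {
			continue // outside the time range
		}
		idx := int((e.TimestampMs - startMs) * int64(maxExemplars) / span)
		buckets[idx] = append(buckets[idx], e)
	}

	// Step 3: take the first exemplar of every non-empty bucket (assumption).
	out := make([]Exemplar, 0, maxExemplars)
	order := make([]int, 0, maxExemplars)
	empty := 0
	for i, b := range buckets {
		order = append(order, i)
		if len(b) > 0 {
			out = append(out, b[0])
		} else {
			empty++
		}
	}

	// Step 4: for each empty bucket, borrow one spare exemplar,
	// visiting the densest buckets first.
	sort.SliceStable(order, func(a, b int) bool {
		return len(buckets[order[a]]) > len(buckets[order[b]])
	})
	for depth := 1; empty > 0; depth++ {
		progressed := false
		for _, i := range order {
			if empty > 0 && len(buckets[i]) > depth {
				out = append(out, buckets[i][depth])
				empty--
				progressed = true
			}
		}
		if !progressed {
			break // no spare exemplars left in any bucket
		}
	}

	// Return the selection in time order.
	sort.Slice(out, func(a, b int) bool { return out[a].TimestampMs < out[b].TimestampMs })
	return out
}
```

Note that in this sketch the borrowed exemplars keep their original timestamps, so step 4 keeps the exemplar count up but does not by itself move points into the empty parts of the time range.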

The implementation is tested against several input distributions (uniform, left-skewed, right-skewed, clustered) and improves the distribution quality in each of them.

CLAassistant · 2025-05-12T09:30:12Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

knylander-grafana

Thank you for adding a readme to your package!

ruslan-mikhailov · 2025-05-26T15:32:27Z

Thank you for your contribution and for taking the time to put this well-explained proposal together!

To simplify a bit, TraceQL Metrics are calculated independently in the queriers for each block, and then merged together in the query-frontend. One of the challenges with exemplar distribution over time is ensuring a fair distribution at both of these stages.

While we do have bucket sampling, it doesn’t work particularly well when it comes to distribution fairness. To address the issue, we needed to change both the bucketing algorithm and the way the frontend calculates how many exemplars to request, so that each block gets a fair share based on its time range.

I’ve implemented this approach here: #5158. It avoids additional sampling steps and has linear complexity.
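
To illustrate the idea of a per-block fair share, here is a rough sketch under stated assumptions. It is not the code from #5158: the `FairShare` function and its parameters are hypothetical, and it assumes the share is simply proportional to how much of the query time range the block overlaps, rounded up.

```go
package main

// FairShare returns how many of the total requested exemplars a single block
// could be asked for, proportional to the block's overlap with the query range.
// All times are in the same unit (for example, milliseconds).
func FairShare(total int, queryStart, queryEnd, blockStart, blockEnd int64) int {
	if total <= 0 || queryEnd <= queryStart {
		return 0
	}

	// Clamp the block's time range to the query's time range.
	lo, hi := blockStart, blockEnd
	if lo < queryStart {
		lo = queryStart
	}
	if hi > queryEnd {
		hi = queryEnd
	}
	overlap := hi - lo
	if overlap <= 0 {
		return 0 // the block does not intersect the query range
	}

	// Proportional share, rounded up so short blocks still get one exemplar.
	span := queryEnd - queryStart
	return int((int64(total)*overlap + span - 1) / span)
}
```

For example, with 100 exemplars requested over a range of 1000 time units, a block covering the first 250 units would be asked for 25. Because of the rounding up, the shares of many small blocks can add up to more than the total, so a merge step would still need to trim the result.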

Feel free to take a look and leave any feedback or questions on the PR. Thanks again for your interest and ideas!

ruslan-mikhailov · 2025-06-02T09:39:43Z

The issue is fixed in #5158

samuelarogbonlo added 2 commits May 12, 2025 10:15

Add exemplardist package with algorithm to fix TraceQL metrics exemplar distribution

db4ab91

Add README.md for exemplardist package

bb4d52e

samuelarogbonlo requested review from joe-elliott, mdisibio, mapno, yvrhdn, zalegrala, electron0zero, ie-pham, stoewer, javiermolinar and carles-grafana as code owners May 12, 2025 09:30

knylander-grafana reviewed May 23, 2025

knylander-grafana mentioned this pull request May 23, 2025

Exemplars UX improvements #5158

Merged

ruslan-mikhailov closed this Jun 2, 2025
