-
Notifications
You must be signed in to change notification settings - Fork 11
OTLP: Insert created timestamp if starttimestamp is within interval #689
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OTLP: Insert created timestamp if starttimestamp is within interval #689
Conversation
20352ee
to
c8bc00e
Compare
bbe3b38
to
95c22e2
Compare
After a lot of investigation we understood a bit better OTel's spec handling of StartTimeUnixNano and TimeUnixNano for cumulative streams. This change inserts an extra zero to any incoming sample that has StartTimeUnixNano set within 2 minutes of TimeUnixNano. Also unit tests are included. Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
f8d7e96
to
9789b21
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable.
// TODO(jesus.vazquez) Temporary debug log to understand the behavior of start ts and ts. We're only tracking counters | ||
for _, v := range c.startTsInfoPerSeries { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why store the info? Why not log it out straight away when trackStartTimestampForSeries
is called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, dropped this map and just logged right away.
everyN everyNTimes | ||
unique map[uint64]*prompb.TimeSeries | ||
conflicts map[uint64][]*prompb.TimeSeries | ||
startTsInfoPerSeries map[uint64]StartTsAndTs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: if this is for debugging only, I would comment that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this map.
d39e1b5
to
3741aae
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it looks good, except I think maybe a startTimestampNs
variable should be removed for consistency (see comment).
As mentioned before, I am a bit concerned though about potentially adding zero samples for many samples within the interval (validIntervalForStartTimestamps
). I.e., potentially many consecutive samples could have a start time delta <= validIntervalForStartTimestamps
.
storage/remote/otlptranslator/prometheusremotewrite/number_data_points.go
Outdated
Show resolved
Hide resolved
3741aae
to
b6ff474
Compare
This commit passes logger down to FromMetrics and extends the converter to printout some information about the incoming series timestamps and start timestamps. Only for sums. Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> format series Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com> dont store, log right away Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
b6ff474
to
0c21f74
Compare
Note: I'm not updating the changelog because the existing entry is enough.
|
Followup to grafana/mimir-prometheus#689 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* Vendor update mimir-prometheus at 53cf3fb8e7e3 * Add Close method to activity trackers Followup to prometheus/prometheus#14064 * Pass logger to otlp FromMetrics Followup to grafana/mimir-prometheus#689 * Pin google GRPC to 1.65 for the time being Possible performance regression: Ref: grafana/dskit#581 Also has breaking API changes for the experimental buffers. * Remove workaround in test Follow up to: prometheus/prometheus#14772 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Relates to https://opentelemetry.io/docs/specs/otel/metrics/data-model/#resets-and-gaps and open-telemetry/opentelemetry-specification#4184
While testing our previously introduced feature to introduce 0's at
ts-1
when an OTLP sample hadStartTimeUnixNano == sample.timestamp
we noticed that it was not working for some usages of the OTel SDK and connectors like the spanmetrics connector.Later we found out the reason is that it's unclear to what value
StartTImeUnixNano
should be set to when there is a reset in a cumulative stream. The links at the beginning give you more context on what the spec says and then a reasonable ask by Arthur asking to have a StartTimeUnixNano initialized per series and to the first observation.Unfortunately it will take time to get there.
So what this PR does is to slightly modify our existing workaround of adding those zeros. Instead of writing the zero if the sample had
StartTimeUnixNano == sample.timestamp
, now it will only insert the zero ifthe sample's StartTimeUnixNano is within 2 minutes of sample.Timestamp
The downside of this approach is that we can't know the scrape interval of samples so in the case of more than 1 sample sent per minute, this workaround could lead to duplicate zero's for the same timestamp until that minute has passed. This incurs an increase of duplicate data in memory and an increase of DPM per series.