Migrate to OTel #1590
Conversation
Signed-off-by: Calum Murray <cmurray@redhat.com>
Codecov Report

Attention: Patch coverage is

Additional details and impacted files

@@ Coverage Diff @@
##             main    #1590      +/-   ##
==========================================
- Coverage   58.99%   57.71%   -1.28%
==========================================
  Files          60       58       -2
  Lines        4341     4309      -32
==========================================
- Hits         2561     2487      -74
- Misses       1681     1733      +52
+ Partials       99       89      -10
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Cali0707, gauron99

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing

/override "codecov/project"

@gauron99: Overrode contexts on behalf of gauron99: codecov/project In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Merged 085a870 into knative-extensions:main
One question, otherwise looks good.
/approve
@@ -196,12 +229,6 @@ func (d *Dispatcher) dispatch(ctx context.Context, msg amqp.Delivery, ceClient c

 	response, result := ceClient.Request(ctx, *event)
-	statusCode, isSuccess := getStatus(ctx, result)
-	if statusCode != -1 {
-		args := &dispatcher.ReportArgs{EventType: event.Type()}
-		if err = d.Reporter.ReportEventCount(args, statusCode); err != nil {
Did we stop reporting counts of errors? That feels like a useful metric to keep.
It's also possible the defer is getting the http error somehow -- if so, that's probably a detail worth commenting, since earlier code would be depending on the actions of later code, rather than the usual order.
This wasn't measuring the count of errors before, but rather the count of events (the way it was written makes that a little confusing).
Basically, we have handled the migration by deriving the event count from the total count of the event dispatch duration histogram.
Regarding the errors/the status code: looking at it now, I agree that we should probably have kept that, but since we also dropped it in eventing I want to fix it there first, then add it back to all the downstream repos of eventing. I've captured that here: knative/eventing#8649
 	return nil
 }

func (e *EnvConfig) ShutdownObservability(ctx context.Context) {
I like having this encapsulated in a function, rather than raw in the main file. Thanks!
I think the bot merged before @evankanderson's comments were addressed.
Yes, I got here late.
I think latencyBounds are actually latencyBuckets, and it's reasonable for them to be fixed at compile time -- when multiple servers report using different bucket sizes, it's very hard to produce useful histograms later.
If they are hardcoded, how are we sure multiple servers use the same values?
In OTel, this is normally done by convention -- we have been following the semantic convention for HTTP request duration for the event dispatch duration metric (see here)