Cali0707
Member

Changes

  • Migrate eventing-rabbitmq to OTel
  • Update dependencies to get new required deps

Release Note

eventing-rabbitmq now uses OpenTelemetry for metrics and traces

Cali0707 added 3 commits July 23, 2025 15:52
Signed-off-by: Calum Murray <cmurray@redhat.com>
Signed-off-by: Calum Murray <cmurray@redhat.com>
Signed-off-by: Calum Murray <cmurray@redhat.com>
@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 23, 2025
@knative-prow knative-prow bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jul 23, 2025
Signed-off-by: Calum Murray <cmurray@redhat.com>
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 23, 2025
@knative-prow knative-prow bot requested review from aliok, ikavgo and skonto July 23, 2025 20:02

codecov bot commented Jul 23, 2025

Codecov Report

Attention: Patch coverage is 29.06404% with 144 lines in your changes missing coverage. Please review.

Project coverage is 57.71%. Comparing base (97cb765) to head (8866e45).
Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/utils/envconfig.go | 0.00% | 54 Missing ⚠️ |
| cmd/ingress/main.go | 0.00% | 37 Missing ⚠️ |
| cmd/dispatcher/main.go | 0.00% | 20 Missing ⚠️ |
| pkg/dispatcher/dispatcher.go | 76.00% | 17 Missing and 1 partial ⚠️ |
| pkg/reconciler/source/rabbitmqsource.go | 0.00% | 14 Missing ⚠️ |
| pkg/reconciler/source/controller.go | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1590      +/-   ##
==========================================
- Coverage   58.99%   57.71%   -1.28%     
==========================================
  Files          60       58       -2     
  Lines        4341     4309      -32     
==========================================
- Hits         2561     2487      -74     
- Misses       1681     1733      +52     
+ Partials       99       89      -10     

Signed-off-by: Calum Murray <cmurray@redhat.com>
@Cali0707
Member Author

Signed-off-by: Calum Murray <cmurray@redhat.com>
@gauron99
Contributor

/lgtm
/approve

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Jul 23, 2025

knative-prow bot commented Jul 23, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Cali0707, gauron99

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gauron99
Contributor

/override "codecov/project"

knative-prow bot commented Jul 23, 2025

@gauron99: Overrode contexts on behalf of gauron99: codecov/project

In response to this:

/override "codecov/project"

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@knative-prow knative-prow bot merged commit 085a870 into knative-extensions:main Jul 23, 2025
21 of 22 checks passed
Contributor

@evankanderson evankanderson left a comment

One question, otherwise looks good.

/approve

@@ -196,12 +229,6 @@ func (d *Dispatcher) dispatch(ctx context.Context, msg amqp.Delivery, ceClient c

	response, result := ceClient.Request(ctx, *event)
	statusCode, isSuccess := getStatus(ctx, result)
	if statusCode != -1 {
		args := &dispatcher.ReportArgs{EventType: event.Type()}
		if err = d.Reporter.ReportEventCount(args, statusCode); err != nil {
Contributor

Did we stop reporting counts of errors? That feels like a useful metric to keep.

It's also possible the defer is getting the http error somehow -- if so, that's probably a detail worth a code comment, since earlier code is depending on the actions of later code, rather than the other way around.

Member Author

This wasn't measuring the count of errors before, but rather the count of events (it is a little confusing with how it was written).

Basically, the way we have handled the migration is that the event count is now the total count of the event dispatch duration metric.

Regarding the errors/the status code, looking at it now I agree that we should probably have kept that, but since we also dropped it in eventing I want to fix it there first, then add that to all the downstream repos of eventing. I've captured that here: knative/eventing#8649
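
To illustrate the point about the event count, here is a minimal, stdlib-only sketch. It assumes the event dispatch duration metric is a bucketed histogram; `durationHistogram` and its methods are hypothetical stand-ins and not the OTel SDK types used in the PR. Every recorded duration increments exactly one bucket, so the sum of all bucket counts is the number of dispatched events and a separate counter is redundant.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// durationHistogram is a hypothetical stand-in for a histogram instrument.
// It is NOT the OTel SDK type; it only mimics the bucket bookkeeping.
type durationHistogram struct {
	bounds []float64 // upper bucket bounds in seconds, ascending
	counts []uint64  // len(bounds)+1 buckets; the last one is the overflow bucket
	sum    float64   // sum of all recorded durations in seconds
}

func newDurationHistogram(bounds []float64) *durationHistogram {
	return &durationHistogram{bounds: bounds, counts: make([]uint64, len(bounds)+1)}
}

// Record adds one dispatch duration to the histogram.
func (h *durationHistogram) Record(d time.Duration) {
	s := d.Seconds()
	// Index of the first bound >= s; len(bounds) selects the overflow bucket.
	i := sort.SearchFloat64s(h.bounds, s)
	h.counts[i]++
	h.sum += s
}

// Count returns the total number of recordings, which equals the number
// of dispatched events because each dispatch records exactly once.
func (h *durationHistogram) Count() uint64 {
	var total uint64
	for _, c := range h.counts {
		total += c
	}
	return total
}

func main() {
	// Illustrative bounds only; not the values used in the PR.
	h := newDurationHistogram([]float64{0.005, 0.05, 0.5, 5})
	for _, d := range []time.Duration{2 * time.Millisecond, 40 * time.Millisecond, 7 * time.Second} {
		h.Record(d)
	}
	fmt.Println("events dispatched:", h.Count())
}
```

A per-status-code breakdown, as discussed above, would need the status code recorded as an attribute on each measurement; this sketch does not model attributes.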

return nil
}

func (e *EnvConfig) ShutdownObservability(ctx context.Context) {
Contributor

I like having this encapsulated in a function, rather than raw in the main file. Thanks!

@ikavgo
Contributor

ikavgo commented Jul 23, 2025

Should latencyBounds be configurable?

I think the bot merged before @evankanderson's comments were addressed.

Contributor

@evankanderson evankanderson left a comment

Yes, I got here late.

I think latencyBounds are actually latencyBuckets, and it's reasonable for them to be fixed at compile time -- when multiple servers report using different bucket sizes, it's very hard to produce useful histograms later.
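
A rough sketch of why mismatched buckets are a problem, using a hypothetical `snapshot` type rather than any real exporter format: two servers' histograms can be combined by adding bucket counts only when their bounds are identical. With different bounds there is no exact way to redistribute the counts.

```go
package main

import (
	"errors"
	"fmt"
)

// snapshot is a hypothetical exported view of one server's histogram.
type snapshot struct {
	bounds []float64 // upper bucket bounds, ascending
	counts []uint64  // len(bounds)+1 entries, last is the overflow bucket
}

var errBoundsMismatch = errors.New("bucket bounds differ; counts cannot be added")

// merge adds bucket counts from two snapshots that share the same bounds.
func merge(a, b snapshot) (snapshot, error) {
	if len(a.bounds) != len(b.bounds) || len(a.counts) != len(b.counts) {
		return snapshot{}, errBoundsMismatch
	}
	for i := range a.bounds {
		if a.bounds[i] != b.bounds[i] {
			return snapshot{}, errBoundsMismatch
		}
	}
	out := snapshot{bounds: a.bounds, counts: make([]uint64, len(a.counts))}
	for i := range a.counts {
		out.counts[i] = a.counts[i] + b.counts[i]
	}
	return out, nil
}

func main() {
	bounds := []float64{0.1, 1, 10}
	s1 := snapshot{bounds: bounds, counts: []uint64{5, 3, 1, 0}}
	s2 := snapshot{bounds: bounds, counts: []uint64{2, 4, 0, 1}}
	merged, err := merge(s1, s2)
	fmt.Println(merged.counts, err)

	// A server configured with different bounds cannot be merged exactly.
	odd := snapshot{bounds: []float64{0.25, 2.5, 25}, counts: []uint64{1, 1, 1, 1}}
	_, err = merge(s1, odd)
	fmt.Println(err)
}
```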

@ikavgo
Contributor

ikavgo commented Jul 23, 2025

If they are hardcoded, how can we be sure multiple servers have them the same?

@Cali0707
Member Author

If they are hardcoded, how can we be sure multiple servers have them the same?

In OTel, it is normally done by convention - we have been following the semantic convention for http request duration for the event dispatch duration metric (see here)
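
For reference, a small sketch of what following the convention can look like. The values below are the bucket boundaries (in seconds) that the OTel HTTP semantic convention advises for request duration histograms. Whether `latencyBounds` in this PR matches the list exactly is an assumption, and `httpDurationBounds` and `strictlyAscending` are hypothetical names.

```go
package main

import "fmt"

// httpDurationBounds holds the boundaries, in seconds, advised by the OTel
// HTTP semantic convention for request duration histograms.
// Assumption: the PR's latencyBounds mirrors this list; that is not shown here.
var httpDurationBounds = []float64{
	0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10,
}

// strictlyAscending reports whether bounds are valid histogram boundaries,
// meaning each value is greater than the one before it.
func strictlyAscending(bounds []float64) bool {
	for i := 1; i < len(bounds); i++ {
		if bounds[i] <= bounds[i-1] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(len(httpDurationBounds), strictlyAscending(httpDurationBounds))
}
```

Because the list is a package-level value compiled into the binary, every server built from the same release reports with the same buckets; servers running different releases would only agree if the list is left unchanged between them.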

5 participants