-
Notifications
You must be signed in to change notification settings - Fork 345
[SVLS-7133] Instrument Azure Service Bus #6067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Overall package sizeSelf size: 9.63 MB Dependency sizes| name | version | self size | total size | |------|---------|-----------|------------| | @datadog/libdatadog | 0.7.0 | 35.02 MB | 35.02 MB | | @datadog/native-appsec | 10.0.0 | 20.3 MB | 20.31 MB | | @datadog/native-iast-taint-tracking | 4.0.0 | 11.72 MB | 11.73 MB | | @datadog/pprof | 5.9.0 | 9.77 MB | 10.14 MB | | @opentelemetry/core | 1.30.1 | 908.66 kB | 7.16 MB | | protobufjs | 7.5.3 | 2.95 MB | 5.6 MB | | @datadog/wasm-js-rewriter | 4.0.1 | 2.85 MB | 3.58 MB | | @datadog/native-metrics | 3.1.1 | 1.02 MB | 1.43 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | jsonpath-plus | 10.3.0 | 617.18 kB | 1.08 MB | | import-in-the-middle | 1.14.2 | 122.36 kB | 850.93 kB | | lru-cache | 10.4.3 | 804.3 kB | 804.3 kB | | source-map | 0.7.4 | 226 kB | 226 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | limiter | 3.0.0 | 157.92 kB | 157.92 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.1 | 109.9 kB | 109.9 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | ignore | 7.0.5 | 63.38 kB | 63.38 kB | | istanbul-lib-coverage | 3.2.2 | 34.37 kB | 34.37 kB | | rfdc | 1.4.1 | 27.15 kB | 27.15 kB | | @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB | | dc-polyfill | 0.1.9 | 25.11 kB | 25.11 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | shell-quote | 1.8.3 | 23.74 kB | 23.74 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | semifies | 1.0.0 | 15.84 kB | 15.84 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | ttl-set | 1.0.0 | 4.61 kB | 9.69 kB | | mutexify | 1.4.0 | 5.71 kB | 8.74 kB | | path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | module-details-from-path | 1.0.4 | 3.96 kB | 3.96 kB |🤖 This report was automatically generated by heaviest-objects-in-the-universe |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6067 +/- ##
==========================================
- Coverage 81.86% 81.84% -0.02%
==========================================
Files 473 473
Lines 19444 19468 +24
==========================================
+ Hits 15917 15934 +17
- Misses 3527 3534 +7 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
BenchmarksBenchmark execution time: 2025-07-11 19:33:53 Comparing candidate commit 483cef4 in PR branch Found 0 performance improvements and 0 performance regressions! Performance is the same for 1176 metrics, 52 unstable metrics. |
| span.setTag('messaging.system', 'servicebus') | ||
| span.setTag('messaging.destination.name', triggerEntity) | ||
| span.setTag('resource.name', 'serviceBusQueueTrigger') | ||
| span.setTag('span.kind', 'consumer') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
span.kind can be imported from ext/tags as well
Line 8 in 47d3c63
| SPAN_KIND: 'span.kind', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing as your other comments. Other plugins are not using this file. Not sure why though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally prefer strings, especially for tests where I want the expectation to be as obvious as possible without having to search for the underlying value.
| const ServiceBusClient = obj.ServiceBusClient | ||
| shimmer.wrap(ServiceBusClient.prototype, 'createSender', createSender => function (queueName) { | ||
| const sender = createSender.apply(this, arguments) | ||
| shimmer.wrap(sender._sender, 'send', send => function (msg) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When patching MessageSender.send are spans created in the following scenarios?
- Synchronous message send (no await)
ServiceBusSender.sendMessageswith a list of messages instead of a single messageServiceBusSender.scheduleMessages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Azure service bus does not support synchronous message sending
- A list did not work, because it was converted into a batch class and encoded
// Case 2: Batch message
batch = messages;
}
else {
// Case 3: Array of messages
batch = await this.createMessageBatch(options);
for (const message of messages) {
throwIfNotValidServiceBusMessage(message, errorInvalidMessageTypeSingleOrArray);
if (!batch.tryAddMessage(message, options)) {
- Did not try it, but based on the code I doubt it will work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Batching and scheduling is non-trivial. We should revisit this later. I'll make cards for us to track them.
Will re-review following passing tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a few comments but no dealbreakers, just a few things that could be improved.
One thing that I noticed also is that these plugins only have integration tests. These are usually only used to test ESM support right now. Was that done because only ESM is supported? Otherwise the tests could be moved to regular plugin tests which would run a lot faster.
| const sender = createSender.apply(this, arguments) | ||
| shimmer.wrap(sender._sender, 'send', send => function (msg) { | ||
| const ctx = { sender, msg } | ||
| return producerStartCh.runStores(ctx, () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker, but for a standard promise instrumentation, it would have been possible to use tracingChannel.tracePromise() to replace this whole block.
| span.setTag('messaging.system', 'servicebus') | ||
| span.setTag('messaging.destination.name', triggerEntity) | ||
| span.setTag('resource.name', 'serviceBusQueueTrigger') | ||
| span.setTag('span.kind', 'consumer') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally prefer strings, especially for tests where I want the expectation to be as obvious as possible without having to search for the underlying value.
| @@ -35,11 +36,19 @@ class AzureFunctionsPlugin extends TracingPlugin { | |||
| 'aas.function.trigger': mapTriggerTag(methodName) | |||
| } | |||
| }, false) | |||
| if (triggerMap[methodName] === 'ServiceBus') { | |||
| const triggerEntity = invocationContext.options.trigger.queueName || invocationContext.options.trigger.topicName | |||
| span.setTag('messaging.message_id', invocationContext.triggerMetadata.messageId) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be more efficient to prepare the meta object before startSpan depending on the trigger, and then pass that directly.
For example:
const meta = getMetaForTrigger(ctx)
const span = this.startSpan(this.operationName(), {
childOf,
service: this.serviceName(),
type: 'serverless',
meta
}, false)
function getMetaForTrigger({ functionName, methodName, invocationContext }) {
const meta = {
'aas.function.name': functionName,
'aas.function.trigger': mapTriggerTag(methodName)
}
if (triggerMap[methodName] === 'ServiceBus') {
...
}
return meta
}| @@ -35,11 +36,19 @@ class AzureFunctionsPlugin extends TracingPlugin { | |||
| 'aas.function.trigger': mapTriggerTag(methodName) | |||
| } | |||
| }, false) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this PR, but passing ctx here instead of false would remove the need to explicitly assign currentStore and parentStore to the context at the end of the function.
| @@ -75,4 +89,13 @@ function mapTriggerTag (methodName) { | |||
| return triggerMap[methodName] || 'Unknown' | |||
| } | |||
|
|
|||
| function extractTraceContext (tracer, ctx) { | |||
| switch (String(triggerMap[ctx.methodName])) { | |||
| case 'Http': | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of scope of this PR but in the future I think it might be worth it to split things as multiple plugins instead of having an ever-growing switch.
| sandbox = await createSandbox([ | ||
| `@azure/functions@${version}`, 'azure-functions-core-tools@4', '@azure/service-bus@7.9.2'], false, | ||
| ['./packages/datadog-plugin-azure-functions/test/integration-test/fixtures/*']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| sandbox = await createSandbox([ | |
| `@azure/functions@${version}`, 'azure-functions-core-tools@4', '@azure/service-bus@7.9.2'], false, | |
| ['./packages/datadog-plugin-azure-functions/test/integration-test/fixtures/*']) | |
| sandbox = await createSandbox([ | |
| `@azure/functions@${version}`, | |
| 'azure-functions-core-tools@4', | |
| '@azure/service-bus@7.9.2' | |
| ], | |
| false, | |
| ['./packages/datadog-plugin-azure-functions/test/integration-test/fixtures/*']) |
| assert.strictEqual(payload[0][0].parent_id, payload[1][1].span_id) | ||
| assert.strictEqual(payload[1][0].span_id, payload[1][1].parent_id) | ||
| }) | ||
| }).timeout(50000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 50 seconds? This seems very specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked @duncanpharvey about this before. He said it takes a while to start up the azure function runtime and 50 seconds gave enough cushion for the tests to pass reliably.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I guess I was just curious why 50 seconds and not a minute at that point 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 50 second timeout was arbitrary
|
/merge |
|
View all feedbacks in Devflow UI.
Added to the queue but the mergequeue is not enabled for now.
jordan.storms@datadoghq.com unqueued this merge request |
|
/remove |
|
View all feedbacks in Devflow UI.
|
|
/merge |
|
View all feedbacks in Devflow UI.
Added to the queue but the mergequeue is not enabled for now.
jordan.storms@datadoghq.com unqueued this merge request since it was merged manually |
|
/remove |
|
View all feedbacks in Devflow UI.
|
* initial * plugin loading * full distributed trace * all span attributes set correctly * add consumer tests * update test * finalize service bus test * remove qpid from service bus test * instrument and test topics * remove localhost from connectionstring * lint * try removing localhost * revert localhost removal * try to use newer node versions * revert node versioning for integration tests * fix azure-functions test * fix sql server & lint * try actual name of sql service * clean up azure functions * formatting * lint
* initial * plugin loading * full distributed trace * all span attributes set correctly * add consumer tests * update test * finalize service bus test * remove qpid from service bus test * instrument and test topics * remove localhost from connectionstring * lint * try removing localhost * revert localhost removal * try to use newer node versions * revert node versioning for integration tests * fix azure-functions test * fix sql server & lint * try actual name of sql service * clean up azure functions * formatting * lint
* initial * plugin loading * full distributed trace * all span attributes set correctly * add consumer tests * update test * finalize service bus test * remove qpid from service bus test * instrument and test topics * remove localhost from connectionstring * lint * try removing localhost * revert localhost removal * try to use newer node versions * revert node versioning for integration tests * fix azure-functions test * fix sql server & lint * try actual name of sql service * clean up azure functions * formatting * lint
What does this PR do?
Adds a plugin for the @azure/service-bus
sendMessagesmethodAdds instrumentation for the Service Bus Queue trigger in the Azure Functions Plugin
Motivation
Customers expect full distributed tracing for Azure Functions. Unfortunately Azure function's output bindings did not allow us to inject context into the message. This led to the addition of the Azure Service Bus library.
Plugin Checklist
Additional Notes
Adds azureservicebusemulator and azuresqledge services to run Azure Service Bus locally and in CI for tests