@Udit-takkar commented Jul 1, 2025

feat: Add distributed tracing for booking flow correlation

Part of #22969

Follow-up:

  • Add more logs and replace all logger usages with distributedTracer

Summary

This PR implements distributed tracing for Cal.com's booking flow to improve observability and debugging capabilities beyond the current per-request ID system. The implementation adds trace context propagation across multiple booking operations including calendar events, payment processing, webhook execution, and scheduled reminders.

Key Changes:

  • New tracing library (packages/lib/tracing.ts) - Centralized trace context management with hierarchical spans
  • Enhanced booking creation - Trace context initiation in handleNewBooking.ts
  • Cross-component correlation - Trace propagation through EventManager, webhook scheduling, payment processing
  • Comprehensive documentation - Debugging examples and Axiom queries demonstrating concrete benefits
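
As a rough illustration of what "hierarchical spans" could look like, here is a minimal sketch. The field names (`spanId`, `parentSpanId`, `operation`) and the helpers `createTrace` and `createSpan` are assumptions made for this example and are not taken from packages/lib/tracing.ts.

```typescript
// Hypothetical shape; the PR's actual TraceContext may differ.
interface TraceContext {
  traceId: string; // shared by every operation in one booking flow
  spanId: string; // identifies this particular operation
  parentSpanId?: string; // links a child span to the span that started it
  operation: string;
}

// Placeholder ID generator for the sketch; a real implementation would likely use a UUID.
function newId(prefix: string): string {
  return `${prefix}_${Date.now().toString(36)}_${Math.random().toString(36).slice(2, 10)}`;
}

// Start a new trace: the root span has no parent.
function createTrace(operation: string): TraceContext {
  return { traceId: newId("trace"), spanId: newId("span"), operation };
}

// Start a child span: same traceId, new spanId, parent recorded.
function createSpan(parent: TraceContext, operation: string): TraceContext {
  return {
    traceId: parent.traceId,
    spanId: newId("span"),
    parentSpanId: parent.spanId,
    operation,
  };
}

const root = createTrace("booking_creation");
const calendar = createSpan(root, "calendar_event_creation");
const webhook = createSpan(calendar, "webhook_scheduling");
```

Every span in the chain carries the root's `traceId`, so a single ID is enough to find all related log lines, while `parentSpanId` preserves which operation triggered which.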

Benefits over current request ID system:

  • Cross-Request Correlation: Links operations spanning multiple API requests (booking → webhook → cron execution)
  • Async Operation Tracking: Correlates scheduled operations back to original booking context
  • Performance Monitoring: Enables end-to-end timing analysis across entire booking lifecycle
  • Enhanced Debugging: Complete booking flow visibility in Axiom with correlation queries
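
One way cross-request correlation can work is to persist the trace context together with the scheduled job, so that a later cron execution can restore it. The sketch below assumes a hypothetical `ScheduledJob` shape and a `_traceContext` payload key; neither is confirmed by the PR.

```typescript
interface TraceContext {
  traceId: string;
  spanId: string;
}

// Hypothetical shape of a scheduled webhook stored for later execution.
interface ScheduledJob {
  subscriberUrl: string;
  payload: string; // JSON string persisted with the job
}

// Embed the trace context next to the business data so it survives the request boundary.
function scheduleJob(
  subscriberUrl: string,
  data: Record<string, unknown>,
  trace?: TraceContext
): ScheduledJob {
  return { subscriberUrl, payload: JSON.stringify({ ...data, _traceContext: trace ?? null }) };
}

// Restore the trace context in the cron handler; return undefined if it is absent or malformed.
function readTraceFromJob(job: ScheduledJob): TraceContext | undefined {
  try {
    const parsed: unknown = JSON.parse(job.payload);
    if (typeof parsed !== "object" || parsed === null) return undefined;
    const candidate = (parsed as { _traceContext?: unknown })._traceContext;
    if (typeof candidate !== "object" || candidate === null) return undefined;
    const c = candidate as Partial<TraceContext>;
    if (typeof c.traceId === "string" && typeof c.spanId === "string") {
      return { traceId: c.traceId, spanId: c.spanId };
    }
  } catch {
    // Malformed payload: fall through and run without trace context.
  }
  return undefined;
}

const job = scheduleJob(
  "https://example.com/hook",
  { bookingUid: "abc" },
  { traceId: "trace_1", spanId: "span_1" }
);
const restored = readTraceFromJob(job);
```

Returning `undefined` instead of throwing keeps jobs scheduled before tracing existed runnable, which matches the PR's stated goal of graceful fallback.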

Review & Testing Checklist for Human

  • End-to-end booking flow testing - Create bookings with different configurations and verify they complete successfully without errors
  • Trace correlation verification - Check Axiom logs to confirm trace IDs properly correlate operations across booking → calendar → webhook → payment flows
  • Async operation tracking - Verify webhook scheduling and execution maintains trace context correlation back to original booking
  • Performance impact assessment - Monitor booking flow performance to ensure tracing overhead is minimal
  • Edge case testing - Test scenarios where trace context might be missing (direct API calls, cron jobs) to ensure graceful fallback

Recommended test plan:

  1. Create standard booking and verify trace appears in Axiom with correlated operations
  2. Test payment flow and confirm payment success correlates to original booking trace
  3. Verify webhook execution traces back to originating booking
  4. Test seated bookings with multiple attendees for reminder correlation
  5. Run Axiom queries from examples to confirm they return expected results

Diagram

%%{ init : { "theme" : "default" }}%%
graph TD
    %% Main booking flow
    handleNewBooking["packages/features/bookings/lib/handleNewBooking.ts"]:::major-edit
    tracing["packages/lib/tracing.ts"]:::major-edit
    EventManager["packages/lib/EventManager.ts"]:::major-edit
    
    %% Webhook flow
    scheduleTrigger["packages/features/webhooks/lib/scheduleTrigger.ts"]:::major-edit
    handleWebhookScheduledTriggers["packages/features/webhooks/lib/handleWebhookScheduledTriggers.ts"]:::major-edit
    handleWebhookTrigger["packages/features/bookings/lib/handleWebhookTrigger.ts"]:::minor-edit
    
    %% Payment flow
    handlePaymentSuccess["packages/lib/payment/handlePaymentSuccess.ts"]:::minor-edit
    
    %% Seat booking flow
    handleSeats["packages/features/bookings/lib/handleSeats/handleSeats.ts"]:::major-edit
    scheduleNoShowTriggers["packages/features/bookings/lib/handleNewBooking/scheduleNoShowTriggers.ts"]:::minor-edit
    
    %% Reminder flow
    scheduleMandatoryReminder["packages/features/ee/workflows/lib/reminders/scheduleMandatoryReminder.ts"]:::minor-edit
    reminderScheduler["packages/features/ee/workflows/lib/reminders/reminderScheduler.ts"]:::minor-edit
    
    %% Documentation
    debuggingExamples["examples/distributed-tracing-debugging.md"]:::major-edit
    axiomQueries["examples/axiom-tracing-queries.md"]:::major-edit
    
    %% Flow connections
    handleNewBooking --> tracing
    handleNewBooking --> EventManager
    handleNewBooking --> scheduleTrigger
    handleNewBooking --> handleSeats
    
    scheduleTrigger --> handleWebhookScheduledTriggers
    handleWebhookScheduledTriggers --> handleWebhookTrigger
    
    handleNewBooking --> handlePaymentSuccess
    
    handleSeats --> scheduleMandatoryReminder
    handleNewBooking --> scheduleNoShowTriggers
    scheduleMandatoryReminder --> reminderScheduler
    
    %% Legend
    subgraph Legend
        L1["Major Edit"]:::major-edit
        L2["Minor Edit"]:::minor-edit
        L3["Context/No Edit"]:::context
    end
    
    %% Styling
    classDef major-edit fill:#90EE90
    classDef minor-edit fill:#87CEEB
    classDef context fill:#FFFFFF

Notes

Implementation approach:

  • Trace context is optional throughout - existing functionality continues to work without tracing
  • Uses existing logger infrastructure with enhanced prefixes for correlation
  • Backward compatible with current request ID logging
  • Follows existing patterns for middleware and error handling
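
To make the "optional trace context" and "enhanced prefixes" points concrete, here is a small sketch of a prefix builder. The function name `buildLogPrefix` and the `key:value` prefix format are assumptions for illustration; the PR's real prefix layout is not shown in this conversation.

```typescript
interface TraceContext {
  traceId: string;
  spanId: string;
  bookingUid?: string;
  eventTypeSlug?: string;
  userId?: number;
}

// Build the list of prefixes attached to each log line.
function buildLogPrefix(requestId: string, trace?: TraceContext): string[] {
  const prefix = [`request:${requestId}`];
  // No trace context: behave exactly like plain request-ID logging.
  if (!trace) return prefix;
  prefix.push(`trace:${trace.traceId}`, `span:${trace.spanId}`);
  if (trace.bookingUid) prefix.push(`booking:${trace.bookingUid}`);
  if (trace.eventTypeSlug) prefix.push(`event:${trace.eventTypeSlug}`);
  if (trace.userId !== undefined) prefix.push(`user:${trace.userId}`);
  return prefix;
}

const withoutTrace = buildLogPrefix("req_1").join(" ");
const withTrace = buildLogPrefix("req_1", {
  traceId: "t1",
  spanId: "s1",
  bookingUid: "abc",
  userId: 42,
}).join(" ");
```

Because the request ID is always the first prefix, existing log queries keyed on it should keep working whether or not a trace context is present.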

Axiom integration:

  • Leverages existing Axiom setup with enhanced log prefixes
  • Provides ready-to-use correlation queries for common debugging scenarios
  • Enables proactive monitoring with alerting capabilities

Link to Devin run: https://app.devin.ai/sessions/d494fd7f104f4339a390e492f94bd47a
Requested by: @Udit-takkar

⚠️ Important: This implementation could not be tested end-to-end locally due to booking system complexity. Human testing of the complete booking flow is essential before merging.

- Create centralized tracing library with trace context management
- Integrate tracing into core booking flow components
- Add trace context propagation to webhook scheduling and execution
- Enhance payment processing with trace correlation
- Add comprehensive debugging examples and Axiom queries
- Maintain backward compatibility with existing logging

Co-Authored-By: udit@cal.com <udit222001@gmail.com>

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR that start with 'DevinAI'.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring


vercel bot commented Jul 1, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
cal Ignored Ignored Preview Aug 28, 2025 6:17pm
cal-eu Ignored Ignored Preview Aug 28, 2025 6:17pm

@keithwillcode added the core (area: core, team members only) and enterprise (area: enterprise, audit log, organisation, SAML, SSO) labels Jul 1, 2025

delve-auditor bot commented Jul 1, 2025

No security or compliance issues detected. Reviewed everything up to 447ee7d.

Security Overview
  • 🔎 Scanned files: 34 changed file(s)
Detected Code Changes

The diff is too large to display a summary of code changes.

Reply to this PR with @delve-auditor followed by a description of what change you want and we'll auto-submit a change to this PR to implement it.


graphite-app bot commented Jul 8, 2025

Graphite Automations

"Add consumer team as reviewer" took an action on this PR • (07/08/25)

1 reviewer was added to this PR based on Keith Williams's automation.

@dosubot bot added the bookings (area: bookings, availability, timezones, double booking) and ✨ feature (New feature or request) labels Jul 8, 2025
devin-ai-integration bot and others added 7 commits July 8, 2025 11:32
…g logger

- Enhance TraceContext to include event-specific context (eventTypeSlug, userInfo)
- Update DistributedTracing.getTracingLogger to include event context in prefix
- Replace all loggerWithEventDetails usages with tracingLogger in handleNewBooking.ts
- Add traceId to frontend error display for user support
- Maintain backward compatibility while improving observability

Co-Authored-By: udit@cal.com <udit222001@gmail.com>
…files

- Add trace context support to ensureAvailableUsers, validateBookingTimeIsNotOutOfBounds, validateEventLength
- Update seated booking reschedule functions to use distributed tracing
- Ensure all booking flow components support trace context propagation
- Maintain backward compatibility while enhancing observability

Co-Authored-By: udit@cal.com <udit222001@gmail.com>
- Update defaultResponder to extract and include traceId in error responses
- Add distributed tracing to api/book/event endpoint with trace context
- Include traceId in error data when booking API throws exceptions
- Enable frontend to display traceId as Reference ID for user support

Co-Authored-By: udit@cal.com <udit222001@gmail.com>
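
The commit above describes pulling a `traceId` out of a thrown error and returning it to the client. A minimal sketch of that idea follows, assuming the ID travels on a `data.traceId` property of the error; `extractTraceId` and `toErrorBody` are hypothetical helpers, not the actual defaultResponder code.

```typescript
interface ErrorBody {
  message: string;
  data?: { traceId: string };
}

// Caught errors are `unknown`, so check the shape before reading traceId.
function extractTraceId(error: unknown): string | undefined {
  if (typeof error !== "object" || error === null) return undefined;
  const data = (error as { data?: unknown }).data;
  if (typeof data !== "object" || data === null) return undefined;
  const traceId = (data as { traceId?: unknown }).traceId;
  return typeof traceId === "string" ? traceId : undefined;
}

// Build the JSON body for an error response, attaching traceId only when one is present.
function toErrorBody(error: unknown): ErrorBody {
  const message = error instanceof Error ? error.message : "Unknown error";
  const traceId = extractTraceId(error);
  return traceId ? { message, data: { traceId } } : { message };
}

const tracedError = Object.assign(new Error("Booking failed"), {
  data: { traceId: "trace_123" },
});
const tracedBody = toErrorBody(tracedError);
const plainBody = toErrorBody(new Error("No trace here"));
```

A frontend could then show `tracedBody.data.traceId` as the Reference ID that a user quotes to support.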
- Modified handleNewBooking to merge passed traceContext with additional properties instead of creating new one
- Fixed type error in defaultResponder.ts by adding proper type checking for traceId
- Ensures same traceId flows through entire booking process as requested by user

Co-Authored-By: udit@cal.com <udit222001@gmail.com>
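
The merge behaviour described in the commit above can be sketched as follows: reuse a passed-in context and add properties to it, and only create a new trace when none was passed. The `meta` field and the `ensureTraceContext` helper are assumptions for this example, not the PR's code.

```typescript
interface TraceContext {
  traceId: string;
  spanId: string;
  meta: Record<string, string | number>;
}

// Placeholder ID generator for the sketch; a real implementation would likely use a UUID.
function newId(prefix: string): string {
  return `${prefix}_${Date.now().toString(36)}_${Math.random().toString(36).slice(2, 10)}`;
}

// Reuse the caller's traceId when a context is passed; otherwise start a fresh trace.
function ensureTraceContext(
  passed: TraceContext | undefined,
  extra: Record<string, string | number>
): TraceContext {
  if (passed) {
    // Copy instead of mutating so the caller's object is left untouched.
    return { ...passed, meta: { ...passed.meta, ...extra } };
  }
  return { traceId: newId("trace"), spanId: newId("span"), meta: { ...extra } };
}

const incoming: TraceContext = { traceId: "trace_api", spanId: "span_1", meta: { source: "api" } };
const merged = ensureTraceContext(incoming, { eventTypeSlug: "30min" });
const fresh = ensureTraceContext(undefined, { eventTypeSlug: "30min" });
```

Keeping the incoming `traceId` is what lets the API endpoint and the booking handler log under the same trace.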
- Add bookingUid and userId to getTracingLogger prefixes for better context
- Update createLoggerWithEventDetails to accept existing trace context parameter
- Allow reusing existing trace contexts instead of always creating new ones
- Maintain same traceId throughout booking flow while adding event-specific details

Co-Authored-By: udit@cal.com <udit222001@gmail.com>
Co-Authored-By: udit@cal.com <udit222001@gmail.com>
…tor approach

Co-Authored-By: udit@cal.com <udit222001@gmail.com>
@devin-ai-integration bot force-pushed the devin/distributed-tracing-booking-flow-1751401342 branch from ccb4c2b to b4639e2 Jul 8, 2025 11:34

@cubic-dev-ai bot left a comment

cubic found 19 issues across 26 files. Review them in cubic.dev

React with 👍 or 👎 to teach cubic. Tag @cubic-dev-ai to give specific feedback.

…ipt error

Co-Authored-By: udit@cal.com <udit222001@gmail.com>
@CarinaWolli modified the milestones: v5.6, v5.7 Aug 18, 2025
@@ -409,6 +409,7 @@ export type BookingHandlerInput = {
// These used to come from headers but now we're passing them as params
hostname?: string;
forcedSlug?: string;
traceContext?: TraceContext;
Contributor Author

This is called by API v1, the web app, etc., so it is optional for now. In a follow-up PR it will become required after some testing.

Labels

  • bookings (area: bookings, availability, timezones, double booking)
  • core (area: core, team members only)
  • enterprise (area: enterprise, audit log, organisation, SAML, SSO)
  • ✨ feature (New feature or request)
  • platform (Anything related to our platform plan)
  • ready-for-e2e