Skip to content

Conversation

giortzisg
Copy link
Contributor

@giortzisg giortzisg commented Jun 25, 2025

Partial fix for #1047.

Description

  • Fixing a race condition on GetSpan by adding a mutex.
  • Guarding transport so that you can only close the done channel once, to avoid panics.

@giortzisg giortzisg requested a review from cleptric June 25, 2025 10:20
@giortzisg giortzisg self-assigned this Jun 25, 2025
Copy link

codecov bot commented Jun 25, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.75%. Comparing base (d20dadd) to head (6308088).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1044      +/-   ##
==========================================
+ Coverage   86.70%   86.75%   +0.04%     
==========================================
  Files          55       55              
  Lines        5799     5804       +5     
==========================================
+ Hits         5028     5035       +7     
+ Misses        629      627       -2     
  Partials      142      142              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@cleptric
Copy link
Member

@sentry review

Copy link

On it! We are reviewing the PR and will provide feedback shortly.

Copy link

PR Description

This pull request addresses two main issues: thread safety in accessing the span field within the Scope struct and preventing potential panics caused by closing the done channel multiple times in the HTTPTransport struct. The goal is to improve the reliability and stability of the Sentry Go SDK.

Click to see more

Key Technical Changes

  1. Scope.GetSpan() Mutex Protection: The GetSpan() method in scope.go now uses a read lock (sync.RWMutex) to protect access to the span field. This ensures thread-safe access when multiple goroutines might be reading the span concurrently.
  2. HTTPTransport.Close() Safe Channel Closing: The Close() method in transport.go now uses sync.Once to ensure that the done channel is closed only once. This prevents panics that could occur if Close() was called multiple times, potentially from different goroutines.
  3. HTTPTransport Close Test: A new test case, TestHTTPTransport_CloseMultipleTimes, is added to transport_test.go to verify that calling Close() multiple times on an HTTPTransport instance does not result in a panic.

Architecture Decisions

The architectural decisions involve using standard Go concurrency primitives (sync.RWMutex and sync.Once) to address the identified issues. This approach avoids introducing external dependencies and aligns with the existing concurrency patterns within the SDK.

Dependencies and Interactions

These changes primarily affect the internal workings of the Scope and HTTPTransport structs. The Scope changes impact any code that concurrently accesses the span field. The HTTPTransport changes affect the shutdown process of the transport, ensuring that the done channel is closed safely. There are no new external dependencies introduced.

Risk Considerations

The addition of mutexes in Scope.GetSpan() could potentially introduce performance overhead, although it is expected to be minimal due to the read-heavy nature of the operation. The HTTPTransport changes are low-risk, as they primarily address a potential panic scenario and do not alter the core event sending logic. Reviewers should pay special attention to the locking strategy in Scope.GetSpan() to ensure it doesn't introduce unintended contention.

Notable Implementation Details

The use of sync.Once in HTTPTransport.Close() is a standard pattern for ensuring that a function is executed only once, even when called concurrently. The test case TestHTTPTransport_CloseMultipleTimes provides explicit verification of the fix for the double-close issue.

case <-transport.done:
// Expected - channel should be closed
case <-time.After(time.Second):
t.Fatal("tranposrt.done not closed")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
t.Fatal("tranposrt.done not closed")
t.Fatal("transport.done not closed")

transport.go Outdated
@@ -303,7 +303,8 @@ type HTTPTransport struct {
// current in-flight items and starts a new batch for subsequent events.
buffer chan batch

start sync.Once
start sync.Once
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l: is naming things <something>Once a convention in go? Then maybe we could rename this to startOnce also for consistency between the two (could also be done in a separate PR though)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kinda is. Makes sense will change for consistency.

@giortzisg giortzisg merged commit 221b2ac into master Jun 30, 2025
18 checks passed
@giortzisg giortzisg deleted the race-cond branch June 30, 2025 10:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants