Skip to content

Conversation

ezynda3
Copy link
Contributor

@ezynda3 ezynda3 commented Aug 5, 2025

Description

This PR fixes a critical race condition in the stdio transport that occurs when multiple concurrent tool calls are made. The issue causes "bad file descriptor" errors and permanent stdin corruption after 2-4 concurrent calls.

Fixes #528

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • Tests only (no functional changes)

Checklist

  • My code follows the code style of this project
  • I have performed a self-review of my own code
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the documentation accordingly

Additional Information

Root Cause

The stdio transport was spawning a new goroutine for each tools/call request to avoid blocking on sampling requests. However, multiple goroutines were accessing the same bufio.Reader on stdin, which is not thread-safe. This led to corrupted reads and permanent stdin failure.

Solution

Replaced the goroutine-per-request pattern with a worker pool architecture:

  • Fixed pool of worker goroutines (default: 5, configurable)
  • Buffered queue for tool calls (default: 100 capacity, configurable)
  • Mutex protection for concurrent writes
  • Graceful shutdown with proper cleanup

Testing

  • Added comprehensive test for concurrent tool calls
  • All existing tests pass
  • Race detector confirms no race conditions
  • Maintains full backward compatibility

🤖 Generated with opencode

Summary by CodeRabbit

  • New Features

    • Improved concurrent processing of "tools/call" requests with a configurable worker pool and queue limits.
    • Added options to customize worker thread count and queue size for tool call handling.
  • Bug Fixes

    • Enhanced output handling to prevent issues from simultaneous writes during concurrent processing.
  • Tests

    • Added tests verifying concurrent tool call handling, response correctness, and configuration option boundaries.

Fixes #528 by replacing goroutine-per-request pattern with a worker pool
to prevent multiple goroutines from accessing the non-thread-safe bufio.Reader.

- Add worker pool pattern with configurable size (default: 5 workers)
- Add buffered queue for tool calls (default: 100 capacity)
- Protect concurrent writes with mutex
- Maintain backward compatibility while fixing the race condition
- Add comprehensive test for concurrent tool calls

🤖 Generated with [opencode](https://opencode.ai)

Co-Authored-By: opencode <noreply@opencode.ai>
Copy link
Contributor

coderabbitai bot commented Aug 5, 2025

Walkthrough

A worker pool and bounded queue were introduced in the StdioServer to handle "tools/call" JSON-RPC messages concurrently, replacing unbounded goroutine spawning. Output writes are now serialized with a mutex for thread safety. Tests were added to verify correct concurrent tool call handling and response integrity.

Changes

Cohort / File(s) Change Summary
StdioServer concurrency and worker pool
server/stdio.go
Added a worker pool and bounded queue for "tools/call" message processing; introduced concurrency-safe output serialization with a mutex; new configuration options for pool and queue size; refactored message processing and server lifecycle for safe concurrent handling.
Concurrent tool call and configuration tests
server/stdio_test.go
Added tests to validate concurrent tool call handling, ensuring correct response count, error absence, and per-tool invocation under concurrent load; added tests for configuration option bounds on worker pool and queue sizes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Assessment against linked issues

Objective Addressed Explanation
Eliminate race condition from concurrent goroutine use on bufio.Reader (server/stdio.go) (#528)
Ensure only one goroutine reads from stdin at a time (#528)
Serialize output writes to prevent concurrent write races (#528)
Add tests to verify concurrent tool call handling and absence of race conditions (#528)

Assessment against linked issues: Out-of-scope changes

No out-of-scope changes detected.

Suggested reviewers

  • rwjblue-glean
  • pottekkat
  • dugenkui03

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0ee38c4 and f8ffa62.

📒 Files selected for processing (1)
  • server/stdio_test.go (2 hunks)
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: floatingIce91
PR: mark3labs/mcp-go#401
File: server/server.go:1082-1092
Timestamp: 2025-06-23T11:10:42.948Z
Learning: In Go MCP server, ServerTool.Tool field is only used for tool listing and indexing, not for tool execution or middleware. During handleToolCall, only the Handler field is used, so dynamic tools don't need the Tool field populated.
📚 Learning: in the mcptest package, prefer returning errors from helper functions rather than calling t.fatalf()...
Learnt from: octo
PR: mark3labs/mcp-go#149
File: mcptest/mcptest.go:0-0
Timestamp: 2025-04-21T21:26:32.945Z
Learning: In the mcptest package, prefer returning errors from helper functions rather than calling t.Fatalf() directly, giving callers flexibility in how to handle errors.

Applied to files:

  • server/stdio_test.go
📚 Learning: in go mcp server, servertool.tool field is only used for tool listing and indexing, not for tool exe...
Learnt from: floatingIce91
PR: mark3labs/mcp-go#401
File: server/server.go:1082-1092
Timestamp: 2025-06-23T11:10:42.948Z
Learning: In Go MCP server, ServerTool.Tool field is only used for tool listing and indexing, not for tool execution or middleware. During handleToolCall, only the Handler field is used, so dynamic tools don't need the Tool field populated.

Applied to files:

  • server/stdio_test.go
📚 Learning: in the mark3labs/mcp-go project, the mcpserver.capabilities field is a struct value (servercapabilit...
Learnt from: ezynda3
PR: mark3labs/mcp-go#461
File: server/sampling.go:22-26
Timestamp: 2025-06-30T07:13:17.052Z
Learning: In the mark3labs/mcp-go project, the MCPServer.capabilities field is a struct value (serverCapabilities), not a pointer, so it cannot be nil and doesn't require nil checking. Only pointer fields within the capabilities struct should be checked for nil.

Applied to files:

  • server/stdio_test.go
📚 Learning: the tool struct in the mark3labs/mcp-go project should handle both inputschema and rawinputschema co...
Learnt from: xinwo
PR: mark3labs/mcp-go#35
File: mcp/tools.go:0-0
Timestamp: 2025-03-04T07:00:57.111Z
Learning: The Tool struct in the mark3labs/mcp-go project should handle both InputSchema and RawInputSchema consistently between MarshalJSON and UnmarshalJSON methods, even though the tools response from MCP server typically doesn't contain rawInputSchema.

Applied to files:

  • server/stdio_test.go
📚 Learning: tool responses from the mcp server shouldn't contain rawinputschema, which is why the unmarshaljson ...
Learnt from: xinwo
PR: mark3labs/mcp-go#35
File: mcp/tools.go:107-137
Timestamp: 2025-03-04T06:59:43.882Z
Learning: Tool responses from the MCP server shouldn't contain RawInputSchema, which is why the UnmarshalJSON method for the Tool struct is implemented to handle only the structured InputSchema format.

Applied to files:

  • server/stdio_test.go
📚 Learning: the tool struct in mark3labs/mcp-go handles both inputschema and rawinputschema formats. when unmars...
Learnt from: xinwo
PR: mark3labs/mcp-go#35
File: mcp/tools.go:0-0
Timestamp: 2025-03-04T07:00:57.111Z
Learning: The Tool struct in mark3labs/mcp-go handles both InputSchema and RawInputSchema formats. When unmarshaling JSON, it first tries to parse into a structured ToolInputSchema format, and if that fails, it falls back to using the raw schema format, providing symmetry with the MarshalJSON method.

Applied to files:

  • server/stdio_test.go
📚 Learning: in the mcp-go library, the getpromptparams.arguments field is of type map[string]string, not map[str...
Learnt from: lariel-fernandes
PR: mark3labs/mcp-go#428
File: www/docs/pages/servers/prompts.mdx:218-234
Timestamp: 2025-06-20T20:39:51.870Z
Learning: In the mcp-go library, the GetPromptParams.Arguments field is of type map[string]string, not map[string]interface{}, so direct string access without type assertions is safe and correct.

Applied to files:

  • server/stdio_test.go
📚 Learning: in mcp-go project, the maintainer prefers keeping builder pattern apis simple without excessive vali...
Learnt from: davidleitw
PR: mark3labs/mcp-go#451
File: mcp/tools.go:1192-1217
Timestamp: 2025-06-26T09:38:18.629Z
Learning: In mcp-go project, the maintainer prefers keeping builder pattern APIs simple without excessive validation for edge cases. The WithOutput* functions are designed to assume correct usage rather than defensive programming, following the principle of API simplicity over comprehensive validation.

Applied to files:

  • server/stdio_test.go
🔇 Additional comments (8)
server/stdio_test.go (8)

7-7: LGTM! Import additions are justified.

The new imports (fmt, sync, time) are necessary for the concurrent tool call testing functionality.

Also applies to: 11-11, 13-13


274-303: LGTM! Well-designed concurrent test setup.

The test properly sets up concurrent tool execution with:

  • Thread-safe call tracking using sync.Map
  • Multiple tools to distribute load
  • Proper closure capture for tool names
  • Realistic work simulation

387-400: LGTM! Improved timeout and control flow.

The timeout has been appropriately reduced to 2 seconds and the control flow now uses idiomatic labeled break instead of goto.


406-422: LGTM! Comprehensive response validation.

The response validation properly:

  • Verifies all expected responses were received
  • Checks for errors in responses
  • Handles JSON unmarshaling with proper error checking
  • Validates response structure

424-432: LGTM! Proper validation of tool execution distribution.

The call count validation correctly uses sync.Map.Range to verify all tools were invoked during concurrent execution.


448-459: LGTM! Correct bounds testing for worker pool size.

The test now properly expects the maximum value (100) when exceeding bounds and correctly validates in-range values.


461-474: LGTM! Correct bounds testing for queue size.

The test now properly expects the maximum value (10000) when exceeding bounds and correctly validates in-range values.


475-485: LGTM! Proper edge case testing for invalid inputs.

The test correctly verifies that zero and negative values fall back to appropriate defaults (worker pool size 5, queue size 100).

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix-stdio-race-condition

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

- Add error checking for Write operations in concurrent tool calls test
- Ensures all error return values are properly handled

🤖 Generated with [opencode](https://opencode.ai)

Co-Authored-By: opencode <noreply@opencode.ai>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
server/stdio.go (2)

68-83: Consider adding upper bounds validation

While the options check for positive values, consider adding upper bounds to prevent excessive resource usage.

 // WithWorkerPoolSize sets the number of workers for processing tool calls
 func WithWorkerPoolSize(size int) StdioOption {
 	return func(s *StdioServer) {
-		if size > 0 {
+		if size > 0 && size <= 100 {
 			s.workerPoolSize = size
+		} else if size > 100 {
+			s.errLogger.Printf("Worker pool size %d exceeds maximum (100), using default", size)
 		}
 	}
 }

 // WithQueueSize sets the size of the tool call queue
 func WithQueueSize(size int) StdioOption {
 	return func(s *StdioServer) {
-		if size > 0 {
+		if size > 0 && size <= 10000 {
 			s.queueSize = size
+		} else if size > 10000 {
+			s.errLogger.Printf("Queue size %d exceeds maximum (10000), using default", size)
 		}
 	}
 }

319-340: Consider adding context to write errors

The worker correctly processes tool calls but could provide more context when write errors occur.

 			// Process the tool call
 			response := s.server.HandleMessage(work.ctx, work.message)
 			if response != nil {
 				if err := s.writeResponse(response, work.writer); err != nil {
-					s.errLogger.Printf("Error writing tool response: %v", err)
+					// Extract method name for better error context
+					var req struct {
+						Method string `json:"method"`
+						ID     any    `json:"id"`
+					}
+					json.Unmarshal(work.message, &req)
+					s.errLogger.Printf("Error writing response for %s (id: %v): %v", req.Method, req.ID, err)
 				}
 			}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6da5cd1 and a6bb31e.

📒 Files selected for processing (2)
  • server/stdio.go (8 hunks)
  • server/stdio_test.go (2 hunks)
🧰 Additional context used
🧠 Learnings (8)
📓 Common learnings
Learnt from: floatingIce91
PR: mark3labs/mcp-go#401
File: server/server.go:1082-1092
Timestamp: 2025-06-23T11:10:42.948Z
Learning: In Go MCP server, ServerTool.Tool field is only used for tool listing and indexing, not for tool execution or middleware. During handleToolCall, only the Handler field is used, so dynamic tools don't need the Tool field populated.
📚 Learning: in go mcp server, servertool.tool field is only used for tool listing and indexing, not for tool exe...
Learnt from: floatingIce91
PR: mark3labs/mcp-go#401
File: server/server.go:1082-1092
Timestamp: 2025-06-23T11:10:42.948Z
Learning: In Go MCP server, ServerTool.Tool field is only used for tool listing and indexing, not for tool execution or middleware. During handleToolCall, only the Handler field is used, so dynamic tools don't need the Tool field populated.

Applied to files:

  • server/stdio_test.go
  • server/stdio.go
📚 Learning: in the mcptest package, prefer returning errors from helper functions rather than calling t.fatalf()...
Learnt from: octo
PR: mark3labs/mcp-go#149
File: mcptest/mcptest.go:0-0
Timestamp: 2025-04-21T21:26:32.945Z
Learning: In the mcptest package, prefer returning errors from helper functions rather than calling t.Fatalf() directly, giving callers flexibility in how to handle errors.

Applied to files:

  • server/stdio_test.go
📚 Learning: in the mark3labs/mcp-go project, the mcpserver.capabilities field is a struct value (servercapabilit...
Learnt from: ezynda3
PR: mark3labs/mcp-go#461
File: server/sampling.go:22-26
Timestamp: 2025-06-30T07:13:17.052Z
Learning: In the mark3labs/mcp-go project, the MCPServer.capabilities field is a struct value (serverCapabilities), not a pointer, so it cannot be nil and doesn't require nil checking. Only pointer fields within the capabilities struct should be checked for nil.

Applied to files:

  • server/stdio_test.go
  • server/stdio.go
📚 Learning: in mcp-go project, the maintainer prefers keeping builder pattern apis simple without excessive vali...
Learnt from: davidleitw
PR: mark3labs/mcp-go#451
File: mcp/tools.go:1192-1217
Timestamp: 2025-06-26T09:38:18.629Z
Learning: In mcp-go project, the maintainer prefers keeping builder pattern APIs simple without excessive validation for edge cases. The WithOutput* functions are designed to assume correct usage rather than defensive programming, following the principle of API simplicity over comprehensive validation.

Applied to files:

  • server/stdio_test.go
📚 Learning: in the mcp-go library, the getpromptparams.arguments field is of type map[string]string, not map[str...
Learnt from: lariel-fernandes
PR: mark3labs/mcp-go#428
File: www/docs/pages/servers/prompts.mdx:218-234
Timestamp: 2025-06-20T20:39:51.870Z
Learning: In the mcp-go library, the GetPromptParams.Arguments field is of type map[string]string, not map[string]interface{}, so direct string access without type assertions is safe and correct.

Applied to files:

  • server/stdio_test.go
📚 Learning: the tool struct in the mark3labs/mcp-go project should handle both inputschema and rawinputschema co...
Learnt from: xinwo
PR: mark3labs/mcp-go#35
File: mcp/tools.go:0-0
Timestamp: 2025-03-04T07:00:57.111Z
Learning: The Tool struct in the mark3labs/mcp-go project should handle both InputSchema and RawInputSchema consistently between MarshalJSON and UnmarshalJSON methods, even though the tools response from MCP server typically doesn't contain rawInputSchema.

Applied to files:

  • server/stdio_test.go
📚 Learning: tool responses from the mcp server shouldn't contain rawinputschema, which is why the unmarshaljson ...
Learnt from: xinwo
PR: mark3labs/mcp-go#35
File: mcp/tools.go:107-137
Timestamp: 2025-03-04T06:59:43.882Z
Learning: Tool responses from the MCP server shouldn't contain RawInputSchema, which is why the UnmarshalJSON method for the Tool struct is implemented to handle only the structured InputSchema format.

Applied to files:

  • server/stdio_test.go
  • server/stdio.go
🔇 Additional comments (5)
server/stdio_test.go (1)

280-297: Remove redundant synchronization

The sync.Map is already thread-safe for concurrent operations. The additional mutex is unnecessary and adds complexity.

-		// Track tool call executions
-		var callCount sync.Map
-		var mu sync.Mutex
+		// Track tool call executions (sync.Map is already thread-safe)
+		var callCount sync.Map

 		// Create server with test tools
 		mcpServer := NewMCPServer("test", "1.0.0")

 		// Add multiple tools that simulate work and track concurrent execution
 		for i := 0; i < 5; i++ {
 			toolName := fmt.Sprintf("test_tool_%d", i)
 			mcpServer.AddTool(
 				mcp.NewTool(toolName),
 				func(name string) func(ctx context.Context, request mcp.CallToolRequest) (*mcp.CallToolResult, error) {
 					return func(ctx context.Context, request mcp.CallToolRequest) (*mcp.CallToolResult, error) {
 						// Track concurrent executions
-						mu.Lock()
 						count, _ := callCount.LoadOrStore(name, 0)
 						callCount.Store(name, count.(int)+1)
-						mu.Unlock()
⛔ Skipped due to learnings
Learnt from: floatingIce91
PR: mark3labs/mcp-go#401
File: server/server.go:1082-1092
Timestamp: 2025-06-23T11:10:42.948Z
Learning: In Go MCP server, ServerTool.Tool field is only used for tool listing and indexing, not for tool execution or middleware. During handleToolCall, only the Handler field is used, so dynamic tools don't need the Tool field populated.
Learnt from: octo
PR: mark3labs/mcp-go#149
File: mcptest/mcptest.go:0-0
Timestamp: 2025-04-21T21:26:32.945Z
Learning: In the mcptest package, prefer returning errors from helper functions rather than calling t.Fatalf() directly, giving callers flexibility in how to handle errors.
Learnt from: ezynda3
PR: mark3labs/mcp-go#461
File: server/sampling.go:22-26
Timestamp: 2025-06-30T07:13:17.052Z
Learning: In the mark3labs/mcp-go project, the MCPServer.capabilities field is a struct value (serverCapabilities), not a pointer, so it cannot be nil and doesn't require nil checking. Only pointer fields within the capabilities struct should be checked for nil.
Learnt from: tylergannon
PR: mark3labs/mcp-go#62
File: server/callbacks.go:48-70
Timestamp: 2025-03-21T06:11:18.927Z
Learning: The callbacks in MCPServer are designed to be configured only during initialization via the WithCallbacks option and not modified during operation, which eliminates the need for concurrency protection.
Learnt from: xinwo
PR: mark3labs/mcp-go#35
File: mcp/tools.go:107-137
Timestamp: 2025-03-04T06:59:43.882Z
Learning: Tool responses from the MCP server shouldn't contain RawInputSchema, which is why the UnmarshalJSON method for the Tool struct is implemented to handle only the structured InputSchema format.
server/stdio.go (4)

33-39: LGTM! Well-structured concurrency control

The addition of worker pool, bounded queue, and write mutex properly addresses the race condition issue. The design with configurable pool and queue sizes provides good flexibility.


376-412: Excellent implementation of graceful shutdown

The initialization of the queue, worker pool startup, and graceful shutdown sequence properly ensures all pending work is completed before exit. This correctly addresses the race condition issue.


549-558: Write synchronization properly implemented

The mutex protection for concurrent writes correctly prevents the race condition that was causing "bad file descriptor" errors.


456-457: Use error-level logging for queue overflow

Queue overflow is an important operational issue that should be logged at error level, not just Printf.

 		default:
 			// Queue is full, process synchronously as fallback
-			s.errLogger.Printf("Tool call queue full, processing synchronously")
+			s.errLogger.Printf("ERROR: Tool call queue full (capacity: %d), falling back to synchronous processing", s.queueSize)

Likely an incorrect or invalid review comment.

ezynda3 and others added 2 commits August 5, 2025 05:57
sync.Map is already thread-safe and doesn't require additional mutex protection.
This simplifies the test code while maintaining the same functionality.

🤖 Generated with [opencode](https://opencode.ai)

Co-Authored-By: opencode <noreply@opencode.ai>
…dation

- Replace goto statements with labeled break for more idiomatic Go
- Reduce test timeout from 5s to 2s for faster test execution
- Add upper bounds validation for worker pool size (max 100) and queue size (max 10000)
- Add tests to verify configuration bounds are respected
- Log warnings when configuration values exceed maximum limits

Addresses remaining CodeRabbit review feedback.

🤖 Generated with [opencode](https://opencode.ai)

Co-Authored-By: opencode <noreply@opencode.ai>
s.workerPoolSize = size
} else if size > 100 {
s.errLogger.Printf("Worker pool size %d exceeds maximum (100), using default: %d", size, s.workerPoolSize)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the size exceeds the maximum limit, falling back to the default value (100) is not appropriate, as the user clearly expects a larger queue size. maximum seems to be a better choice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: 0ee38c4

s.queueSize = size
} else if size > 10000 {
s.errLogger.Printf("Queue size %d exceeds maximum (10000), using default: %d", size, s.queueSize)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using maximum value would be better (instead of default value)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: 0ee38c4

When WithWorkerPoolSize or WithQueueSize receive values exceeding their
maximum limits, use the maximum allowed value instead of falling back
to the default. This better aligns with user expectations when they
request large pool/queue sizes.

🤖 Generated with [opencode](https://opencode.ai)

Co-Authored-By: opencode <noreply@opencode.ai>
Copy link
Contributor Author

@ezynda3 ezynda3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: 0ee38c4

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
server/stdio_test.go (2)

337-344: Fix error handling in initialization.

The code ignores errors from JSON marshaling and doesn't properly handle scanner failures during initialization.

Apply this diff to add proper error handling:

-		requestBytes, _ := json.Marshal(initRequest)
-		if _, err := stdinWriter.Write(append(requestBytes, '\n')); err != nil {
-			t.Fatalf("Failed to write init request: %v", err)
-		}
+		requestBytes, err := json.Marshal(initRequest)
+		if err != nil {
+			t.Fatalf("Failed to marshal init request: %v", err)
+		}
+		if _, err := stdinWriter.Write(append(requestBytes, '\n')); err != nil {
+			t.Fatalf("Failed to write init request: %v", err)
+		}

 		// Read init response
 		scanner := bufio.NewScanner(stdoutReader)
-		scanner.Scan()
+		if !scanner.Scan() {
+			t.Fatal("Failed to read init response")
+		}

365-368: Add error handling in tool request goroutine.

The goroutine sending tool requests ignores JSON marshaling errors, which could lead to silent test failures.

Apply this diff to add proper error handling:

-				requestBytes, _ := json.Marshal(toolRequest)
-				if _, err := stdinWriter.Write(append(requestBytes, '\n')); err != nil {
-					t.Errorf("Failed to write tool request %d: %v", id, err)
-				}
+				requestBytes, err := json.Marshal(toolRequest)
+				if err != nil {
+					t.Errorf("Failed to marshal tool request %d: %v", id, err)
+					return
+				}
+				if _, err := stdinWriter.Write(append(requestBytes, '\n')); err != nil {
+					t.Errorf("Failed to write tool request %d: %v", id, err)
+					return
+				}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 47b4408 and 0ee38c4.

📒 Files selected for processing (2)
  • server/stdio.go (8 hunks)
  • server/stdio_test.go (2 hunks)
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: floatingIce91
PR: mark3labs/mcp-go#401
File: server/server.go:1082-1092
Timestamp: 2025-06-23T11:10:42.948Z
Learning: In Go MCP server, ServerTool.Tool field is only used for tool listing and indexing, not for tool execution or middleware. During handleToolCall, only the Handler field is used, so dynamic tools don't need the Tool field populated.
📚 Learning: in go mcp server, servertool.tool field is only used for tool listing and indexing, not for tool exe...
Learnt from: floatingIce91
PR: mark3labs/mcp-go#401
File: server/server.go:1082-1092
Timestamp: 2025-06-23T11:10:42.948Z
Learning: In Go MCP server, ServerTool.Tool field is only used for tool listing and indexing, not for tool execution or middleware. During handleToolCall, only the Handler field is used, so dynamic tools don't need the Tool field populated.

Applied to files:

  • server/stdio.go
  • server/stdio_test.go
📚 Learning: in the mark3labs/mcp-go project, the mcpserver.capabilities field is a struct value (servercapabilit...
Learnt from: ezynda3
PR: mark3labs/mcp-go#461
File: server/sampling.go:22-26
Timestamp: 2025-06-30T07:13:17.052Z
Learning: In the mark3labs/mcp-go project, the MCPServer.capabilities field is a struct value (serverCapabilities), not a pointer, so it cannot be nil and doesn't require nil checking. Only pointer fields within the capabilities struct should be checked for nil.

Applied to files:

  • server/stdio.go
  • server/stdio_test.go
📚 Learning: tool responses from the mcp server shouldn't contain rawinputschema, which is why the unmarshaljson ...
Learnt from: xinwo
PR: mark3labs/mcp-go#35
File: mcp/tools.go:107-137
Timestamp: 2025-03-04T06:59:43.882Z
Learning: Tool responses from the MCP server shouldn't contain RawInputSchema, which is why the UnmarshalJSON method for the Tool struct is implemented to handle only the structured InputSchema format.

Applied to files:

  • server/stdio.go
  • server/stdio_test.go
📚 Learning: in the mcptest package, prefer returning errors from helper functions rather than calling t.fatalf()...
Learnt from: octo
PR: mark3labs/mcp-go#149
File: mcptest/mcptest.go:0-0
Timestamp: 2025-04-21T21:26:32.945Z
Learning: In the mcptest package, prefer returning errors from helper functions rather than calling t.Fatalf() directly, giving callers flexibility in how to handle errors.

Applied to files:

  • server/stdio_test.go
📚 Learning: the tool struct in the mark3labs/mcp-go project should handle both inputschema and rawinputschema co...
Learnt from: xinwo
PR: mark3labs/mcp-go#35
File: mcp/tools.go:0-0
Timestamp: 2025-03-04T07:00:57.111Z
Learning: The Tool struct in the mark3labs/mcp-go project should handle both InputSchema and RawInputSchema consistently between MarshalJSON and UnmarshalJSON methods, even though the tools response from MCP server typically doesn't contain rawInputSchema.

Applied to files:

  • server/stdio_test.go
📚 Learning: the tool struct in mark3labs/mcp-go handles both inputschema and rawinputschema formats. when unmars...
Learnt from: xinwo
PR: mark3labs/mcp-go#35
File: mcp/tools.go:0-0
Timestamp: 2025-03-04T07:00:57.111Z
Learning: The Tool struct in mark3labs/mcp-go handles both InputSchema and RawInputSchema formats. When unmarshaling JSON, it first tries to parse into a structured ToolInputSchema format, and if that fails, it falls back to using the raw schema format, providing symmetry with the MarshalJSON method.

Applied to files:

  • server/stdio_test.go
📚 Learning: in mcp-go project, the maintainer prefers keeping builder pattern apis simple without excessive vali...
Learnt from: davidleitw
PR: mark3labs/mcp-go#451
File: mcp/tools.go:1192-1217
Timestamp: 2025-06-26T09:38:18.629Z
Learning: In mcp-go project, the maintainer prefers keeping builder pattern APIs simple without excessive validation for edge cases. The WithOutput* functions are designed to assume correct usage rather than defensive programming, following the principle of API simplicity over comprehensive validation.

Applied to files:

  • server/stdio_test.go
📚 Learning: in the mcp-go library, the getpromptparams.arguments field is of type map[string]string, not map[str...
Learnt from: lariel-fernandes
PR: mark3labs/mcp-go#428
File: www/docs/pages/servers/prompts.mdx:218-234
Timestamp: 2025-06-20T20:39:51.870Z
Learning: In the mcp-go library, the GetPromptParams.Arguments field is of type map[string]string, not map[string]interface{}, so direct string access without type assertions is safe and correct.

Applied to files:

  • server/stdio_test.go
🧬 Code Graph Analysis (1)
server/stdio_test.go (4)
server/server.go (1)
  • NewMCPServer (299-325)
mcp/tools.go (2)
  • NewTool (658-680)
  • CallToolResult (40-51)
mcp/utils.go (1)
  • NewToolResultText (245-254)
server/stdio.go (3)
  • NewStdioServer (253-264)
  • WithWorkerPoolSize (68-78)
  • WithQueueSize (81-91)
🪛 GitHub Actions: go
server/stdio_test.go

[error] 451-451: Test failure: Expected default worker pool size 5, got 100


[error] 465-465: Test failure: Expected default queue size 100, got 10000

🔇 Additional comments (13)
server/stdio_test.go (5)

7-13: LGTM: Import additions are appropriate.

The new imports (fmt, sync, time) are correctly added to support the concurrent tool call testing functionality.


286-303: Well-designed concurrent tool setup.

The test creates multiple tools with proper closure capture and thread-safe counting using sync.Map. The simulation of work with time.Sleep effectively tests the concurrent processing capabilities.


389-400: Good use of labeled break for control flow.

The use of collectLoop label with break collectLoop is more idiomatic than the previous goto approach and improves code readability.


455-459: LGTM: Valid configuration values are properly tested.

The test correctly verifies that valid values within bounds are accepted and stored properly.


476-484: LGTM: Zero and negative value handling is properly tested.

The test correctly verifies that invalid values (zero and negative) revert to default values as expected.

server/stdio.go (8)

32-46: Well-designed concurrent architecture.

The addition of worker pool fields (toolCallQueue, workerWg, workerPoolSize, queueSize) and write protection (writeMu) properly addresses the race condition described in the PR objectives. The toolCallWork struct cleanly encapsulates queued work items.


67-78: LGTM: Worker pool size configuration with proper bounds.

The configuration function correctly enforces the maximum limit and uses the maximum value when exceeded, which aligns with user expectations for larger pool sizes.


80-91: LGTM: Queue size configuration with proper bounds.

The configuration function correctly enforces the maximum limit and uses the maximum value when exceeded, which is appropriate for queue sizing.


326-348: Excellent worker implementation with proper cleanup.

The worker function correctly:

  • Uses defer s.workerWg.Done() for proper WaitGroup cleanup
  • Handles channel closure with the ok check
  • Processes work items through the existing HandleMessage infrastructure
  • Respects context cancellation

This design maintains the existing MCP server functionality while adding concurrency safety.


384-421: Robust worker pool lifecycle management.

The Listen function properly:

  • Initializes the buffered queue with the configured size
  • Starts the worker pool before processing messages
  • Gracefully shuts down workers by closing the queue and waiting for completion

The sequential shutdown approach (close queue → wait for workers → return) ensures all queued work is completed before termination.


452-472: Smart queuing strategy with graceful degradation.

The implementation correctly:

  • Identifies tool calls that need concurrent processing
  • Uses non-blocking channel send with select and default case
  • Falls back to synchronous processing when the queue is full
  • Logs queue overflow situations for monitoring

This approach prevents blocking while maintaining system stability under high load.


557-567: Essential write protection for concurrent safety.

The addition of writeMu.Lock() protection correctly addresses the concurrent write issues described in the PR objectives. This ensures that multiple workers cannot corrupt the output stream when writing responses simultaneously.


261-262: LGTM: Reasonable default values for worker pool configuration.

The default values (5 workers, 100 queue size) provide a good balance between concurrency and resource usage for typical MCP server workloads.

Update test expectations to match the new behavior where WithWorkerPoolSize
and WithQueueSize use maximum values instead of defaults when the requested
size exceeds limits.

🤖 Generated with [opencode](https://opencode.ai)

Co-Authored-By: opencode <noreply@opencode.ai>
@ezynda3 ezynda3 merged commit 9393526 into main Aug 6, 2025
4 checks passed
@pottekkat pottekkat deleted the fix-stdio-race-condition branch August 7, 2025 18:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: bufio.Reader is not thread-safe
4 participants