@ZacAttack ZacAttack commented Jul 28, 2025

Why are these changes needed?

Worker processes spawn more threads than they need, and gRPC has been identified as a major source of this thread bloat. As an incremental step toward a full fix, this patch adds the environment variable "RAY_worker_num_grpc_internal_threads", which the raylet honors when spawning workers. Setting it in your base image or in your code limits the number of threads gRPC spawns in each worker.

An example usage of this api would look like:

import os

import ray
from ray.util.queue import Queue as RayQueue
from ray.runtime_env import RuntimeEnv

# Restrict gRPC to 1 internal thread per worker (this is probably what you want).
# Be sure this is set before any call to ray.init or ray.remote.
os.environ['RAY_worker_num_grpc_internal_threads'] = '1'

class SimpleActor:
    def __init__(self):
        ....
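
The ordering note in the example follows from how environment variables propagate: a child process receives a copy of its parent's environment at the moment it is spawned, so a value set afterwards is not seen. The sketch below illustrates this with a plain subprocess rather than Ray. `read_in_child` is a hypothetical helper introduced only for this illustration, and the sketch assumes that whatever process launches the workers inherits the environment of the process where the variable was set.

```python
import os
import subprocess
import sys

# Variable name taken from the PR description above.
ENV_VAR = "RAY_worker_num_grpc_internal_threads"


def read_in_child(name: str) -> str:
    """Spawn a fresh Python process and return the value it sees for `name`.

    Hypothetical helper for illustration only; it is not part of Ray.
    """
    code = f"import os; print(os.environ.get({name!r}, '<unset>'))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Set the variable first, then spawn: the child inherits the value.
os.environ[ENV_VAR] = "1"
print(read_in_child(ENV_VAR))
```

If the variable is set only after the child has started, the child keeps the environment it was given at spawn time, which is why the assignment belongs before any call that may start worker processes.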

Related issue number

Closes #36936
Closes #54225

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@ZacAttack ZacAttack added the go add ONLY when ready to merge, run all tests label Jul 28, 2025

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @ZacAttack, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've implemented a core change to address thread bloat in Ray worker processes, specifically targeting gRPC's thread usage. This change allows users to explicitly control the number of gRPC threads by setting a new environment variable, RAY_num_grpc_threads, via Ray's RuntimeEnv. This provides an incremental step towards optimizing resource utilization and reducing unnecessary thread overhead in worker processes.

Highlights

  • gRPC Thread Control: I've introduced a new mechanism to control the number of gRPC threads spawned by Ray workers. This is achieved by applying a custom patch to the gRPC library during the build process.
  • RAY_num_grpc_threads Environment Variable: This pull request enables the use of the RAY_num_grpc_threads environment variable. When this variable is set in the RuntimeEnv for a Ray worker, gRPC will use the specified value as its thread pool size. This allows for fine-grained control over thread resource consumption.
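
One way to check whether a thread limit of this kind has any effect is to count the operating-system threads of a process before and after changing the setting. The following is a minimal sketch that assumes a Linux host with `/proc` mounted; `count_os_threads` is a hypothetical helper, not a Ray or gRPC API. It counts native threads, which is what matters here, since threads created inside gRPC's C++ core are not visible to `threading.active_count()`.

```python
import threading
from typing import Optional


def count_os_threads(pid: Optional[int] = None) -> Optional[int]:
    """Return the number of OS threads in a process, or None if unavailable.

    Reads the "Threads:" field of /proc/<pid>/status (Linux only).
    With pid=None, the current process is inspected.
    """
    target = "self" if pid is None else str(pid)
    try:
        with open(f"/proc/{target}/status") as status_file:
            for line in status_file:
                if line.startswith("Threads:"):
                    return int(line.split()[1])
    except OSError:
        # /proc is missing (non-Linux) or the process does not exist.
        return None
    return None


if __name__ == "__main__":
    print("threads in this process:", count_os_threads())
```

To inspect a worker, pass the worker's process ID (for example, the value `os.getpid()` returns inside the worker) as `pid`.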


Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

@ZacAttack ZacAttack requested a review from a team July 28, 2025 22:22
@ZacAttack ZacAttack added the core Issues that should be addressed in Ray Core label Jul 28, 2025
@ZacAttack ZacAttack requested a review from a team as a code owner August 4, 2025 21:13
@aslonnie aslonnie removed the request for review from a team August 4, 2025 22:44

aslonnie commented Aug 4, 2025

(does not seem to require ray-ci to review?)

Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com>
Signed-off-by: Zac Policzer <zacattackftw@gmail.com>

@edoakes edoakes left a comment

nit: let's name the patch something like grpc-configurable-thread-count


@edoakes edoakes left a comment

One nit, a merge conflict, then LGTM

Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>

edoakes commented Aug 6, 2025

@ZacAttack I fixed merge conflict but please fix DCO (sign off commits)

@edoakes edoakes enabled auto-merge (squash) August 6, 2025 22:30

@israbbani israbbani left a comment

🚢 nicely done!

@github-actions github-actions bot disabled auto-merge August 6, 2025 23:55
@jjyao jjyao merged commit a4910fb into ray-project:master Aug 7, 2025
5 checks passed
sampan-s-nayak pushed a commit that referenced this pull request Aug 12, 2025
…unt (#54988)

Signed-off-by: Zac Policzer <zacattackftw@gmail.com>
Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Signed-off-by: zac <zac@anyscale.com>
Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Signed-off-by: sampan <sampan@anyscale.com>
dioptre pushed a commit to sourcetable/ray that referenced this pull request Aug 20, 2025
…unt (ray-project#54988)

Signed-off-by: Zac Policzer <zacattackftw@gmail.com>
Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Signed-off-by: zac <zac@anyscale.com>
Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Signed-off-by: Andrew Grosser <dioptre@gmail.com>
Labels
core: Issues that should be addressed in Ray Core
go: add ONLY when ready to merge, run all tests
Development

Successfully merging this pull request may close these issues.

[Core] thread creation error, even with environment variables all set to 1
[Core] Too many threads in ray worker
6 participants