Add label selector scheduling logic to Cluster Resource Manager #51901
Merged: edoakes merged 49 commits into ray-project:master from ryanaoleary:schedule-using-labels on May 6, 2025
Signed-off-by: Ryan O'Leary <ryanaoleary@google.com> Fix types Signed-off-by: Ryan O'Leary <ryanaoleary@google.com> Fix proto Signed-off-by: Ryan O'Leary <ryanaoleary@google.com> Update proto naming to match autoscaler Signed-off-by: Ryan O'Leary <ryanaoleary@google.com> Fix errors Signed-off-by: Ryan O'Leary <ryanaoleary@google.com> Remove gcs proto Signed-off-by: Ryan O'Leary <ryanaoleary@google.com> Fix header file Signed-off-by: Ryan O'Leary <ryanaoleary@google.com> Change expression to constraint Signed-off-by: Ryan O'Leary <ryanaoleary@google.com> Format Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
87451d8 to 2e93936
Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Signed-off-by: Ryan O'Leary <ryanaoleary@google.com> Add label_selector to common task spec Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
… label_selector API Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Co-authored-by: Dhyey Shah <dhyey2019@gmail.com> Signed-off-by: Ryan O'Leary <113500783+ryanaoleary@users.noreply.github.com>
edoakes reviewed on May 6, 2025
edoakes approved these changes on May 6, 2025
edoakes pushed a commit that referenced this pull request on Aug 4, 2025
…pare_label_selector` (#52964) This PR is a follow-up to this comment: #51901 (comment). This PR changes the cluster resource scheduler to propagate a Ray status to `ComputeResources` in `TaskSpecification` when the LabelSelector data type is initialized. This allows a task built with a malformed label selector to return an error as a more useful Python exception rather than crashing Ray components in the C++. #51564 --------- Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
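The idea in the follow-up commit above (report a malformed label selector as a status that surfaces as a Python exception, instead of crashing a component) can be sketched roughly as follows. This is a minimal illustration, not Ray's code: `Status`, `validate_label_selector`, `build_task`, and the specific validation rules are all hypothetical and chosen only to show the pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Status:
    """Hypothetical stand-in for a status value returned by a lower layer."""
    ok: bool
    message: str = ""


def validate_label_selector(selector: dict) -> Status:
    """Return a failed Status for malformed input instead of aborting.

    The rules below are assumptions made for this sketch only.
    """
    for key, expr in selector.items():
        if not isinstance(key, str) or not key:
            return Status(False, f"label key must be a non-empty string, got {key!r}")
        # A selector that is empty, or only "!" and spaces, has no value to match.
        if not isinstance(expr, str) or not expr.strip("! "):
            return Status(False, f"empty or non-string selector for key {key!r}")
    return Status(True)


def build_task(selector: dict) -> dict:
    """Convert a failed Status into an ordinary Python exception for the caller."""
    status = validate_label_selector(selector)
    if not status.ok:
        raise ValueError(status.message)
    return {"label_selector": dict(selector)}
```

With this shape, a caller that passes a bad selector sees a `ValueError` carrying the message, and the process that performs validation keeps running.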
Labels
community-backlog
community-contribution: Contributed by the community
core: Issues that should be addressed in Ray Core
go: add ONLY when ready to merge, run all tests
Why are these changes needed?
This PR updates the cluster resource scheduling logic to check whether an eligible node satisfies the given label match expressions when checking if a node `IsSchedulable`. This PR also adds the `label_selector` option to Task/Actor creation, adds logic to parse strings to the `LabelSelector` data structure in the raylet, and passes the `label_selector` to the core worker to be used when building the `TaskSpec`.
These changes support the label selector API, ensuring that tasks/actors execute on nodes with the required node labels.
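To make the description concrete, here is a rough Python sketch of the two steps it names: parsing selector strings into a structured constraint, and checking whether a node's labels satisfy every constraint. It is not the raylet's C++ implementation. The string syntax (`value`, `!value`, `in(a, b)`, `!in(a, b)`), the treatment of a missing label as satisfying a negated constraint, and the names `Op`, `Constraint`, `parse_constraint`, and `node_satisfies` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Op(Enum):
    IN = "in"
    NOT_IN = "not_in"


@dataclass(frozen=True)
class Constraint:
    key: str
    op: Op
    values: frozenset


def parse_constraint(key: str, expr: str) -> Constraint:
    """Parse one selector string into a Constraint (assumed syntax)."""
    expr = expr.strip()
    negated = expr.startswith("!")
    if negated:
        expr = expr[1:].strip()
    if expr.startswith("in(") and expr.endswith(")"):
        # "in(a, b)" lists the accepted values, separated by commas.
        values = frozenset(v.strip() for v in expr[3:-1].split(",") if v.strip())
    else:
        # A bare value is treated as a one-element set.
        values = frozenset([expr])
    if not values or "" in values:
        raise ValueError(f"malformed selector for {key!r}")
    return Constraint(key, Op.NOT_IN if negated else Op.IN, values)


def node_satisfies(node_labels: dict, selector: dict) -> bool:
    """Return True only if the node's labels meet every constraint."""
    for key, expr in selector.items():
        constraint = parse_constraint(key, expr)
        # A missing label yields None, which is never in the value set.
        present = node_labels.get(key) in constraint.values
        if constraint.op is Op.IN and not present:
            return False
        if constraint.op is Op.NOT_IN and present:
            return False
    return True
```

A scheduler following this pattern would run such a check alongside its resource-availability check and skip any node for which it returns `False`.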
Related issue number
#51564
Checks
… `git commit -s`) in this PR.
… `scripts/format.sh` to lint the changes in this PR.
… method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.