Open
Labels
P1 (Issue that should be fixed within a few weeks), community-backlog, core (Issues that should be addressed in Ray Core), enhancement (Request for new feature and/or capability), k8s-proj (K8s and Ray OSS)
Description
Implement support for label selector API:
- (P0) Ray saves label info associated with a node in `GcsNodeInfo` - already implemented
- (P0) Update `--labels` argument to take either a list of strings or read from file and expose this API publicly
- (P0) Add `label_selector` API to `@ray.remote` decorator to schedule tasks/actors
- (P0) Update `ClusterResourceScheduler::GetBestSchedulableNode` to enforce `label_selector` conditions when returning list of candidate nodes. This will eventually replace `SchedulingOptions::NodeLabelScheduling(scheduling_strategy)`.
- (P1) Add node labels to runtime context for tasks/actors
- (P1) Add `bundle_label_selector` to the `ray.util.placement_group` constructor to apply a set of `label_selector`s to placement group bundles
- (P0) Populate list of default labels automatically from K8s; currently only `ray.io/node-id` is supported. See "[Core] Add default Ray Node labels at Node init" #53360
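The selector enforcement described in the list above can be illustrated with a small pure-Python model. This is only a sketch of the matching idea, not Ray's scheduler code: the `Node` class, `matches`, and `candidate_nodes` are hypothetical names, the `region` label is invented for the example, and the selector is assumed to support exact key/value equality only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for the label info kept per node."""
    node_id: str
    labels: Dict[str, str] = field(default_factory=dict)


def matches(labels: Dict[str, str], selector: Dict[str, str]) -> bool:
    """Return True if every selector key is present with the requested value."""
    return all(labels.get(key) == value for key, value in selector.items())


def candidate_nodes(nodes: List[Node], selector: Dict[str, str]) -> List[Node]:
    """Keep only the nodes whose labels satisfy the selector."""
    return [node for node in nodes if matches(node.labels, selector)]


# Assumed example cluster; "region" is an illustrative label key.
nodes = [
    Node("n1", {"ray.io/node-id": "n1", "region": "us-west"}),
    Node("n2", {"ray.io/node-id": "n2", "region": "us-east"}),
]

selected = candidate_nodes(nodes, {"region": "us-west"})
print([node.node_id for node in selected])
```

An empty selector matches every node in this model, which mirrors the case of a task that sets no label constraints.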
Autoscaler adaptation:
- (P1) Update Autoscaler data model to pass label information by adding a labels field to the ResourceRequest message
- (P1) Adapt Ray V2 Autoscaler to parse labels from K8s Pod Spec and generate a `--labels` arg to `rayStartParams`
  - Not needed anymore - done by generating default labels from K8s Pod spec in KubeRay: Add default Ray node label info to Ray Pod environment kuberay#3699
- (P1) Update Autoscaler bin packing logic to directly consider label matching
- (P1) Update the Autoscaler code path to handle the label information passed back from GCS
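As a rough illustration of what label-aware bin packing could look like, the sketch below attaches a `labels` field to a request and only accepts node types that satisfy both the resource amounts and the labels. The `ResourceRequest` and `NodeType` dataclasses here are hypothetical Python stand-ins, not the Autoscaler's actual message or data model, and the first-fit policy is an assumption made for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ResourceRequest:
    """Hypothetical request: resource amounts plus required labels."""
    resources: Dict[str, float]
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass
class NodeType:
    """Hypothetical node type the autoscaler could launch."""
    name: str
    resources: Dict[str, float]
    labels: Dict[str, str] = field(default_factory=dict)


def fits(request: ResourceRequest, node_type: NodeType) -> bool:
    """A node type fits only if it has enough resources and matching labels."""
    has_resources = all(
        node_type.resources.get(name, 0.0) >= amount
        for name, amount in request.resources.items()
    )
    has_labels = all(
        node_type.labels.get(key) == value
        for key, value in request.labels.items()
    )
    return has_resources and has_labels


def pick_node_type(request: ResourceRequest,
                   node_types: List[NodeType]) -> Optional[str]:
    """First-fit choice; returns None when no node type is feasible."""
    for node_type in node_types:
        if fits(request, node_type):
            return node_type.name
    return None


# Assumed example node types; the "accelerator" label is illustrative.
node_types = [
    NodeType("cpu-worker", {"CPU": 8}),
    NodeType("gpu-worker", {"CPU": 8, "GPU": 1}, {"accelerator": "a100"}),
]
```

For example, `pick_node_type(ResourceRequest({"CPU": 1}, {"accelerator": "a100"}), node_types)` skips `cpu-worker` even though it has enough CPU, because the label does not match.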
Documentation/Library changes
- (P1) Update documentation/examples to use updated `label_selector` API for non-fallback use cases
- (P1) Support passing labels from head and worker group specs in RayCluster CR in KubeRay to Ray nodes
- (P2) Add labels argument to `request_resource()` SDK function used by Ray libraries
- (P2) Determine whitelist of K8s labels to always pass to Ray nodes
- (P2) Add `required_labels` to TaskState schema to expose labels in state API
Milestone 2:
- (P0) Update the `label_selector` API in `@ray.remote` decorator to support label fallback syntax
- (P0) Update the `bundle_label_selector` in the `ray.util.placement_group` constructor to support label fallback syntax
- (P0) Implement label `fallback_strategy` API to match available/feasible nodes by the provided conditions if `label_selector` returns 0 matches
- (P0) Update Autoscaler bin packing logic to support label fallback syntax
- (P1) Update documentation/examples to use updated `label_selector` API for label fallback use cases
- (P2) Update library usage of `NodeLabelSchedulingStrategy`, `_soft_target_node_id`, and other related features with the `label_selector` API
- (P2) Add deprecation warnings for the `NodeLabelSchedulingStrategy` and other features that will be replaced by label based scheduling
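The fallback behaviour in Milestone 2 (try `label_selector` first, then fall back only if it returns 0 matches) can be sketched as an ordered search. The function below is a hypothetical model, not the Ray API: it assumes `fallback_strategy` is simply an ordered list of alternative selectors, that selectors use exact equality, and the `market` label is invented for the example.

```python
from typing import Dict, List

Labels = Dict[str, str]


def matches(labels: Labels, selector: Labels) -> bool:
    """Return True if every selector key is present with the requested value."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_nodes(nodes: Dict[str, Labels],
                 label_selector: Labels,
                 fallback_strategy: List[Labels]) -> List[str]:
    """Try the primary selector, then each fallback in order.

    Returns the node IDs matched by the first selector that matches
    at least one node, or an empty list if none of them match.
    """
    for selector in [label_selector, *fallback_strategy]:
        matched = [
            node_id for node_id, labels in nodes.items()
            if matches(labels, selector)
        ]
        if matched:
            return matched
    return []


# Assumed example cluster keyed by node ID.
nodes = {
    "n1": {"market": "on-demand"},
    "n2": {"market": "spot"},
}

# No node is "reserved", so the first fallback ("spot") is used.
print(select_nodes(nodes, {"market": "reserved"},
                   [{"market": "spot"}, {"market": "on-demand"}]))
```

Evaluating fallbacks strictly in order keeps the preference explicit: a later fallback is considered only when every earlier selector matched nothing.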
Use case
This issue tracks progress on implementing the label selector API feature enhancement. This enhancement supersedes the previous node affinity feature enhancement REP and builds on much of the implementation there.
Node affinity feature enhancement work tracker: #34894
Related REP: ray-project/enhancements#60
Status
In Progress