Skip to content

Conversation

larrylian
Copy link
Contributor

@larrylian larrylian commented Jul 12, 2023

Why are these changes needed?

Implement node label scheduling strategy

Related issue number

#34894
(P2)Implement the node affinity with labels interface in Python and transparently transmit it to the CoreWorker.
(P2)Implement the node affinity with labels scheduling policy.

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@larrylian larrylian force-pushed the labels_scheduling_3 branch from 819e95a to e8fa814 Compare July 13, 2023 02:24
@larrylian larrylian changed the title [Core][Label scheduling 2/n] Add node label scheduling strategy [Core][Label scheduling 3/n] Implement node label scheduling strategy Jul 18, 2023
@larrylian larrylian force-pushed the labels_scheduling_3 branch from e8fa814 to 4088418 Compare July 18, 2023 10:11
@larrylian larrylian requested a review from jjyao July 18, 2023 10:13
const auto &soft_expressions = node_label_scheduling_strategy.soft();
if (soft_expressions.expressions().size() > 0) {
// soft match
auto match_count_map = ScoreNodesBySoftExpressions(candidate_nodes, soft_expressions);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to score soft match: you either match the entire expressions or not. There is no partial match. We only need to score after we support OR with weights.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's okay. We can add it if necessary in the future.

@larrylian larrylian force-pushed the labels_scheduling_3 branch from 4088418 to 0ddeee6 Compare July 21, 2023 09:57
@larrylian larrylian requested a review from jjyao July 21, 2023 10:02
Copy link
Collaborator

@jjyao jjyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will continue.

@@ -132,6 +136,9 @@ struct hash<ray::rpc::SchedulingStrategy> {
hash ^=
static_cast<size_t>(scheduling_strategy.placement_group_scheduling_strategy()
.placement_group_capture_child_tasks());
} else if (scheduling_strategy.has_node_label_scheduling_strategy()) {
hash ^= std::hash<std::string>()(
scheduling_strategy.node_label_scheduling_strategy().DebugString());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SerializeAsString as well?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hash function should guarantee that if two label scheduling strategies are the same, they must have the same hash value. Can we actually examine each field of NodeLabelSchedulingStrategy like what we do for other scheduling strategies?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's at least change to SerializeAsString for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because SerializeAsString is unsafe for hash. I think just use DebugString for now.

@larrylian larrylian force-pushed the labels_scheduling_3 branch from 0ddeee6 to 25ce51f Compare July 24, 2023 03:19
@larrylian larrylian requested a review from jjyao July 24, 2023 03:20
@@ -132,6 +136,9 @@ struct hash<ray::rpc::SchedulingStrategy> {
hash ^=
static_cast<size_t>(scheduling_strategy.placement_group_scheduling_strategy()
.placement_group_capture_child_tasks());
} else if (scheduling_strategy.has_node_label_scheduling_strategy()) {
hash ^= std::hash<std::string>()(
scheduling_strategy.node_label_scheduling_strategy().DebugString());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hash function should guarantee that if two label scheduling strategies are the same, they must have the same hash value. Can we actually examine each field of NodeLabelSchedulingStrategy like what we do for other scheduling strategies?

@larrylian larrylian force-pushed the labels_scheduling_3 branch from 25ce51f to c8c71fa Compare July 24, 2023 07:19
@larrylian
Copy link
Contributor Author

The hash function should guarantee that if two label scheduling strategies are the same, they must have the same hash value. Can we actually examine each field of NodeLabelSchedulingStrategy like what we do for other scheduling strategies?

That's fine. I'd prefer to create a new PR for this modification. The struct has quite a few member variables, so the code will be substantial. I'll also write some test cases.

@@ -132,6 +136,9 @@ struct hash<ray::rpc::SchedulingStrategy> {
hash ^=
static_cast<size_t>(scheduling_strategy.placement_group_scheduling_strategy()
.placement_group_capture_child_tasks());
} else if (scheduling_strategy.has_node_label_scheduling_strategy()) {
hash ^= std::hash<std::string>()(
scheduling_strategy.node_label_scheduling_strategy().DebugString());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's at least change to SerializeAsString for now.

Signed-off-by: 稚鱼 <554538252@qq.com>
Signed-off-by: LarryLian <554538252@qq.com>
@larrylian larrylian force-pushed the labels_scheduling_3 branch from c8c71fa to e367a4b Compare July 25, 2023 09:50
@jjyao jjyao merged commit 9680b82 into ray-project:master Jul 25, 2023
NripeshN pushed a commit to NripeshN/ray that referenced this pull request Aug 15, 2023
…ray-project#37339)

Signed-off-by: LarryLian <554538252@qq.com>
Signed-off-by: NripeshN <nn2012@hw.ac.uk>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
…ray-project#37339)

Signed-off-by: LarryLian <554538252@qq.com>
Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
…ray-project#37339)

Signed-off-by: LarryLian <554538252@qq.com>
Signed-off-by: Victor <vctr.y.m@example.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants