Skip to content

Add RayClusterReady Condition Type #2271

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 31, 2024

Conversation

Yicheng-Lu-llll
Copy link
Contributor

@Yicheng-Lu-llll Yicheng-Lu-llll commented Jul 26, 2024

Why are these changes needed?

This PR adds RayClusterReady condition type. See here for the reason: https://docs.google.com/document/d/1bRL0cZa87eCX6SI7gqthN68CgmHaB6l3-vJuIse-BrY/edit.

RayClusterReady indicates whether all Ray Pods are ready when the RayCluster is first created.

  Conditions:
    Last Transition Time:   2024-07-30T05:58:56Z
    Message:                
    Reason:                 HeadPodRunningAndReady
    Status:                 True
    Type:                   HeadPodReady
    Last Transition Time:   2024-07-30T05:58:56Z
    Message:                All Ray Pods are ready. Future checks focus on the head
    Reason:                 AllPodRunningAndReady
    Status:                 True
    Type:                   RayClusterReady

After RayClusterReady is set to true for the first time, it only indicates whether the RayCluster's Head Pod is ready for

    Last Transition Time:   2024-07-30T05:58:56Z
    Message:                
    Reason:                 HeadPodRunningAndReady
    Status:                 True
    Type:                   HeadPodReady
    Last Transition Time:   2024-07-30T05:58:56Z
    Message:                Only check head after all Ray Pods are initially ready
    Reason:                 HeadPodRunningAndReady
    Status:                 True
    Type:                   RayClusterReady

Related issue number

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@kevin85421 kevin85421 self-assigned this Jul 27, 2024
@Yicheng-Lu-llll Yicheng-Lu-llll changed the title WIP: Redefine Rayv1 Ready WIP: Add RayClusterReady Condition Type Jul 28, 2024
@@ -168,14 +168,19 @@ type RayClusterConditionType string

// Custom Reason for RayClusterCondition
const (
AllPodRunningAndReady = "AllPodRunningAndReady"
HeadPodNotFound = "HeadPodNotFound"
// PodRunningAndReady says that the pod is running and ready.
PodRunningAndReady = "PodRunningAndReady"
Copy link
Contributor Author

@Yicheng-Lu-llll Yicheng-Lu-llll Jul 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually prefer changing PodRunningAndReady to HeadPodRunningAndReady to make it more explicit. This also makes it easier for me to copy the Reason field from HeadPodReady to RayClusterReady.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually prefer changing PodRunningAndReady to HeadPodRunningAndReady to make it more explicit.

makes sense to me

// PodRunningAndReady says that the pod is running and ready.
PodRunningAndReady = "PodRunningAndReady"
// UnknownReason says that the reason for the condition is unknown.
UnknownReason = "Unknown"
)

const (
// HeadPodReady is added in a RayCluster when its Head Pod is ready for requests.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this comment a little bit because HeadPodReady will always be there no matter whether the Head Pod is ready or not; it is just its status that will be true or false.

@Yicheng-Lu-llll Yicheng-Lu-llll changed the title WIP: Add RayClusterReady Condition Type Add RayClusterReady Condition Type Jul 28, 2024
@Yicheng-Lu-llll Yicheng-Lu-llll marked this pull request as ready for review July 28, 2024 05:56
@@ -168,14 +168,19 @@ type RayClusterConditionType string

// Custom Reason for RayClusterCondition
const (
AllPodRunningAndReady = "AllPodRunningAndReady"
HeadPodNotFound = "HeadPodNotFound"
// PodRunningAndReady says that the pod is running and ready.
PodRunningAndReady = "PodRunningAndReady"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually prefer changing PodRunningAndReady to HeadPodRunningAndReady to make it more explicit.

makes sense to me

// PodRunningAndReady says that the pod is running and ready.
PodRunningAndReady = "PodRunningAndReady"
// UnknownReason says that the reason for the condition is unknown.
UnknownReason = "Unknown"
)

const (
// HeadPodReady is added in a RayCluster when its Head Pod is ready for requests.
// RayClusterReady indicates whether all Ray Pods are ready when the RayCluster is first created.
// After RayClusterReady is set to true for the first time, it only indicates whether the RayCluster's Head Pod is ready for requests.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// After RayClusterReady is set to true for the first time, it only indicates whether the RayCluster's Head Pod is ready for requests.
// After RayClusterReady is set to true for the first time, it only indicates whether the RayCluster's head Pod is ready for requests.

// RayClusterReady indicates whether all Ray Pods are ready when the RayCluster is first created.
// After RayClusterReady is set to true for the first time, it only indicates whether the RayCluster's Head Pod is ready for requests.
RayClusterReady RayClusterConditionType = "RayClusterReady"
// HeadPodReady indicates weather RayCluster's Head Pod is ready for requests.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// HeadPodReady indicates weather RayCluster's Head Pod is ready for requests.
// HeadPodReady indicates weather RayCluster's head Pod is ready for requests.

Type: string(rayv1.RayClusterReady),
Status: headPodReadyCondition.Status,
Reason: headPodReadyCondition.Reason,
Message: headPodReadyCondition.Message,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update Message

meta.SetStatusCondition(&newInstance.Status.Conditions, metav1.Condition{
Type: string(rayv1.RayClusterReady),
Status: metav1.ConditionTrue,
Reason: rayv1.AllPodRunningAndReady,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add Message

@@ -1717,8 +1717,6 @@ func TestCalculateStatus(t *testing.T) {
// Test CheckRayHeadRunningAndReady with head pod running and ready
newInstance, _ = r.calculateStatus(ctx, testRayCluster, nil)
assert.True(t, meta.IsStatusConditionPresentAndEqual(newInstance.Status.Conditions, string(rayv1.HeadPodReady), metav1.ConditionTrue))
condition := meta.FindStatusCondition(newInstance.Status.Conditions, string(rayv1.HeadPodReady))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is the same as assert.True(t, meta.IsStatusConditionPresentAndEqual(newInstance.Status.Conditions, string(rayv1.HeadPodReady), metav1.ConditionTrue)), so I deleted it.


// Test reconcilePodsErr with the feature gate enabled
newInstance, err = r.calculateStatus(ctx, testRayCluster, utils.ErrFailedCreateHeadPod)
assert.Nil(t, err)
assert.True(t, meta.IsStatusConditionPresentAndEqual(newInstance.Status.Conditions, string(rayv1.RayClusterReplicaFailure), metav1.ConditionTrue))
}

func TestRayClusterReadyCondition(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After several tries, I decided to make it a separate function with TestCalculateStatus. Otherwise, it is too difficult to write with so many things coupled together. I think the test is fine being a little long, and these tests will be rewritten to use a real client in the future anyway.

// RayClusterReady indicates whether all Ray Pods are ready when the RayCluster is first created.
// After RayClusterReady is set to true for the first time, it only indicates whether the RayCluster's head Pod is ready for requests.
RayClusterReady RayClusterConditionType = "RayClusterReady"
// HeadPodReady indicates weather RayCluster's head Pod is ready for requests.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// HeadPodReady indicates weather RayCluster's head Pod is ready for requests.
// HeadPodReady indicates whether RayCluster's head Pod is ready for requests.

_ = fakeClient.Status().Update(ctx, headPod)
_ = fakeClient.Status().Update(ctx, workerPod)
testRayCluster, _ = r.calculateStatus(ctx, testRayCluster, nil)
assert.Nil(t, meta.FindStatusCondition(testRayCluster.Status.Conditions, string(rayv1.RayClusterReady)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also check reasons or messages?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. I did not check line 1814 because RayClusterReady should not be added at that time.

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
@kevin85421 kevin85421 merged commit cbaf5d7 into ray-project:master Jul 31, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants