Skip to content

Conversation

rueian
Copy link
Contributor

@rueian rueian commented Aug 3, 2024

@MortalHappiness found the previous implementation of RayClusterReplicaFailureReason failed to capture condition reasons correctly. It made a wrong positional assumption on the error tree passed in so it failed in integration.

This PR removes the wrong assumption by using the errors.As function which traverses the whole tree and finds the first target by the specified pointer type (i.e. the *errRayClusterReplicaFailure). The RayClusterReplicaFailureReason can now return the correct condition reason no matter how the error is wrapped.

FailedCreateHeadPod Example:
image

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests

…the correct reason

Signed-off-by: Rueian <rueiancsie@gmail.com>
@rueian rueian marked this pull request as ready for review August 3, 2024 02:37
@@ -1707,7 +1708,7 @@ func TestCalculateStatus(t *testing.T) {
assert.Equal(t, newInstance.Status.LastUpdateTime, newInstance.Status.StateTransitionTimes[rayv1.Ready])

// Test reconcilePodsErr with the feature gate disabled
newInstance, err = r.calculateStatus(ctx, testRayCluster, utils.ErrFailedCreateHeadPod)
newInstance, err = r.calculateStatus(ctx, testRayCluster, errors.Join(utils.ErrFailedCreateHeadPod, errors.New("invalid")))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Align how we wrap the error in the reconcilePods.

@@ -1740,7 +1741,7 @@ func TestCalculateStatus(t *testing.T) {
assert.True(t, meta.IsStatusConditionPresentAndEqual(newInstance.Status.Conditions, string(rayv1.HeadPodReady), metav1.ConditionFalse))

// Test reconcilePodsErr with the feature gate enabled
newInstance, err = r.calculateStatus(ctx, testRayCluster, utils.ErrFailedCreateHeadPod)
newInstance, err = r.calculateStatus(ctx, testRayCluster, errors.Join(utils.ErrFailedCreateHeadPod, errors.New("invalid")))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Align how we wrap the error in the reconcilePods.

@kevin85421 kevin85421 merged commit 6c168a0 into ray-project:master Aug 3, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants