[Feature] Event record for failed Pod creation #2250

@Eikykun

Search before asking

  • I searched the issues and found no similar feature request.

Description

Recently, a RayCluster user configured an invalid label in the pod template. Pod creation failed, and the only place the failure was visible was the RayOperator logs. Either of the following changes would help us troubleshoot or avoid such issues more quickly:

  • Record relevant failure information using EventRecorder when pod creation fails.
  • Add validation logic for the pod template in the validating webhook.
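
To make the two proposals concrete, here is a minimal, standard-library-only sketch rather than actual KubeRay code. All names in it (`validateLabels`, `eventSink`, `memorySink`, `createPod`, `reconcilePod`, and the event reasons) are hypothetical. The `eventSink` interface is a simplified stand-in for client-go's `record.EventRecorder`, whose `Eventf` additionally takes the object the event is attached to as its first argument. The validation covers label values only, following the documented Kubernetes rule (at most 63 characters, empty or alphanumeric at both ends, with `-`, `_`, and `.` allowed in between).

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// labelValueRE follows the documented Kubernetes rule for label values:
// empty, or alphanumeric at both ends with '-', '_', '.' allowed in between.
var labelValueRE = regexp.MustCompile(`^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$`)

const maxLabelValueLen = 63

// validateLabels is a hypothetical webhook-side check. It inspects label
// values only; label keys have their own rules and are not checked here.
func validateLabels(labels map[string]string) error {
	for k, v := range labels {
		if len(v) > maxLabelValueLen || !labelValueRE.MatchString(v) {
			return fmt.Errorf("invalid value %q for label %q", v, k)
		}
	}
	return nil
}

// eventSink is a simplified, hypothetical stand-in for an event recorder.
type eventSink interface {
	Eventf(eventType, reason, messageFmt string, args ...interface{})
}

// memorySink keeps events in memory so the sketch runs without a cluster.
type memorySink struct{ events []string }

func (m *memorySink) Eventf(eventType, reason, messageFmt string, args ...interface{}) {
	m.events = append(m.events, eventType+" "+reason+" "+fmt.Sprintf(messageFmt, args...))
}

// createPod stands in for the API call. It rejects invalid labels,
// roughly as the API server would.
func createPod(name string, labels map[string]string) error {
	if err := validateLabels(labels); err != nil {
		return errors.New("pod rejected: " + err.Error())
	}
	return nil
}

// reconcilePod attempts the creation and records a Warning event on failure,
// so the error is visible on the resource and not only in operator logs.
func reconcilePod(sink eventSink, name string, labels map[string]string) error {
	if err := createPod(name, labels); err != nil {
		sink.Eventf("Warning", "FailedToCreatePod", "failed to create pod %s: %v", name, err)
		return err
	}
	sink.Eventf("Normal", "CreatedPod", "created pod %s", name)
	return nil
}

func main() {
	sink := &memorySink{}
	_ = reconcilePod(sink, "head", map[string]string{"team": "ml"})
	_ = reconcilePod(sink, "worker", map[string]string{"team": "ml platform!"})
	for _, e := range sink.events {
		fmt.Println(e)
	}
}
```

The two pieces are complementary: the webhook check rejects a bad template before it is stored, while the event covers failures that validation cannot anticipate, such as quota or admission errors at pod creation time.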

Use case

RayCluster troubleshooting

Related issues

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
