[Feature] Light-weight job submitter #2537

@kevin85421

Search before asking

  • I searched the issues and found no similar feature request.

Description

Implement a lightweight job submitter that provides the same interface as ray job submit but has no Ray dependencies; instead, it calls the Ray dashboard's RESTful API directly. This has two benefits:

  • The K8s Job submitter no longer needs to pull the Ray image, which is typically over 1 GB even in its slimmest variant without ML libraries. This should shorten RayJob startup time.

  • We can implement our own retry logic for network issues between the K8s Job submitter and the Ray head, which avoids problems like #2154 ([Bug] RayJob falsely marked as "Running" when driver fails).

We attempted to upstream some changes to Ray but encountered pushback, so KubeRay should consider implementing the solution independently.
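
To make the idea concrete, here is a minimal sketch of a dependency-free submitter, using only the Python standard library. It is an illustration under assumptions, not a proposed implementation: the function names (post_json, submit_job), the retry count, and the backoff values are all hypothetical, and it assumes the dashboard accepts a POST to /api/jobs/ with an "entrypoint" field and returns a "submission_id" in the JSON response.

```python
import json
import time
import urllib.error
import urllib.request


def post_json(url, payload, timeout=10.0):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def submit_job(address, entrypoint, send=post_json, retries=5,
               backoff=1.0, sleep=time.sleep):
    """Submit a job over REST, retrying on network errors.

    Assumption: the endpoint path and the "submission_id" response
    field match the Ray dashboard's jobs API. `send` and `sleep` are
    injectable (hypothetical parameters) so the retry loop can be
    exercised without a live cluster.
    """
    url = address.rstrip("/") + "/api/jobs/"
    last_error = None
    for attempt in range(retries):
        try:
            body = send(url, {"entrypoint": entrypoint})
            return body["submission_id"]
        except urllib.error.HTTPError:
            # The server answered with an error status; this is not a
            # network issue, so do not retry.
            raise
        except (urllib.error.URLError, ConnectionError, TimeoutError) as exc:
            last_error = exc
            if attempt < retries - 1:
                # Exponential backoff: backoff, 2*backoff, 4*backoff, ...
                sleep(backoff * 2 ** attempt)
    raise RuntimeError(
        f"job submission failed after {retries} attempts"
    ) from last_error
```

A call would look like submit_job("http://<head-address>:8265", "python my_script.py"). One caveat with this sketch: retrying a POST can submit the job twice if the first request reached the head but the response was lost, so a real implementation would need to guard against that, for example by checking whether the job already exists before resubmitting.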

Use case

No response

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
