[DOCS] Apiserver improve docs readability #3564

# Creating Autoscaling clusters using APIServer

One of Ray's key features is autoscaling. This [document] explains how to set up autoscaling with the Ray operator. Here, we demonstrate how to configure it using the APIServer and run an example.

## Setup

Refer to the [README](README.md) for setting up the KubeRay operator and APIServer.

```shell
make operator-image cluster load-operator-image deploy-operator
```
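
As the reviewer notes preserved below point out, the APIServer is intended to run inside the Kubernetes cluster, and the README's "Install with Helm" section is the recommended path; the local build above is meant for development. A minimal sketch of the Helm route, assuming the standard KubeRay Helm repo and chart names (verify against the README):

```sh
# Assumed repo URL and chart names; see the README's "Install with Helm" section
helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm install kuberay-operator kuberay/kuberay-operator
helm install kuberay-apiserver kuberay/kuberay-apiserver
```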

## Example

This example walks through how to trigger scale-up and scale-down for a RayCluster.

Before proceeding with the example, remove any running RayClusters to ensure a successful execution of the steps below:

```sh
kubectl delete raycluster --all
```
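
With everything removed, listing RayClusters should come back empty:

```sh
kubectl get rayclusters
# No resources found in default namespace.
```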

> [!IMPORTANT]
> All the following guidance requires you to switch your working directory to the KubeRay `apiserver` directory.

### Install ConfigMap

Install this [ConfigMap], which contains the code for our example. Simply download the file and run:

```sh
kubectl apply -f test/cluster/cluster/detachedactor.yaml
```

Check that the ConfigMap was created successfully. You should see `ray-example` in the list:

```sh
kubectl get configmaps
# NAME          DATA   AGE
# ray-example   2      8s
```

### Deploy RayCluster

Before running the example, deploy a RayCluster with the following command:

```sh
# Create compute template
curl -X POST 'localhost:31888/apis/v1/namespaces/default/compute_templates' \
  --header 'Content-Type: application/json' \
  --data @docs/api-example/compute_template.json

# Create RayCluster
curl -X POST 'localhost:31888/apis/v1/namespaces/default/clusters' \
  --header 'Content-Type: application/json' \
  --data @docs/api-example/autoscaling_clusters.json
```

> **Reviewer note:** Are you assuming users will install the KubeRay API server outside the Kubernetes cluster? The documentation should instruct users to install the KubeRay API server inside the cluster. Then, you can ask users to either set up port forwarding or create a curl pod to submit requests.

This command performs two main operations:

1. Creates a compute template `default-template` that specifies the resources to use during scale-up (2 CPUs and 4 GiB of memory); see the payload sketch after this list.

2. Deploys a RayCluster (`test-cluster`) with:
    - A head pod that manages the cluster
    - A worker group configured to scale between 0 and 5 replicas
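
For reference, based on the payload that was previously inlined in this guide, `docs/api-example/compute_template.json` defines the template along these lines (check the file in the repository for the authoritative version):

```json
{
  "name": "default-template",
  "namespace": "default",
  "cpu": 2,
  "memory": 4
}
```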

The worker group uses the following `autoscalerOptions` to control scaling behavior:

- **`upscalingMode: "Default"`**: Default scaling behavior; Ray scales up only as needed.
- **`idleTimeoutSeconds: 30`**: If a worker pod remains idle (i.e., not running any tasks) for 30 seconds, it is automatically removed.
- **`cpu: "500m"`, `memory: "512Mi"`**: Defines the **minimum resource unit** Ray uses to assess scaling needs. If no worker pod has at least this much free capacity, Ray triggers a scale-up and launches a new worker pod.

> **Note:** These values **do not determine the actual size** of the worker pod. The pod size comes from the `computeTemplate` (in this case, 2 CPUs and 4 GiB of memory).
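
Likewise, the full cluster payload that used to be inlined here, now kept in `docs/api-example/autoscaling_clusters.json`, is reproduced below for reference:

```json
{
  "name": "test-cluster",
  "namespace": "default",
  "user": "boris",
  "clusterSpec": {
    "enableInTreeAutoscaling": true,
    "autoscalerOptions": {
      "upscalingMode": "Default",
      "idleTimeoutSeconds": 30,
      "cpu": "500m",
      "memory": "512Mi"
    },
    "headGroupSpec": {
      "computeTemplate": "default-template",
      "image": "rayproject/ray:2.9.0-py310",
      "serviceType": "NodePort",
      "rayStartParams": {
        "dashboard-host": "0.0.0.0",
        "metrics-export-port": "8080",
        "num-cpus": "0"
      },
      "volumes": [
        {
          "name": "code-sample",
          "mountPath": "/home/ray/samples",
          "volumeType": "CONFIGMAP",
          "source": "ray-example",
          "items": {
            "detached_actor.py": "detached_actor.py",
            "terminate_detached_actor.py": "terminate_detached_actor.py"
          }
        }
      ]
    },
    "workerGroupSpec": [
      {
        "groupName": "small-wg",
        "computeTemplate": "default-template",
        "image": "rayproject/ray:2.9.0-py310",
        "replicas": 0,
        "minReplicas": 0,
        "maxReplicas": 5,
        "rayStartParams": {
          "node-ip-address": "$MY_POD_IP"
        },
        "volumes": [
          {
            "name": "code-sample",
            "mountPath": "/home/ray/samples",
            "volumeType": "CONFIGMAP",
            "source": "ray-example",
            "items": {
              "detached_actor.py": "detached_actor.py",
              "terminate_detached_actor.py": "terminate_detached_actor.py"
            }
          }
        ]
      }
    ]
  }
}
```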

### Validate that the RayCluster is deployed correctly

Run the following command to list the running pods. You should see something like the output below:

```sh
kubectl get pods
# NAME                                READY   STATUS    RESTARTS   AGE
# kuberay-operator-545586d46c-f9grr   1/1     Running   0          49m
# test-cluster-head                   2/2     Running   0          3m1s
```

Note that there is no worker pod for `test-cluster`, because we set its initial replica count to 0. You will only see the head pod, with 2 containers, for `test-cluster`.
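
If you prefer to block until the head pod is ready instead of polling, a `kubectl wait` along these lines should work (the `ray.io/cluster` label is assumed; confirm with `kubectl get pods --show-labels`):

```sh
kubectl wait pod -l ray.io/cluster=test-cluster --for=condition=Ready --timeout=300s
```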

### Trigger RayCluster scale-up

Create a detached actor to trigger scale-up with the following command:

```sh
curl -X POST 'localhost:31888/apis/v1/namespaces/default/jobs' \
  ...
}'
```

The `detached_actor.py` script is defined in the [ConfigMap] we installed earlier and is mounted onto the head node; the actor it creates requires `num_cpus=1`. Because the head node is started with `num-cpus: 0` and no worker pod exists initially, the RayCluster needs to scale up a worker to run this actor.
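
For intuition, a detached-actor script along these lines reproduces the behavior described above (a hypothetical sketch, not necessarily the exact contents of the ConfigMap's `detached_actor.py`):

```python
import ray

ray.init()  # connects to the RayCluster when run as a submitted job

# The actor reserves 1 CPU. Since the head advertises num-cpus=0 and no
# worker exists yet, the autoscaler must launch a worker pod to place it.
@ray.remote(num_cpus=1)
class Actor:
    pass

# "detached" means the actor outlives the job that created it, so the
# CPU stays reserved after the submission completes.
actor = Actor.options(name="actor", lifetime="detached").remote()
```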

Check whether a worker was created. You should see a worker `test-cluster-small-wg-worker-*` spin up:

```sh
kubectl get pods
# NAME                                 READY   STATUS      RESTARTS   AGE
# create-actor-tsvfc                   0/1     Completed   0          99s
# kuberay-operator-545586d46c-f9grr    1/1     Running     0          55m
# test-cluster-head                    2/2     Running     0          9m37s
# test-cluster-small-wg-worker-j54xf   1/1     Running     0          88s
```
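
You can also ask the autoscaler directly from inside the head pod; `ray status` prints current and pending nodes (you may need `-c` to select the Ray container, since the head pod runs two):

```sh
kubectl exec -it test-cluster-head -- ray status
```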

### Trigger RayCluster scale-down

Run the following command to delete the actor we created earlier:

```sh
curl -X POST 'localhost:31888/apis/v1/namespaces/default/jobs' \
  ...
}'
```

Once the actor is deleted, the worker is no longer needed. The worker pod will be deleted after `idleTimeoutSeconds` (default 60, though we specified 30) seconds.
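
As a counterpart to the scale-up sketch, a termination script along these lines releases the reserved CPU (again a hypothetical sketch of what the ConfigMap's `terminate_detached_actor.py` might do):

```python
import ray

ray.init()

# Look up the detached actor by the name it was registered under and kill
# it; its CPU is freed, the worker goes idle, and the autoscaler removes
# the worker pod after idleTimeoutSeconds.
actor = ray.get_actor("actor")  # "actor" is the assumed registered name
ray.kill(actor)
```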

List all pods to verify that the worker pod is deleted:

```sh
kubectl get pods
# NAME                                READY   STATUS      RESTARTS   AGE
# create-actor-tsvfc                  0/1     Completed   0          6m37s
# delete-actor-89z8c                  0/1     Completed   0          83s
# kuberay-operator-545586d46c-f9grr   1/1     Running     0          60m
# test-cluster-head                   2/2     Running     0          14m
```
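
To watch the scale-down as it happens rather than polling, you can stream pod updates until the worker pod terminates:

```sh
kubectl get pods -w
```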

### Clean up

Delete the example resources and remove the APIServer when you are done:

```sh
make clean-cluster
# Remove apiserver from helm
helm uninstall kuberay-apiserver
```
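
If you would rather remove only the example resources through the APIServer (instead of tearing down the whole cluster), it also exposes delete endpoints; a sketch, assuming the names used above:

```sh
# Delete the RayCluster first, then the compute template it references
curl -X DELETE 'localhost:31888/apis/v1/namespaces/default/clusters/test-cluster'
curl -X DELETE 'localhost:31888/apis/v1/namespaces/default/compute_templates/default-template'
```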

[document]: https://docs.ray.io/en/latest/cluster/kubernetes/user-guides/configuring-autoscaling.html

> **Reviewer note:** There are two sections in "Installation". Please explicitly ask users to follow "Install with Helm". "Start a local apiserver" should only be used for development.