yzlnew
Contributor

@yzlnew yzlnew commented May 30, 2025

Checklist Before Starting

  • Search for similar PR(s).

What does this PR do?

Add an example for DeepSeek 671B GRPO

Specific Changes

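  • Requires volcengine#1694.
  • Set torch._dynamo.config.suppress_errors = True at the entrypoint if you hit the following error:

ray.exceptions.RaySystemError: System error: Failed to unpickle serialized exception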
traceback: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ray/exceptions.py", line 46, in from_ray_exception
    return pickle.loads(ray_exception.serialized_exception)
TypeError: BackendCompilerFailed.__init__() missing 1 required positional argument: 'inner_exception'
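
One plausible reading of the traceback above is that the exception cannot be rebuilt on the receiving side because its constructor needs more arguments than the exception stored in `args`. The sketch below reproduces that general Python behavior with two hypothetical classes (`FragileError` and `RoundTripError` are illustrative names, not part of torch or Ray). It does not claim to mirror how `BackendCompilerFailed` is actually implemented.

```python
class FragileError(Exception):
    """Hypothetical exception: __init__ needs two arguments but keeps only one in args."""

    def __init__(self, backend_name, inner_exception):
        self.inner_exception = inner_exception
        # Only one formatted message reaches Exception.__init__, so self.args has length 1.
        super().__init__(f"backend={backend_name!r} raised: {inner_exception!r}")


class RoundTripError(Exception):
    """Hypothetical variant that keeps both constructor arguments in args."""

    def __init__(self, backend_name, inner_exception):
        self.inner_exception = inner_exception
        super().__init__(backend_name, inner_exception)


def rebuild_like_pickle(exc):
    """Recreate exc the way unpickling does: call the class with the saved args."""
    factory, args = exc.__reduce__()[:2]
    return factory(*args)


def try_rebuild(exc):
    """Return 'ok' if exc can be rebuilt, otherwise the TypeError message."""
    try:
        rebuild_like_pickle(exc)
        return "ok"
    except TypeError as err:
        return str(err)


fragile_result = try_rebuild(FragileError("demo_backend", ValueError("boom")))
robust_result = try_rebuild(RoundTripError("demo_backend", ValueError("boom")))
print(fragile_result)  # a "missing ... positional argument" message
print(robust_result)
```

If this is what happens here, the unpickling failure hides the original compiler error, which would explain why the workaround targets the compile failure itself instead of Ray.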

Additional Info.

image

Checklist Before Submitting

  • Read the Contribute Guide.
  • Apply pre-commit checks.
  • Add [BREAKING] to the PR title if it breaks any API.
  • Update the documentation about your changes in the docs.
  • Add CI test(s) if necessary.

@CLAassistant

CLAassistant commented May 30, 2025

CLA assistant check
All committers have signed the CLA.

yzlnew added 2 commits June 1, 2025 10:30
Works with mcore 0.12.0; hangs with earlier versions
@yzlnew yzlnew marked this pull request as ready for review June 4, 2025 09:50
Collaborator

@ccclyu ccclyu left a comment

LGTM! Could you also add a training curve, such as a wandb link, to the PR description for reference? Thanks!

@vermouth1992 vermouth1992 merged commit a6f15ae into volcengine:main Jun 5, 2025
3 checks passed
@ccclyu
Collaborator

ccclyu commented Jun 5, 2025

I also reproduced the DeepSeek V3 GRPO training with this config; the gsm8k validation accuracy curve is attached.

[Screenshot 2025-06-05 at 16 55 33]

yellowbee686 pushed a commit to yellowbee686/verl that referenced this pull request Jun 6, 2025
### Checklist Before Starting

- [x] Search for similar PR(s).

### What does this PR do?

Add an example for DeepSeek 671B GRPO

### Specific Changes

- Requires volcengine#1694.
- Set `torch._dynamo.config.suppress_errors = True` at the entrypoint if
you hit the following error:

```
ray.exceptions.RaySystemError: System error: Failed to unpickle serialized exception
traceback: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ray/exceptions.py", line 46, in from_ray_exception
    return pickle.loads(ray_exception.serialized_exception)
TypeError: BackendCompilerFailed.__init__() missing 1 required positional argument: 'inner_exception'
```
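
A minimal sketch of the workaround named above, assuming it is applied at the very top of the training entrypoint before any compiled code runs. The helper name `suppress_dynamo_errors` is hypothetical. The PR does not say whether the flag must also be set inside Ray worker processes, so treat that as an open question.

```python
def suppress_dynamo_errors() -> bool:
    """Ask TorchDynamo to fall back to eager mode instead of raising on compile failures.

    Returns True if the flag was set, False if torch._dynamo could not be imported.
    """
    try:
        import torch._dynamo as dynamo
    except ImportError:
        # PyTorch is missing or too old to ship torch._dynamo; nothing to configure.
        return False
    dynamo.config.suppress_errors = True
    return True


if __name__ == "__main__":
    # Hypothetical entrypoint: set the flag first, then start training.
    print("suppress_errors set:", suppress_dynamo_errors())
```

With `suppress_errors` enabled, a failing backend compile is logged and the affected code runs uncompiled, so the run may be slower but should no longer crash on this exception.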

### Additional Info.

- Uses vLLM as the backend; SGLang support is in progress
(sgl-project/sglang#6762). Merge once both backends are ready.
- For DeepSeek-V3-0324 on `gsm8k`, the reward starts at 0.8 and
saturates at around 0.95 after only 3 steps.
- Memory peaks at around 90 GB during the actor update (1.5k input +
2.5k output); consider using TP/ETP to lower the requirement.
- For gsm8k training using this yaml:


![image](https://github.com/user-attachments/assets/d16cf959-5845-4dd0-95af-07fc35820f18)


### Checklist Before Submitting

- [ ] Read the [Contribute
Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [ ] Apply [pre-commit
checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [ ] Add `[BREAKING]` to the PR title if it breaks any API.
- [ ] Update the documentation about your changes in the
[docs](https://github.com/volcengine/verl/tree/main/docs).
- [ ] Add CI test(s) if necessary.