[trainer] fix: make `reward_extra_info` optional in `reward_result` #2109

HollowMan6 · 2025-06-19T15:59:23Z

Checklist Before Starting

- [X] Searched for similar PR(s).
- [X] Checked PR Title format
  - In format of: [modules] type: Title
  - modules are in `fsdp, megatron, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data`
  - type is in `feat, fix, refactor, chore, test`
  - can involve multiple modules, separated by `,` or space, like `[megatron, fsdp, doc] feat: xxx`

What does this PR do?

Fix the error `Error in reward_fn: reward_extra_info`, which occurs because some reward function implementations return a dictionary that contains only `reward_tensor` and no `reward_extra_info` key. Two examples:

verl/verl/workers/reward_manager/prime.py, line 176 in b401382:

`return {"reward_tensor": reward_tensor}`

verl/examples/split_placement/main_ppo_split.py, line 88 in b401382:

`return {"reward_tensor": reward_tensor}`
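For illustration, here is a minimal sketch of how a caller might treat `reward_extra_info` as optional when unpacking such a result. The helper `split_reward_result` and the stand-in `minimal_reward_fn` are hypothetical names, not verl's actual code, and a plain list stands in for the real reward tensor.

```python
from typing import Any, Dict, List, Tuple


def split_reward_result(reward_result: Dict[str, Any]) -> Tuple[Any, Dict[str, Any]]:
    """Hypothetical helper: unpack the dict returned by a reward function."""
    # `reward_tensor` is required, so a missing key should still raise.
    reward_tensor = reward_result["reward_tensor"]
    # `reward_extra_info` is optional: fall back to an empty dict
    # instead of raising KeyError when the key is absent.
    reward_extra_info = reward_result.get("reward_extra_info", {})
    return reward_tensor, reward_extra_info


def minimal_reward_fn(batch: List[str]) -> Dict[str, Any]:
    """Stand-in reward function that returns only `reward_tensor`."""
    return {"reward_tensor": [0.0 for _ in batch]}


tensor, extra = split_reward_result(minimal_reward_fn(["a", "b"]))
print(tensor, extra)  # [0.0, 0.0] {}
```

Using `dict.get` with a default keeps reward functions that do supply `reward_extra_info` working unchanged, while those that return only `reward_tensor` no longer trigger the error.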

Checklist Before Submitting

- [X] Read the Contribute Guide.
- [X] Apply pre-commit checks.
- [X] Add `[BREAKING]` to the PR title `description` if it breaks any API.
- [X] Update the documentation about your changes in the docs.
- [X] New CI unit test(s) are added to cover the code path.
- [X] Rely on existing unit tests on CI that cover the code path.

Fix the error message: `Error in reward_fn: reward_extra_info`, as for some reward function implementations, only `reward_tensor` is included in the returned dictionary.

- https://github.com/volcengine/verl/blob/b401382405304436292ca19870d1917a0174d09c/verl/workers/reward_manager/prime.py#L176
- https://github.com/volcengine/verl/blob/b401382405304436292ca19870d1917a0174d09c/examples/split_placement/main_ppo_split.py#L88

Signed-off-by: Hollow Man <hollowman@opensuse.org>

tongyx361

LGTM!

HollowMan6 requested review from eric-haibin-lin, vermouth1992, tongyx361 and PeterSH6 as code owners June 19, 2025 15:59

tongyx361 self-assigned this Jun 19, 2025

tongyx361 approved these changes Jun 19, 2025

tongyx361 merged commit 18c2825 into volcengine:main Jun 19, 2025
30 of 31 checks passed

HollowMan6 deleted the reward_extra_info branch June 20, 2025 02:44
