☕️ GRPO script reward_funcs error #3639

tcapelle · 2025-06-24T10:29:15Z

We should pass the full list of reward functions to the Trainer, not just the reward model (which is actually None most of the time).

shirinyamani · 2025-06-24T11:40:55Z

Hi @tcapelle, thanks for pointing this out.
I think the reason is that GRPO technically accepts both models and functions as rewards. In this specific example, reward_funcs is empty while we do have a reward_model, which is why the reward_model is passed instead of the functions.

    # Get the reward models and functions
    reward_funcs = []
    if script_args.reward_model_name_or_path:
        reward_model = AutoModelForSequenceClassification.from_pretrained(
            script_args.reward_model_name_or_path, trust_remote_code=model_args.trust_remote_code, num_labels=1
        )
        reward_funcs.append(reward_model)

    if script_args.reward_funcs:
        for func_name in script_args.reward_funcs:
            if func_name in reward_funcs_registry:
                reward_funcs.append(reward_funcs_registry[func_name])
            elif "." in func_name:
                module_path, func_name = func_name.rsplit(".", 1)
                sys.path.insert(0, os.getcwd())
                module = importlib.import_module(module_path)
                reward_func = getattr(module, func_name)
                reward_funcs.append(reward_func)
            else:
                raise ValueError(
                    f"Could not load reward function '{func_name}'. Expected one of "
                    f"{list(reward_funcs_registry.keys())} or a valid import path."
                )

tcapelle · 2025-06-24T14:00:55Z

As of right now, if you pass reward funcs they don't get passed to the Trainer.
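
To make the issue concrete, here is a minimal, self-contained sketch of the loading logic quoted above, wrapped in a helper so the resulting list can be inspected. The helper name `resolve_reward_funcs`, the toy `constant_reward` function, and the one-entry registry are hypothetical and exist only for illustration; `math.sqrt` is used purely as a stand-in for "something importable by dotted path" and is not a real reward function. The point is that the combined list, not only the reward model, is what needs to reach the trainer's `reward_funcs` argument.

```python
import importlib
import math


def constant_reward(completions, **kwargs):
    """Toy reward function (hypothetical): scores every completion 1.0."""
    return [1.0 for _ in completions]


# Hypothetical registry for this sketch; the real script defines its own.
reward_funcs_registry = {"constant_reward": constant_reward}


def resolve_reward_funcs(names, reward_model=None):
    """Collect an optional reward model plus named reward functions into one list."""
    reward_funcs = []
    if reward_model is not None:
        reward_funcs.append(reward_model)
    for name in names or []:
        if name in reward_funcs_registry:
            reward_funcs.append(reward_funcs_registry[name])
        elif "." in name:
            # Treat the name as a dotted import path: "package.module.attr".
            module_path, attr_name = name.rsplit(".", 1)
            module = importlib.import_module(module_path)
            reward_funcs.append(getattr(module, attr_name))
        else:
            raise ValueError(
                f"Could not load reward function '{name}'. Expected one of "
                f"{list(reward_funcs_registry.keys())} or a valid import path."
            )
    return reward_funcs


# No reward model is configured here, which is the common case.
reward_model = None
reward_funcs = resolve_reward_funcs(["constant_reward", "math.sqrt"], reward_model)

# Passing only `reward_model` to the trainer would hand it None in this case;
# the whole list is what should be forwarded, e.g. (not executed here):
#   trainer = GRPOTrainer(..., reward_funcs=reward_funcs, ...)
```

With no reward model configured, `reward_model` is None while `reward_funcs` still holds both resolved callables, so forwarding only the model would silently drop them.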

shirinyamani · 2025-06-25T14:26:51Z

@tcapelle correct!

HuggingFaceDocBuilderDev · 2025-06-25T14:30:34Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Co-authored-by: Shirin Yamani <75791599+shirinyamani@users.noreply.github.com>

typo

7709e48

Merge branch 'main' into fix-grpo

93a41e3

shirinyamani changed the title ~~GRPO script error~~ ☕️ GRPO script reward_funcs error Jun 25, 2025

shirinyamani enabled auto-merge (squash) June 25, 2025 14:46

shirinyamani self-requested a review June 25, 2025 14:46

shirinyamani approved these changes Jun 25, 2025

shirinyamani merged commit 0336e4b into huggingface:main Jun 25, 2025
9 of 10 checks passed

marcandrelarochelle pushed a commit to marcandrelarochelle/trl that referenced this pull request Jul 29, 2025

☕️ GRPO script reward_funcs error (huggingface#3639)

77a2a44

Co-authored-by: Shirin Yamani <75791599+shirinyamani@users.noreply.github.com>
