Ray stops respecting non-kwargs arguments under certain circumstances

**Repro**
1. Change https://github.com/NVIDIA-NeMo/RL/blob/cfb803d14fea633679c03b9a86f86f83ec8d80c4/nemo_rl/distributed/worker_groups.py#L781-L783 to
```
future = getattr(worker, method_name).remote(worker_data, **common_kwargs)
```
2. Change https://github.com/NVIDIA-NeMo/RL/blob/cfb803d14fea633679c03b9a86f86f83ec8d80c4/nemo_rl/models/policy/megatron_policy_worker.py#L1053-L1055 to
```
def get_reference_policy_logprobs(
    self, data: BatchedDataDict[Any], micro_batch_size: Optional[int] = None
) -> BatchedDataDict[ReferenceLogprobOutputSpec]:
```
3. Run `uv run python examples/run_grpo_math.py --config examples/configs/grpo_math_1B_megatron.yaml`

**Traceback**
```
▶ Computing logprobs...
Traceback (most recent call last):
  File "/home/scratch.yukih_gpu/depot/reinforcer/examples/run_grpo_math.py", line 335, in <module>
    main()
  File "/home/scratch.yukih_gpu/depot/reinforcer/examples/run_grpo_math.py", line 318, in main
    grpo_train(
  File "/home/scratch.yukih_gpu/depot/reinforcer/nemo_rl/algorithms/grpo.py", line 660, in grpo_train
    fprop_logprobs = policy.get_logprobs(train_data)["logprobs"]
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.yukih_gpu/depot/reinforcer/nemo_rl/models/policy/lm_policy.py", line 208, in get_logprobs
    futures = self.worker_group.run_all_workers_sharded_data(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.yukih_gpu/depot/reinforcer/nemo_rl/distributed/worker_groups.py", line 830, in run_all_workers_sharded_data
    future = getattr(worker, method_name).remote(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.yukih_gpu/depot/reinforcer/.venv/lib/python3.12/site-packages/ray/actor.py", line 216, in remote
    return self._remote(args, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.yukih_gpu/depot/reinforcer/.venv/lib/python3.12/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.yukih_gpu/depot/reinforcer/.venv/lib/python3.12/site-packages/ray/util/tracing/tracing_helper.py", line 422, in _start_span
    return method(self, args, kwargs, *_args, **_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.yukih_gpu/depot/reinforcer/.venv/lib/python3.12/site-packages/ray/actor.py", line 376, in _remote
    return invocation(args, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.yukih_gpu/depot/reinforcer/.venv/lib/python3.12/site-packages/ray/actor.py", line 357, in invocation
    return actor._actor_method_call(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.yukih_gpu/depot/reinforcer/.venv/lib/python3.12/site-packages/ray/actor.py", line 1496, in _actor_method_call
    list_args = signature.flatten_args(function_signature, args, kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.yukih_gpu/depot/reinforcer/.venv/lib/python3.12/site-packages/ray/_private/signature.py", line 126, in flatten_args
    validate_args(signature_parameters, args, kwargs)
  File "/home/scratch.yukih_gpu/depot/reinforcer/.venv/lib/python3.12/site-packages/ray/_private/signature.py", line 99, in validate_args
    raise TypeError(str(exc)) from None
TypeError: too many positional arguments
```
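The last two frames raise `TypeError(str(exc))` from `validate_args`, and the message matches the one produced by the standard library's `inspect.Signature.bind`. Assuming Ray validates `.remote()` arguments that way (an inference from the message, not confirmed against Ray's source here), the error can be reproduced without Ray. The function names below are hypothetical stand-ins, not NeMo RL code.

```python
import inspect
from typing import Any, Optional


def keyword_only(*, data: Any, micro_batch_size: Optional[int] = None) -> None:
    """Hypothetical stand-in for a worker method declared with a bare `*`."""


def positional_ok(data: Any, micro_batch_size: Optional[int] = None) -> None:
    """Hypothetical stand-in for the same method without the bare `*`."""


def try_bind(func, *args, **kwargs) -> str:
    """Return 'ok' if the arguments bind to func's signature, else the error text."""
    try:
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError as exc:
        return str(exc)
    return "ok"


# Passing `data` positionally to a keyword-only signature fails at bind time.
print(try_bind(keyword_only, "batch", micro_batch_size=4))
# The same call binds once `data` may be passed positionally.
print(try_bind(positional_ok, "batch", micro_batch_size=4))
# Passing `data` by keyword binds against the keyword-only signature.
print(try_bind(keyword_only, data="batch", micro_batch_size=4))
```

If that assumption holds, the error would mean that the signature Ray captured for the invoked method still treats `data` as keyword-only. Note that the traceback goes through `policy.get_logprobs`, so it may be worth checking which worker method is actually being called and whether its signature was also changed.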

The code at the two permalinks above, before the change:

	future = getattr(worker, method_name).remote(
	    data=worker_data, **common_kwargs
	)

	def get_reference_policy_logprobs(
	    self, *, data: BatchedDataDict[Any], micro_batch_size: Optional[int] = None
	) -> BatchedDataDict[ReferenceLogprobOutputSpec]:
