
Conversation

PeterSH6
Collaborator

@PeterSH6 PeterSH6 commented Jan 10, 2025

  • Use actor_rollout_ref.model.use_rmpad=True + critic.model.use_rmpad=True + reward_model.model.use_rmpad=True to enable rmpad for the different models. Defaults to False.
  • Use AutoModelForTokenClassification for the value and reward models, instead of AutoModelForSequenceClassification.
  • Convert log-prob computation to log_probs_from_logits_response_rmpad.

Resolve: #53

Comparison using DeepSeek7b on GSM8k:
About 1.7x speedup compared to no rmpad (the original case).
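For reference, the three flags above can be combined in a launch command along these lines (the entrypoint name and the absence of other required arguments are assumptions for illustration, not taken from this PR):

```shell
# Hypothetical launch sketch: enable remove-padding (rmpad) for the actor,
# critic, and reward model. The entrypoint is assumed, not verl's actual CLI.
python3 -m verl.trainer.main_ppo \
    actor_rollout_ref.model.use_rmpad=True \
    critic.model.use_rmpad=True \
    reward_model.model.use_rmpad=True
```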

@vermouth1992
Collaborator

Shall we add a supported model list and raise an error if the model is not in the list?

@vermouth1992
Collaborator

Try to avoid using log_probs_from_logits_response_rmpad because there is an unpad op inside, and unpad is a CUDA-blocking op. Instead, we can directly unpad input_ids from the input.
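A minimal sketch of what unpadding input_ids directly from the attention mask could look like (the helper name and exact bookkeeping are illustrative, not verl's actual API; flash-attn's unpad_input computes the same metadata):

```python
import torch
import torch.nn.functional as F

def flatten_and_unpad(input_ids: torch.Tensor, attention_mask: torch.Tensor):
    """Remove padding from (batch, seqlen) input_ids using attention_mask.

    Returns the packed 1-D token tensor, the flat indices needed to re-pad
    outputs later, and cumulative sequence lengths (cu_seqlens) for the
    packed layout, mirroring flash-attn's unpad_input bookkeeping.
    """
    # Flat positions of all non-pad tokens across the whole batch
    indices = torch.nonzero(attention_mask.flatten(), as_tuple=False).flatten()
    input_ids_rmpad = input_ids.flatten()[indices]
    # Per-sequence lengths -> cumulative offsets [0, len0, len0+len1, ...]
    seqlens = attention_mask.sum(dim=-1, dtype=torch.int32)
    cu_seqlens = F.pad(torch.cumsum(seqlens, dim=0, dtype=torch.int32), (1, 0))
    return input_ids_rmpad, indices, cu_seqlens
```

With the ids already unpadded, log-probs can be gathered on the packed layout without a second unpad inside the log-prob helper.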

@PeterSH6
Collaborator Author

I think this list depends on the transformers lib. Not sure where to get this list; I didn't find any doc about this feature in transformers.

@vermouth1992
Collaborator

vermouth1992 commented Jan 10, 2025

I think this list depends on the transformers lib. Not sure where to get this list; I didn't find any doc about this feature in transformers.

Simply add potential models in the CI. If a model passes CI, add it to the supported list. I guess we can target:

  • Llama
  • Mistral
  • QWen
  • Gemma
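A sketch of the proposed guard (the set contents, constant name, and error type are assumptions based on the list above, not verl's actual code):

```python
# Hypothetical supported-model check: model types verified by CI for rmpad.
SUPPORTED_RMPAD_MODEL_TYPES = {"llama", "mistral", "qwen2", "gemma"}

def check_rmpad_support(model_type: str) -> None:
    """Raise if remove-padding has not been verified for this model type."""
    if model_type.lower() not in SUPPORTED_RMPAD_MODEL_TYPES:
        raise NotImplementedError(
            f"remove-padding is not verified for model_type={model_type!r}; "
            f"supported types: {sorted(SUPPORTED_RMPAD_MODEL_TYPES)}"
        )
```

The check would run once at model-construction time, using the model_type field from the HF config.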

@PeterSH6
Collaborator Author

Try to avoid using log_probs_from_logits_response_rmpad because there is an unpad op inside, and unpad is a CUDA-blocking op. Instead, we can directly unpad input_ids from the input.

Sure, I will write a new API for unpad input_ids

@PeterSH6
Collaborator Author

Simply add potential models in the CI. If a model passes CI, add it to the supported list. I guess we can target

Shall we add test_transformers.py to CI? I didn't do it, as I think it only depends on the transformers and flash_attn versions.

So I guess the goal of the CI is to test whether the latest transformers + flash_attn would break our implementation.
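As a sketch of the invariant such a CI check could assert (with a toy position-wise module standing in for the transformer, since the real test depends on the transformers/flash_attn versions): outputs computed on the padded batch, restricted to valid positions, should match outputs computed on the packed (rmpad) sequence.

```python
import torch

# Toy stand-in for the model: any position-wise computation must give
# identical results at valid positions whether it runs on the padded
# batch or the packed (rmpad) tokens. That equality is what a rmpad CI
# test would check against the real model's logits/values.
torch.manual_seed(0)
proj = torch.nn.Linear(8, 8)

hidden = torch.randn(2, 5, 8)                      # (batch, seqlen, dim)
mask = torch.tensor([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])

padded_out = proj(hidden)                          # forward on padded batch
indices = torch.nonzero(mask.flatten()).flatten()  # valid flat positions
packed = hidden.flatten(0, 1)[indices]             # rmpad: drop pad positions
packed_out = proj(packed)                          # forward on packed tokens

assert torch.allclose(padded_out.flatten(0, 1)[indices], packed_out, atol=1e-6)
```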

@vermouth1992
Collaborator

Simply add potential models in the CI. If a model passes CI, add it to the supported list. I guess we can target

Shall we add test_transformers.py to CI? I didn't do it, as I think it only depends on the transformers and flash_attn versions.

So I guess the goal of the CI is to test whether the latest transformers + flash_attn would break our implementation.

After this PR, we should set a minimum version of transformers

@vermouth1992 vermouth1992 merged commit 569210e into volcengine:main Jan 11, 2025
7 checks passed
@jeremyyx

When using hh-rlhf with rmpadding=True:

input_ids_rmpad, indices, cu_seqlens, max_seqlen_in_batch = unpad_input(
ValueError: too many values to unpack (expected 4)

@vermouth1992
Collaborator

Could you try flash-attn < 2.7? I guess flash-attn 2.7 changed the API. @PeterSH6 We should either enforce the version in setup.py or adapt the API for flash-attn >= 2.7.

@vermouth1992
Collaborator

Or we can move the unpad logic into verl in case of future breaks.

@jeremyyx

Or we can move the unpad logic into verl in case of future breaks.

Thanks for your help! :)
My flash_attn version is 2.7.2. I will try to downgrade it.

@PeterSH6
Collaborator Author

I think we should handle both cases: flash-attn >= 2.7 and < 2.7.

A utility function would be enough for the two cases.
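A sketch of such a utility (assuming, per the unpack error above, that newer flash-attn returns more values from unpad_input than the four verl expects; the wrapper name is hypothetical):

```python
def unpad_input_compat(hidden_states, attention_mask, unpad_fn):
    """Normalize unpad_fn's return across flash-attn versions.

    Older flash-attn returns (output, indices, cu_seqlens, max_seqlen);
    newer versions append extra values, which triggers
    'too many values to unpack (expected 4)'. Keep only the first four.
    `unpad_fn` would be flash_attn.bert_padding.unpad_input or compatible.
    """
    result = unpad_fn(hidden_states, attention_mask)
    return tuple(result)[:4]
```

Callers would then unpack four values regardless of the installed flash-attn version.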

@jeremyyx

from verl.utils.torch_functional import prepare_input_for_rmpad
ImportError: cannot import name 'prepare_input_for_rmpad' from 'verl.utils.torch_functional' (/verl_0111/verl/utils/torch_functional.py)

I find that this function is unused and not in the codebase. Maybe we should just delete this line?

@vermouth1992
Collaborator

Sure, simply delete it. Sorry for the inconvenience

@jeremyyx

It's OK. And there is another error I need help with:
File "/home/verl_0111/verl/single_controller/ray/base.py", line 399, in func
return getattr(self.worker_dict[key], name)(*args, **kwargs)
File "/home/verl_0111/verl/single_controller/base/decorator.py", line 404, in inner
return func(*args, **kwargs)
File "/home/verl_0111/verl/workers/fsdp_workers.py", line 853, in compute_rm_score
token_level_scores = self._expand_to_token_level(data, scores)
File "/home/verl_0111/verl/workers/fsdp_workers.py", line 775, in _expand_to_token_level
token_level_scores[torch.arange(batch_size), eos_mask_idx] = scores
RuntimeError: shape mismatch: value tensor of shape [128, 2048] cannot be broadcast to indexing result of shape [128]
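This traceback suggests the reward head returned per-token scores of shape (batch, seqlen) while the scatter in _expand_to_token_level expects one scalar per sequence of shape (batch,). A sketch of the mismatch and one possible fix, gathering the EOS-position score first (tensor names mirror the traceback; the fix itself is illustrative, not verl's actual patch):

```python
import torch

def expand_to_token_level(scores: torch.Tensor,
                          eos_mask_idx: torch.Tensor,
                          seqlen: int) -> torch.Tensor:
    """Place one scalar score per sequence at its EOS position.

    If `scores` arrives as per-token values (batch, seqlen) -- as with a
    token-classification head -- first gather the score at the EOS token
    so the scatter below sees a (batch,) value tensor.
    """
    batch_size = scores.shape[0]
    if scores.dim() == 2:
        # Gather the per-sequence score at the EOS position
        scores = scores[torch.arange(batch_size), eos_mask_idx]
    token_level_scores = torch.zeros(batch_size, seqlen)
    # Scatter: exactly one scalar lands at each sequence's EOS index
    token_level_scores[torch.arange(batch_size), eos_mask_idx] = scores
    return token_level_scores
```

Passing a (batch, seqlen) value tensor straight into the scatter is what produces the "[128, 2048] cannot be broadcast to indexing result of shape [128]" error above.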


set -e -x

python3 tests/e2e/arithmetic_sequence/rl/main_trainer.py \
Collaborator

Why this CI only contains critic rmpad but not the actor?

Collaborator Author

@PeterSH6 PeterSH6 Jan 11, 2025

This is due to the misalignment between the digit completion task and the Qwen model. It may have some bugs when using the CharTokenizer.

@vermouth1992
Collaborator

You are using an actual reward model?

@vermouth1992
Collaborator

Could you toggle off reward model remove-padding for now? I guess we lack CI for model-based reward function.

@jeremyyx

OK. Thanks~~~

@vermouth1992
Collaborator

Also, I guess you have to change AutoModelForTokenClassification to AutoModelForSequenceClassification in order to make non-rmpad version work. We will fix this as soon as possible and add a CI for this case.

@PeterSH6 PeterSH6 mentioned this pull request Jan 16, 2025
33 tasks
yuchenwang3 pushed a commit to yuchenwang3/verl that referenced this pull request Apr 25, 2025
…cengine#91)

* init commit of rmpad

* add rmpad test

* support rmpad in actor model

* add test for value model

* support rmpad in critic and rm

* fix actor return and fix num_labels and clean not used rmpad

* fix critic and benchmark

* update script

* fix critic

* lint

* fix util issue

* fix unnecessary unpad

* address issues

* fix args

* update test and update rmpad support model list

* fix typo

* fix typo and fix name

* rename rmpad to rename padding

* fix arch to model_type

* add ci for e2e rmpad and fix typo

* lint

* fix ci

* fix typo

* update tests for customize tokenizer in actor

* fix rmpad test

* update requirement of transformers as hf_rollout may have issue
histmeisah pushed a commit to SJTU-IAAR/verl that referenced this pull request Apr 27, 2025
kaiyliu pushed a commit to kaiyliu/knowl_verl that referenced this pull request Jun 27, 2025

Successfully merging this pull request may close these issues.

Do we have plans for data packing?
4 participants