[Distributed] Support dp/sharding overlap in virtual pp #55651

ForFishes · 2023-07-24T09:14:16Z

PR types

New features

PR changes

Others

Description

[Distributed] Support dp/sharding overlap in virtual pp

…ual pp (PaddlePaddle#55651)
* Add virtual pp and dp overlap
* add sharding/dp overlap
* add dp/vpp overlap
* fix code
* fix log

* part-3 cherry from: add check for cembedding (#55621)
* part-3 fix cherry from: add check for cembedding
* part-3 fix c_embedding
* fix test_gpt_with_pir caused by pir
* part-3 cherry from: [Distributed] Support dp/sharding overlap in virtual pp (#55651)
* Add virtual pp and dp overlap
* add sharding/dp overlap
* add dp/vpp overlap
* fix code
* fix log
* part-3 cherry from: [cherry-pick] Integration flash attention 2 (#56015)
* [FlashAttn] add flash randomness control (#52902)
* add flash randomness control
* fix VLOG undefied
* [WIP] Integration flash attention 2 (#55758)
* Work for fa-2 padded fwd. Code to be cleaned.
* Work for fa2 unpadded fwd.
* Work for padded-bwd, dk get small diff on np.random.seed(0)
* Anyway I pass paddle's utest, except return softmax without dropout.
* Clean code.
* Modify interface.
* Clean code and add some check.
* Easy compile for dev.
* Fix ci.
* Fix ci-build.
* Add std c++17 option again.
* Limit max job when compiling fa2.
* Remove const_cast
* Add fwd params, to be cleaned.
* Clean code.
* Add bwd params.
* Clean code.
* Add enforce.
* Use v2.0.4
* Pass RNG state to fa2 capi
* Fix review.
* Add assert
* Skip compile for sm less than 80.
---------
Co-authored-by: Chitsing KUI <kuizhiqing@msn.com>
* part-4 cherry from: fix codestyle (#56066)
* part-4 cherry from(no change): Add assert for static and other plateform (#56044)
* part-4 cherry-pick from: dp and sharding coexist (#56096)
* dp and sharding coexist
* dp
* part-4 cherry from: [Distributed] Add debug information for processgroupnccl (#56441)
* add debug information
* fix log
* fix log
* add detach for pp
* part-4 cherry from: [BugFix]Fix bug in paddle.device.cdua.synchronize() (#56451)
* fix bug in synchronize
* fix bug in synchronize
* part-4 cherry from: add fused gradient (#57048)
* part-4 cherry from: [Distribtued] add eager_communication_connection for eager mode in nccl (#57517)
* add eager_nccl_connection
* add eager_connection
* add eager_connection
* part-4 cherry from: Add auto growth allocator for CUDA pinned allocator (#57625)
* fix h2d bandwidth
* remove useless flags
* fix cherrry pick #56066
* part-5 cherry from: Add allocation debug FLAGS (#57797)
* Add allocation debug FLAGS
* add sync after value set
* refine flags
* part-5 cherry from: fix softmax backward (#57971)
* part-5 cherry from: [Distributed]Optimize memory in processgroup (#58299)
* optimize memory in processgroupnccl
* optimize memory in processgroupnccl
* optimize memory in processgroupnccl
* optimize memory in processgroupnccl
* part-5 cherry from: [Distributed]Add unbalance batch for virtual pp (#58383)
* add unbalanced batch for vpp
* add unbalanced batch for vpp
* add unbalanced batch for vpp
* fix
* fix comments
* fix kunlun compatibility issues
* fix test_fused_rotary_position_embedding.py
* fix allocator.h
* tinyfix
* fix conflicts
* fix new ir translator c_embedding failure
---------
Co-authored-by: ShenLiang <1422485404@qq.com>
Co-authored-by: umiswing <umiswing@foxmail.com>
Co-authored-by: Chitsing KUI <kuizhiqing@msn.com>
Co-authored-by: niuliling123 <51102941+niuliling123@users.noreply.github.com>
Co-authored-by: liuzhenhai93 <liuzhenhai93@outlook.com>
Co-authored-by: sneaxiy <32832641+sneaxiy@users.noreply.github.com>

ForFishes added 5 commits July 24, 2023 17:12

Add virtual pp and dp overlap

0c25942

add sharding/dp overlap

f193da9

add dp/vpp overlap

ced1665

fix code

30ff263

fix log

f7b4b0c

sneaxiy approved these changes Jul 26, 2023

ForFishes merged commit f275ad2 into PaddlePaddle:incubate/new_frl Jul 26, 2023

ForFishes deleted the add_dp_overlap branch July 26, 2023 03:39

FeixLiu added a commit to FeixLiu/Paddle that referenced this pull request Aug 8, 2023

cherry pick PaddlePaddle#55651 and PaddlePaddle#55890

ed4fc72

FeixLiu mentioned this pull request Aug 8, 2023

[cherry-pick] [Distributed] Support dp/sharding overlap in virtual pp #56063

Merged

FeixLiu added a commit that referenced this pull request Aug 9, 2023

cherry pick #55651 and #55890 (#56063)

fa87884
