【Hackathon 6th No.35】support kwargs for recompute when use_reentrant == True -part #63337
Conversation
Your PR has been submitted successfully. Thank you for contributing to the open source project!
@ForFishes @MarioLulab The CI checks are all fine now; could the reviewers please take a look ~
Hello, this PR touches on some issues that need further internal discussion.
LGTM
…== True (PaddlePaddle#63337)
* support kwargs for recompute when open use_reentrant
* update test
* fix
* Update recompute.py
* fix
* fix
```python
input_args = args
# rearrange `position-args + keyword-args` into `position-args`
if isinstance(function, paddle.nn.Layer):
    dyfunc_sig = inspect.signature(function.forward)
else:
    dyfunc_sig = inspect.signature(function)

bound_args = dyfunc_sig.bind(*args, **kwargs)
bound_args.apply_defaults()
input_args = list(bound_args.arguments.values())
return RecomputeFunction.apply(function, preserve, *input_args)
```
For a case like the following:

```python
# taken from PaddleMIX: https://github.com/PaddlePaddle/PaddleMIX/blob/8b896d533811a3500af3064c5f1952b77003d4c8/ppdiffusers/ppdiffusers/models/unet_2d_blocks.py#L1149-L1155
def custom_forward(*inputs):
    ...
```

using `bound_args.arguments` is wrong: no matter how many values are passed in, `bound_args.arguments` contains only one entry, the packed `inputs` tuple.
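The mismatch is easy to reproduce with plain `inspect` (a minimal sketch; `custom_forward` here is a stand-in for the PaddleMIX-style forward):

```python
import inspect

# Stand-in for a forward that packs everything into *inputs.
def custom_forward(*inputs):
    return inputs

sig = inspect.signature(custom_forward)
bound = sig.bind(1, 2, 3)
bound.apply_defaults()

# All three positional values land in a single VAR_POSITIONAL entry,
# so `arguments` holds one packed tuple rather than three values.
print(list(bound.arguments.values()))  # [(1, 2, 3)]
```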
All `Parameter` kinds need to be considered:
```python
import inspect

def custom_forward(*inputs, **kwargs):
    return inputs

def convert_inputs_to_positional_args(fn, *args, **kwargs):
    positional_args = []
    sig = inspect.signature(fn)
    bound_args = sig.bind(*args, **kwargs)
    bound_args.apply_defaults()
    for arg, param in zip(bound_args.arguments.values(), sig.parameters.values()):
        if param.kind == param.VAR_POSITIONAL:
            # unpack the *args tuple into individual positional args
            positional_args.extend(arg)
        elif param.kind in (param.POSITIONAL_ONLY, param.POSITIONAL_OR_KEYWORD):
            positional_args.append(arg)
        elif param.kind == param.VAR_KEYWORD:
            # unpack the **kwargs dict values into positional args
            positional_args.extend(arg.values())
        elif param.kind == param.KEYWORD_ONLY:
            raise ValueError("Currently, keyword-only arguments are not supported.")
        else:
            raise ValueError("Unknown parameter kind.")
    return positional_args

convert_inputs_to_positional_args(custom_forward, 1, 2, y=2, x=1)  # [1, 2, 2, 1]
```
The main idea is to rearrange `position-args + keyword-args` into `position-args`.
Note that this approach inherently cannot support functions with keyword-only parameters; if those need to be supported, this approach is not viable.
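The keyword-only limitation can be checked with `inspect` as well (an illustrative sketch; `forward` and `mask` are made-up names, not from the PR):

```python
import inspect

# Hypothetical forward with a keyword-only parameter (after the bare `*`).
def forward(x, *, mask=None):
    return x if mask is None else x * mask

sig = inspect.signature(forward)

# `mask` has kind KEYWORD_ONLY, so the rearrangement above would raise
# ValueError for this function instead of producing positional args.
kinds = [p.kind for p in sig.parameters.values()]
print(inspect.Parameter.KEYWORD_ONLY in kinds)  # True
```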
Also, this PR has already affected the high-priority monitored model Stable Diffusion. I will open a PR to try a revert (#63637); in parallel, we can look into how to fix it.
OK, got it.
One more thing I would like to confirm: is the impact on Stable Diffusion the error in the case above, or some other issue?
The case above.
PR Category
Auto Parallel
PR Types
Improvements
Description
Currently, when use_reentrant == True, recompute is implemented with PyLayer. However, PyLayer does not support passing Tensor arguments in dict form (Tensors passed inside a dict do not get backward nodes or backward edges created).
To improve the usability of distributed training, this PR enables recompute to accept Tensor arguments in dict form when use_reentrant == True. The main idea is to rearrange `position-args + keyword-args` into `position-args`.
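For forwards whose parameters are ordinary (POSITIONAL_OR_KEYWORD), this rearrangement reduces to `Signature.bind` plus `apply_defaults` (a minimal sketch; `forward`, `x`, `y`, and `scale` are illustrative names, not from the PR):

```python
import inspect

# Illustrative forward with ordinary parameters only.
def forward(x, y, scale=1.0):
    return (x + y) * scale

sig = inspect.signature(forward)
# Mixed positional + keyword call site, as a user of recompute might write.
bound = sig.bind(1, scale=2.0, y=3)
bound.apply_defaults()

# `arguments` is ordered by the signature, so keyword args fall into
# their positional slots: x, y, scale.
input_args = list(bound.arguments.values())
print(input_args)  # [1, 3, 2.0]
```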
Performance test data is as follows:
Test environment: 4x 3090 GPUs, Llama2 model with num_hidden_layer hacked to 4.
Performance data collected at step 30.
The Llama2 test script is as follows:
Modification: all the places in paddlenlp/transformers/llama/modeling_auto.py where recompute is enabled (3 in total). The GPT run script is as follows:
paddlenlp/transformers/gpt/modeling_auto.py is modified as follows: