@heavengate heavengate commented Sep 9, 2022

PR types

New features

PR changes

Others

Describe

Add FusedMultiTransformer fuse pass for GPT3

  • add passes: fused_multi_transformer_encoder, fused_multi_transformer_decoder, fused_multi_transformer_encoder_fuse_qkv, fused_multi_transformer_decoder_fuse_qkv

    • fused_multi_transformer_encoder
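      Matches the Transformer layer of the GPT encoder (both the multi-head attention and the feed-forward parts)
      [image: 1666266345]
      and converts it into a single fused FusedMultiTransformer OP to speed up inference. The red box in the figure marks one such layer.
      [image: infoflow 2022-10-20 19-51-42]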

    • fused_multi_transformer_decoder
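      The decoder structure is similar to the encoder, with additional Cache KV handling, as shown in the red box in the figure
      [image: infoflow 2022-10-20 19-54-05]
      It is converted into a fused FusedMultiTransformer OP that takes a TimeStep input and is used for decoding
      [image: infoflow 2022-10-20 19-55-33]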

    • fused_multi_transformer_encoder/decoder_fuse_qkv
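      The fused_multi_transformer_encoder/decoder_fuse_qkv structures are similar to fused_multi_transformer_encoder/decoder. The difference is that in Multi-Head Attention the Q, K, and V are computed as one concatenated Tensor and separated afterwards with split, as shown in the red box in the figure
      [image: infoflow 2022-10-20 19-58-29]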

  • add multi-devices passes: multi_devices_fused_multi_transformer_encoder_fuse_qkv, multi_devices_fused_multi_transformer_decoder_fuse_qkv

    • multi-devices passes
      The multi-devices passes build on the passes above and additionally insert the c_identity and c_allreduce_sum cross-device communication OPs
      [image: infoflow 2022-10-20 20-00-49]
  • add subgraph pass support for the 6 passes

  • add IR_NODE_UNLINK for unlinking nodes in Graph

  • fix Var->persistent check segmentation fault when Variable is null
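
As a rough illustration of the fuse_qkv variants described above, the sketch below shows why computing Q, K, and V from one concatenated weight and then splitting the result gives the same values as three separate matmuls. It is a toy model only: `Matrix`, `MatMul`, `ConcatCols`, `SplitCols`, and `FusedQkv` are hypothetical helpers written for this example and are not the pass or the FusedMultiTransformer kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major matrix; a stand-in for a framework Tensor.
struct Matrix {
  std::size_t rows;
  std::size_t cols;
  std::vector<float> data;

  Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}
  float& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
  float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Plain (rows x inner) * (inner x cols) matrix product.
Matrix MatMul(const Matrix& a, const Matrix& b) {
  assert(a.cols == b.rows);
  Matrix out(a.rows, b.cols);
  for (std::size_t i = 0; i < a.rows; ++i)
    for (std::size_t k = 0; k < a.cols; ++k)
      for (std::size_t j = 0; j < b.cols; ++j)
        out.at(i, j) += a.at(i, k) * b.at(k, j);
  return out;
}

// Concatenates matrices with equal row counts along the column axis.
Matrix ConcatCols(const std::vector<Matrix>& parts) {
  assert(!parts.empty());
  std::size_t total_cols = 0;
  for (const Matrix& p : parts) {
    assert(p.rows == parts[0].rows);
    total_cols += p.cols;
  }
  Matrix out(parts[0].rows, total_cols);
  std::size_t offset = 0;
  for (const Matrix& p : parts) {
    for (std::size_t i = 0; i < p.rows; ++i)
      for (std::size_t j = 0; j < p.cols; ++j)
        out.at(i, offset + j) = p.at(i, j);
    offset += p.cols;
  }
  return out;
}

// Splits a matrix into n equally wide column blocks.
std::vector<Matrix> SplitCols(const Matrix& m, std::size_t n) {
  assert(n > 0 && m.cols % n == 0);
  const std::size_t width = m.cols / n;
  std::vector<Matrix> parts;
  for (std::size_t p = 0; p < n; ++p) {
    Matrix part(m.rows, width);
    for (std::size_t i = 0; i < m.rows; ++i)
      for (std::size_t j = 0; j < width; ++j)
        part.at(i, j) = m.at(i, p * width + j);
    parts.push_back(part);
  }
  return parts;
}

// One matmul against the concatenated [Wq | Wk | Wv] weight, then a split.
// Assumes the three weights have the same shape.
std::vector<Matrix> FusedQkv(const Matrix& x, const Matrix& wq,
                             const Matrix& wk, const Matrix& wv) {
  return SplitCols(MatMul(x, ConcatCols({wq, wk, wv})), 3);
}
```

Calling `FusedQkv(x, wq, wk, wv)` returns three matrices that match `MatMul(x, wq)`, `MatMul(x, wk)`, and `MatMul(x, wv)` element by element, which is the equivalence the fuse_qkv patterns rely on.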

TODO

  • Extract reusable modules and simplify the code

@paddle-bot-old paddle-bot-old bot added contributor External developers and removed contributor External developers labels Sep 13, 2022
@heavengate heavengate force-pushed the add_fused_multi_transformer_pass_fleetx branch from 552a2c5 to 0617335 Compare September 20, 2022 13:22
if (!graph->Has(kPassRecorder)) {
graph->Set<PassRecorder>(kPassRecorder, new PassRecorder);
}
graph->Get<PassRecorder>(kPassRecorder).insert(Type());

if(graph->IsMainGraph() and "graph_viz_pass"!=Type() and "graph_to_program_pass"!=Type()) {
Contributor

and?

Contributor Author

Done, Thanks!
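
For readers following the thread, here is a minimal sketch of the same condition written with `&&`. The thread does not show the actual fix, so this is only one plausible reading of the comment. `and` is a valid alternative token in standard C++, but `&&` is the far more common spelling in C++ codebases. The function name `ShouldRunHook` and its parameters are hypothetical; what the real branch does is not visible here.

```cpp
#include <cassert>
#include <string>

// Hypothetical predicate mirroring the reviewed condition:
// true only for the main graph and for passes other than the two
// graph-dumping / graph-conversion passes named in the snippet.
bool ShouldRunHook(bool is_main_graph, const std::string& pass_type) {
  return is_main_graph && pass_type != "graph_viz_pass" &&
         pass_type != "graph_to_program_pass";
}
```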

int fusion_count{0};
auto handler = [&](const GraphPatternDetector::subgraph_t& subgraph,
Graph* g) {
GET_IR_NODE_FROM_SUBGRAPH(input0, input0, fused_multi_transformer_pattern);
Contributor

You need to call IsCompat(subgraph, graph) explicitly; otherwise AddOpCompat does not take effect.

Contributor Author

Done, Thanks!
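
The point raised in this thread can be sketched as follows: registering compat rules does nothing unless the match handler actually checks them and bails out on failure. This is a simplified mock, not Paddle code. `Subgraph` and `FusePassSketch` are hypothetical stand-ins, and the mock `IsCompat` takes only the subgraph, whereas the call named in the review is `IsCompat(subgraph, graph)`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a matched subgraph: just the op types it contains.
struct Subgraph {
  std::vector<std::string> op_types;
};

// Hypothetical, heavily simplified stand-in for a fuse pass with compat rules.
class FusePassSketch {
 public:
  // Registers an op type as acceptable for fusion.
  void AddOpCompat(const std::string& op_type) {
    compat_ops_.push_back(op_type);
  }

  // True only if every op in the subgraph has a registered compat rule.
  bool IsCompat(const Subgraph& subgraph) const {
    return std::all_of(
        subgraph.op_types.begin(), subgraph.op_types.end(),
        [this](const std::string& op) {
          return std::find(compat_ops_.begin(), compat_ops_.end(), op) !=
                 compat_ops_.end();
        });
  }

  // Runs the handler over all matches and returns how many were fused.
  int Apply(const std::vector<Subgraph>& matches) const {
    int fusion_count{0};
    auto handler = [&](const Subgraph& subgraph) {
      // Without this explicit check the registered rules are never consulted.
      if (!IsCompat(subgraph)) {
        return;
      }
      ++fusion_count;  // The real pass would rewrite the graph here.
    };
    for (const Subgraph& match : matches) {
      handler(match);
    }
    return fusion_count;
  }

 private:
  std::vector<std::string> compat_ops_;
};
```

With `layer_norm` and `matmul_v2` registered, a match containing an unregistered op is skipped by the early return and is not counted as fused.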

qingqing01 previously approved these changes Oct 17, 2022
@@ -0,0 +1,416 @@
// Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
Contributor

2022

Contributor Author

Done, thanks

@heavengate heavengate changed the title Add fused multi transformer pass fleetx Add FusedMultiTransformer fuse pass for GPT3 Oct 19, 2022
qingqing01 previously approved these changes Oct 19, 2022
LielinJiang previously approved these changes Oct 19, 2022
@heavengate heavengate dismissed stale reviews from LielinJiang and qingqing01 via 4e1aba3 October 19, 2022 13:59
@heavengate heavengate force-pushed the add_fused_multi_transformer_pass_fleetx branch 2 times, most recently from 1cd0d85 to 409cf3a Compare October 20, 2022 01:14
Member

@raindrops2sea raindrops2sea left a comment

Please add a more specific description of the functionality and effect of the newly added passes.

raindrops2sea approved these changes Oct 20, 2022
Member

@raindrops2sea raindrops2sea left a comment

On the training side, please also look into reusing this for static-graph training, @sneaxiy @Xreki @gongweibao

@raindrops2sea raindrops2sea self-requested a review October 20, 2022 11:39

6 participants