
[FeatureRequest] bmt.OpTransformerBlockList **DOES NOT** support multiple return values from a transformer block's forward propagation #91

@eggiter

1. Currently, bmt.OpTransformerBlockList can only handle the hidden states returned by a transformer block.

  1. The transformer block in the recently released flash_attn returns both hidden_states and residual so that Dropout -> Add -> LN can be fused. Both values are then passed to the next block as input;
  2. bmt.OpTransformerBlockList does not seem to account for this case and cannot handle it properly.

2. Request: support transformer blocks that return multiple values, as in the case above.
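
To make the requested behavior concrete, here is a minimal sketch in plain Python of a block list that forwards every value a block returns to the next block. It is not BMTrain code: `run_blocks` and `toy_block` are hypothetical names, plain floats stand in for tensors, and the sketch ignores any checkpointing or parameter handling that the real bmt.OpTransformerBlockList performs.

```python
def run_blocks(blocks, *inputs):
    """Call each block in order, passing all of its outputs to the next one.

    Hypothetical helper for illustration only; not part of BMTrain.
    """
    state = inputs
    for block in blocks:
        out = block(*state)
        # Normalize a single return value to a 1-tuple so that blocks
        # returning one value and blocks returning several both work.
        state = out if isinstance(out, tuple) else (out,)
    # Unwrap again so single-output block lists keep their old return type.
    return state[0] if len(state) == 1 else state


def toy_block(hidden_states, residual=None):
    """Stand-in for a block that returns (hidden_states, residual)."""
    # "Add" step: accumulate the residual stream.
    residual = hidden_states if residual is None else hidden_states + residual
    # Placeholder for the normalization and attention/MLP computation.
    hidden_states = residual * 0.5
    return hidden_states, residual


hidden_states, residual = run_blocks([toy_block, toy_block], 2.0)
```

Normalizing every output to a tuple keeps existing single-output blocks working unchanged while letting the (hidden_states, residual) pair flow through the list.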
