
Clarification of Specs: LayerNormalization op should define broadcasting rules for B and Scale inputs #5666

@AlexandreEichenberger

Bug Report

Is the issue related to model conversion?

no

Describe the bug

The LayerNormalization operation is defined as a sequence of ops. In this sequence, the LayerNormalization inputs 'Scale' and 'B' are consumed by Mul and Add, respectively, and both of these ops support Multi-Directional Broadcasting.

However, in the context of LayerNormalization, I believe only Uni-Directional Broadcasting makes sense: 'Scale' and 'B' should be broadcast to their respective other operands (whose shapes are derived from input X), never the other way around.
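To illustrate the concern, here is a minimal NumPy sketch (not the ONNX reference implementation). It assumes normalization over the last axis only, and the helper names `layer_norm` and `is_unidirectional_broadcastable` are hypothetical. NumPy's `*` and `+` broadcast multidirectionally, like Mul and Add, so an oversized Scale silently produces an output whose shape differs from X's:

```python
import numpy as np

def layer_norm(x, scale, b, epsilon=1e-5):
    # Simplified sketch: normalize over the last axis only (axis=-1).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    normalized = (x - mean) / np.sqrt(var + epsilon)
    # Mul, then Add. NumPy broadcasting here is multidirectional.
    return normalized * scale + b

def is_unidirectional_broadcastable(src_shape, target_shape):
    # True only if src_shape broadcasts to target_shape without changing it.
    try:
        return np.broadcast_shapes(src_shape, target_shape) == tuple(target_shape)
    except ValueError:
        return False

x = np.random.rand(2, 3, 4)

# Scale and B broadcast to X: the output keeps X's shape.
y_ok = layer_norm(x, np.ones(4), np.zeros(4))
print(y_ok.shape == x.shape)

# Multidirectional broadcasting also accepts a Scale with more dimensions
# than X, and the output shape then no longer matches X.
y_big = layer_norm(x, np.ones((5, 2, 3, 4)), np.zeros(4))
print(y_big.shape)

# A unidirectional rule would reject that Scale up front.
print(is_unidirectional_broadcastable((5, 2, 3, 4), x.shape))
```

Under a Uni-Directional rule, a check like the last one would be part of the op's shape validation, so the second call would be an error rather than a silently expanded output.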

Does the community agree with this observation?

If yes, I recommend adding this clarification to the current op specification. Let me know if you need help with this.

In general, I would suggest being more explicit about broadcasting rules wherever they apply.

Notes

This is a documentation issue.
