implement weight_norm on mps  #104513

@nwoodall

Description

🚀 The feature, motivation and pitch

For some models, weight_norm is the best normalization method. This is especially true when training for a hardware implementation that processes high-resolution images: batch_norm is not an option because the batch size is one, and group/layer normalization adds calculations and latency to the hardware implementation.
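As a hedged illustration of the op being requested, here is a minimal CPU-only sketch of what `weight_norm` computes; the layer shapes are arbitrary examples, not from the original report:

```python
import torch
import torch.nn as nn
from torch.nn.utils import weight_norm

# weight_norm reparameterizes a layer's weight as
#   w = g * v / ||v||,
# with the norm taken over every dim except `dim` (default 0).
# Unlike batch_norm, it works fine with a batch size of one.
lin = weight_norm(nn.Linear(8, 4))

# The decomposed parameters created by weight_norm:
v, g = lin.weight_v, lin.weight_g        # v: (4, 8), g: (4, 1)
w = g * v / v.norm(dim=1, keepdim=True)  # recompute w by hand
assert torch.allclose(w, lin.weight)     # matches the effective weight

# Forward pass with a single-sample "batch", as in the use case above.
out = lin(torch.randn(1, 8))
assert out.shape == (1, 4)
```

Running the same module on an `mps` device is what fails without this kernel; the decomposition itself is the computation the backend would need to support.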

Alternatives

Allowing fallback to the CPU during training, with the obvious training-time penalty
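PyTorch already exposes a global opt-in for this fallback via the `PYTORCH_ENABLE_MPS_FALLBACK` environment variable; a minimal sketch (the variable must be set before `torch` is imported):

```python
import os

# Ask PyTorch to run ops the MPS backend lacks on the CPU (with a
# warning) instead of raising NotImplementedError. Must be set
# before `import torch` to take effect.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

# Use MPS where available; on other machines this resolves to CPU.
device = "mps" if torch.backends.mps.is_available() else "cpu"
```

This is a per-process workaround, not a fix: every unsupported call pays a device-to-host round trip, which is the training-time penalty noted above.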

Additional context

No response

cc @kulinseth @albanD @malfet @DenisVieriu97 @razarmehr @abhudev

Metadata


    Labels

    module: mps (Related to Apple Metal Performance Shaders framework)
    triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
