[RFC] FP6 quantization data type #7048

@TedThemistokleous

Description

System information

Currently using ONNX 1.18 (needed to get support down to fp4).

What is the problem that this feature solves?

ONNX is missing fp6 support, which is needed to fully support the reduced-precision data types (fp8, fp6, fp4) for model inference and training.

AMD currently uses and supports fp4 and fp8 in our models, but there is no fp6 support to carry from ONNX through to the ROCm stack. MIGraphX aims to use fp6 quantization and fp6-quantized models for our own development and for our customers.

Alternatives considered

None known right now.

Describe the feature

fp6 data type support in ONNX. This is also required for external users who want to use the data type across other projects.

I'm looking to use ONNX functionality in MIGraphX and in ONNX Runtime, for both MIGraphX itself and our execution providers, but ONNX support for fp6 appears to be missing entirely. We cannot parse fp6 if it doesn't exist in the ONNX spec. The ROCm stack will likely need support for this as well once an fp6 data type is available.

Will this influence the current api (Y/N)?

Yes - it adds an additional data type to the project.

Feature Area

Training, test, operators, model usage, backend. Not sure about converters.

I'm not sure whether converters would convert fp6 to fp8 and interleave the values. That would likely be non-trivial.
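To make the converter concern concrete, here is a minimal sketch of why 6-bit values are awkward to handle: they do not align to byte boundaries, so four codes have to share three bytes. Everything in it is an assumption for illustration, not a proposal for the ONNX spec. The function names (`pack_fp6`, `unpack_fp6`, `decode_fp6`) are hypothetical, the little-endian packing order is arbitrary, and the value decoding assumes the two FP6 variants from the OCP Microscaling (MX) formats, E3M2 and E2M3, which have no inf or NaN encodings. This issue does not say which fp6 variant is intended.

```python
def pack_fp6(codes):
    """Pack 6-bit codes (ints in 0..63) into bytes, four codes per three bytes.

    Assumed layout: the first code occupies the lowest 6 bits of a 24-bit
    little-endian word. This layout is an illustration, not a standard.
    """
    if len(codes) % 4 != 0:
        raise ValueError("this sketch expects a multiple of four codes")
    out = bytearray()
    for i in range(0, len(codes), 4):
        word = 0
        for j, code in enumerate(codes[i:i + 4]):
            if not 0 <= code < 64:
                raise ValueError("each code must fit in 6 bits")
            word |= code << (6 * j)
        out += word.to_bytes(3, "little")
    return bytes(out)


def unpack_fp6(data):
    """Inverse of pack_fp6: recover the 6-bit codes from packed bytes."""
    if len(data) % 3 != 0:
        raise ValueError("packed data must be a multiple of three bytes")
    codes = []
    for i in range(0, len(data), 3):
        word = int.from_bytes(data[i:i + 3], "little")
        codes.extend((word >> shift) & 0x3F for shift in (0, 6, 12, 18))
    return codes


def decode_fp6(code, exp_bits=3, man_bits=2):
    """Decode one 6-bit code to a Python float.

    Assumes 1 sign bit, then exp_bits exponent bits, then man_bits mantissa
    bits, with bias 2**(exp_bits - 1) - 1 and no inf/NaN encodings
    (E3M2 by default; pass exp_bits=2, man_bits=3 for E2M3).
    """
    bias = (1 << (exp_bits - 1)) - 1
    sign = -1.0 if (code >> 5) & 1 else 1.0
    exp = (code >> man_bits) & ((1 << exp_bits) - 1)
    man = code & ((1 << man_bits) - 1)
    frac = man / (1 << man_bits)
    if exp == 0:
        # Subnormal: no implicit leading one.
        return sign * frac * 2.0 ** (1 - bias)
    return sign * (1.0 + frac) * 2.0 ** (exp - bias)
```

Under these assumptions, a converter that widens fp6 to fp8 would first have to unpack the codes from the shared bytes and then re-encode each value, which is where the extra complexity comes from.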

Are you willing to contribute it (Y/N)

Yes

Notes

I'm a developer at AMD and part of the MIGraphX project. I'm also the lead developer on our ONNX Runtime support efforts (MIGraphX Execution Provider and ROCm Execution Provider).

I'd be more than glad to help with a use case if I can get a quantized model, or information on how to get things set up.

My goal here is to ensure there is end-to-end support between ONNX and ROCm for this new data type, and that things work across ONNX -> MIGraphX -> ONNX Runtime.

To get ONNX Runtime working, I would need to add the protobuf implementation to enable the type, and we have MIGraphX devs who can help with the MIGraphX IR side once the ONNX fp6 types are available.
