Add FLOAT4E2M1 data type #6318
Conversation
Signed-off-by: Yuan Yao <yuanyao@nvidia.com>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@ Coverage Diff @@
##             main    #6318      +/-   ##
==========================================
+ Coverage   56.95%   57.26%   +0.30%
==========================================
  Files         506      507       +1
  Lines       30467    31354     +887
  Branches     4592     4679      +87
==========================================
+ Hits        17353    17954     +601
- Misses      12285    12550     +265
- Partials      829      850      +21

View full report in Codecov by Sentry.
@yuanyao-nv I just noticed - the test data shouldn’t change in this PR?
This was auto-generated by update_doc.sh. I've seen cases where some seemingly irrelevant pb files get updated by the script. Do you understand why?
The pb files can differ at the byte level when generated on different operating systems. I think that's what's happening here.
I see. I wonder if it's possible to let the pipeline auto-generate these test files instead and reduce the number of files changed in each PR.
### Description
- Add FLOAT4E2M1 as a new data type to the proto, along with the relevant helper functions and tests.
- This PR splits out the portion of onnx#6283 relevant to data type updates to reduce that PR's size.

### Motivation and Context
Narrow-precision data types with sub-byte bit widths are becoming solutions to the rising cost, performance, and deployment challenges of LLMs. ONNX already has INT4/UINT4. FP4 is another commonly used narrow-precision data type for compressing both the weights and activations of LLMs. For example, [OCP](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf) MXFP4 uses E2M1 as its element type. Similar to INT4/UINT4, FP4 weights/inputs are expected to be packed.

Signed-off-by: Yuan Yao <yuanyao@nvidia.com>
Signed-off-by: Andreas Fehlner <fehlner@arcor.de>
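For reference, E2M1 stores each element in 4 bits: 1 sign bit, 2 exponent bits (bias 1), and 1 mantissa bit. A minimal sketch of decoding and of the two-elements-per-byte packing (the helper names here are hypothetical, not part of the ONNX API, and the low-nibble-first ordering mirrors the INT4/UINT4 packing convention as I understand it, so verify against the spec):

```python
def fp4_e2m1_to_float(code: int) -> float:
    """Decode a 4-bit E2M1 code: 1 sign, 2 exponent (bias 1), 1 mantissa bit."""
    sign = -1.0 if code & 0x8 else 1.0
    exp = (code >> 1) & 0x3
    man = code & 0x1
    if exp == 0:
        mag = man * 0.5                               # subnormal: 0.0 or 0.5
    else:
        mag = (1.0 + 0.5 * man) * 2.0 ** (exp - 1)    # normal values
    return sign * mag

# The eight non-negative values representable in E2M1:
vals = [fp4_e2m1_to_float(c) for c in range(8)]
# vals == [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def pack_fp4(codes):
    """Pack two 4-bit codes per byte, first element in the low nibble
    (assumed to follow the INT4/UINT4 packing convention)."""
    out = bytearray()
    for i in range(0, len(codes), 2):
        lo = codes[i] & 0xF
        hi = (codes[i + 1] & 0xF) if i + 1 < len(codes) else 0
        out.append(lo | (hi << 4))
    return bytes(out)
```

Note that the dynamic range tops out at ±6.0, which is why FP4 is typically paired with a per-block scale (as in MXFP4).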
### Description
- FLOAT4E2M1 has been added to the proto in #6318.
- This PR adds FLOAT4E2M1 support for QuantizeLinear, DequantizeLinear, Cast, and CastLike (opset 23).
- It also adds support to non-compute ops: Constant, ConstantOfShape, Identity, Reshape, Shape, Size, If, Loop, Scan, Flatten, Pad, Squeeze, Unsqueeze, and Transpose (opset 23).

Similar to INT4/UINT4, FP4 weights/inputs are expected to be packed.

---------

Signed-off-by: Yuan Yao (yuanyao) <yuanyao@nvidia.com>
Signed-off-by: Yuan Yao <yuanyao@nvidia.com>
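To make the quantization semantics concrete, here is a hypothetical scalar sketch of what QuantizeLinear/DequantizeLinear amount to for this element type (the helper names are illustrative; ties are broken toward the smaller magnitude here, whereas the operator spec may use round-half-to-even, so consult the opset 23 definitions):

```python
# The eight magnitudes representable in E2M1.
E2M1_MAGS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_e2m1(x: float, scale: float) -> float:
    """Scale, saturate to the max finite value (6.0), and round to the
    nearest representable magnitude (tie-break here: smaller magnitude)."""
    v = x / scale
    sign = -1.0 if v < 0 else 1.0
    mag = min(abs(v), 6.0)
    q = min(E2M1_MAGS, key=lambda m: (abs(m - mag), m))
    return sign * q

def dequantize_e2m1(q: float, scale: float) -> float:
    """Dequantization is just multiplication by the scale."""
    return q * scale
```

For example, `quantize_e2m1(2.6, 1.0)` rounds to 3.0, and values beyond the representable range saturate rather than overflow.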
### Description
onnx/onnx#6318 and onnx/onnx#6283 added FP4 support to ONNX. This change introduces the FP4 type in ORT and adds type support to one relevant operator (`Cast`) as a proof of concept for integrating the type into ORT. More op support will be added on an as-needed basis.

This change took inspiration from the following PRs: #14731 #22228 #20362

Some notes:
1. Only the `tensor` type gets FP4 support initially. Secondary types like `seq(tensor)`, `sparse_tensor`, and `optional` do not (so as not to introduce unnecessary bloat to the framework without a solid use case).
2. Flatbuffer-related files receive no updates in this PR.

### Motivation and Context
Be able to run FP4 models with ORT.
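As a sanity check of what a `Cast` pair to and from FLOAT4E2M1 does to values, a small table-driven sketch (a nearest-value search; real kernels use bit manipulation, and the saturating/rounding behaviour assumed here should be verified against the Cast spec):

```python
# All 16 E2M1 codes decoded to floats (codes 8..15 are the negated values).
E2M1_TABLE = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,
              -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0]

def cast_f32_to_fp4(x: float) -> int:
    """Saturate to [-6, 6] and pick the nearest code (assumed behaviour)."""
    x = max(-6.0, min(6.0, x))
    return min(range(16), key=lambda c: abs(E2M1_TABLE[c] - x))

def cast_fp4_to_f32(code: int) -> float:
    """Widening cast is exact: every E2M1 value is representable in float32."""
    return E2M1_TABLE[code]

# The round trip is lossy: 2.7 rounds to the nearest representable value, 3.0.
roundtrip = cast_fp4_to_f32(cast_f32_to_fp4(2.7))
```

This illustrates why `Cast` alone is a useful proof of concept: it exercises both the narrowing (rounding/saturating) and widening directions of the type.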