# Add FLOAT4E2M1 support to relevant operators #6283

## Conversation
**Codecov Report**
Attention: Patch coverage is

```
@@            Coverage Diff             @@
##             main    #6283      +/-   ##
==========================================
+ Coverage   56.95%   57.22%   +0.26%
==========================================
  Files         506      507       +1
  Lines       30467    31398     +931
  Branches     4592     4691      +99
==========================================
+ Hits        17353    17968     +615
- Misses      12285    12577     +292
- Partials      829      853      +24
```
Do you have plans to also push jax-ml/ml_dtypes#116 forward? If this is included in ml_dtypes, it would make the interop experience much better (and code run faster).
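For context on the interop point: ml_dtypes exposes its narrow types as NumPy dtypes, which is what makes mixing them with ordinary arrays cheap. A sketch of that pattern with an existing float8 type; a float4_e2m1 dtype (the subject of jax-ml/ml_dtypes#116) would presumably slot in the same way:

```python
# Sketch of the interop pattern ml_dtypes already provides for float8 types;
# a float4_e2m1 dtype would be expected to work analogously once added.
import numpy as np
import ml_dtypes

x = np.array([0.5, 1.5, 3.0], dtype=ml_dtypes.float8_e4m3fn)
print(x.astype(np.float32))  # decodes back through the custom NumPy dtype
```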
@onnx/sig-archinfra-approvers I'm seeing the following test errors
Seems like there's some list I need to propagate the new data type to. Any pointers?
FP4 would be the more pressing priority for us. The remaining FP6 types could be worked on in the future as well, if bandwidth permits.
I usually look for strings such as
It would be very helpful to have an unpacked version of fp4e2m1 in ml_dtypes
lintrunner errors will need to be ignored inline. For example:
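The concrete example above is truncated. As a purely hypothetical illustration (the actual rule code referenced is unknown), an inline suppression for the ruff/flake8-style checks that lintrunner drives usually looks like this:

```python
# Hypothetical example only: "# noqa: F401" suppresses the
# "imported but unused" rule on this single line.
import os  # noqa: F401
```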
lgtm. Thanks!
cc @gramalingam for another look
### Description
- Add FLOAT4E2M1 as a new data type to the proto, as well as relevant helper functions and tests.
- This PR splits out the portion of #6283 relevant to data-type updates to reduce the PR's size.

### Motivation and Context
Narrow-precision data types with sub-byte bit widths are becoming solutions to the rising cost, performance, and deployment challenges of LLMs. ONNX already has INT4/UINT4. FP4 is another commonly used narrow-precision data type for compressing both the weights and activations of LLMs. For example, [OCP](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf) MXFP4 uses E2M1 as its element type. Similar to INT4/UINT4, FP4 weights/inputs are expected to be packed.

Signed-off-by: Yuan Yao <yuanyao@nvidia.com>
@yuanyao-nv the test failure can be reproduced if you create a Python 3.9 environment, build onnx from your branch, run `python -m pip install -r requirements-min.txt`, and then run `python onnx\test\test_backend_reference.py -k test_dequantizelinear_float4e2m1_cpu` (make sure your numpy version is 1.20.3). A quick fix is to insert `mantissa = mantissa.astype(np.float32)` in `onnx\numpy_helper.py` before `val = np.where(`. Please let me know if you still have issues. I am happy to meet you on Teams or Zoom. Thank you
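For context, here is a minimal sketch of the kind of E2M1 decoding the fix targets, with the suggested `astype` line applied. The actual code in `onnx/numpy_helper.py` differs; the function name here is illustrative.

```python
import numpy as np

def fp4e2m1_to_float32(x: np.ndarray) -> np.ndarray:
    """Decode the low 4 bits of each uint8 element as FLOAT4E2M1 (1-2-1 bits)."""
    x = x.astype(np.uint8) & 0x0F
    sign = np.where(x & 0x8, -1.0, 1.0).astype(np.float32)
    exponent = (x >> 1) & 0x3  # two exponent bits
    mantissa = x & 0x1         # one mantissa bit
    # The suggested fix: promote to float32 before np.where so older
    # numpy versions don't trip over mixed integer/float operands.
    mantissa = mantissa.astype(np.float32)
    val = np.where(
        exponent == 0,
        mantissa * 0.5,  # subnormal: 0 or 0.5
        (1.0 + mantissa * 0.5) * 2.0 ** (exponent.astype(np.float32) - 1.0),
    )
    return sign * val
```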
Thanks all for the help! @justinchuby @liqunfu @gramalingam
### Description
- FLOAT4E2M1 has been added to the proto in onnx#6318.
- This PR adds FLOAT4E2M1 support to QuantizeLinear, DequantizeLinear, Cast, and CastLike (opset 23).
- It also adds support to non-compute ops: Constant, ConstantOfShape, Identity, Reshape, Shape, Size, If, Loop, Scan, Flatten, Pad, Squeeze, Unsqueeze, Transpose (opset 23).

Similar to INT4/UINT4, FP4 weights/inputs are expected to be packed (see the packing sketch below).

Signed-off-by: Yuan Yao (yuanyao) <yuanyao@nvidia.com>
Signed-off-by: Yuan Yao <yuanyao@nvidia.com>
Signed-off-by: Linsho Kaku <linsho@preferred.jp>
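Because the packed layout is central to these ops, here is a minimal sketch of 4-bit packing, assuming FP4 follows the INT4/UINT4 layout of two elements per byte with the even-indexed element in the low nibble. The helper name is illustrative, not ONNX API.

```python
import numpy as np

def pack_fp4(nibbles: np.ndarray) -> np.ndarray:
    """Pack 4-bit codes (uint8 values 0..15) two per byte, low nibble first."""
    nibbles = nibbles.astype(np.uint8) & 0x0F
    if nibbles.size % 2:
        nibbles = np.append(nibbles, np.uint8(0))  # pad odd-length input
    return nibbles[0::2] | (nibbles[1::2] << 4)
```

Unpacking reverses this: for each byte `b`, `b & 0x0F` yields the even-indexed element and `b >> 4` the odd-indexed one.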
Hi, just seeing this PR after a search on fp6 in your repo. Are there now efforts to add fp6 support to the ONNX spec? I don't see any PRs or relevant specs that we can leverage right now on your technical site. If not, what would adding that support involve?
@yuanyao-nv @TedThemistokleous could you share your use case?
I don't have immediate use cases for MXFP6 at this point, but please feel free to add it if you need it. I also have a PR open for e8m0 (#7030), which should cover the scale type for all MX formats.
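For readers coming at this from the MX angle: under the OCP MX spec linked earlier, each block of 32 elements shares one E8M0 scale, an exponent-only byte with bias 127 (0xFF reserved for NaN). A rough sketch with illustrative names:

```python
import numpy as np

def dequantize_mx_block(elements_f32: np.ndarray, scale_e8m0: int) -> np.ndarray:
    """elements_f32: one block of element values already decoded to float32
    (e.g. FP4 E2M1); scale_e8m0: the raw E8M0 scale byte for the block."""
    if scale_e8m0 == 0xFF:
        return np.full_like(elements_f32, np.nan)  # NaN-encoded scale
    return elements_f32 * np.float32(2.0 ** (scale_e8m0 - 127))
```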
I'm a dev on MIGraphX and we're looking to support the fp6 MX type for our upcoming ROCm 7.1 release. I've made a feature request here: I'm also the developer responsible for the MIGraphX and ROCm execution providers in ONNX Runtime. On the MIGraphX side, we'd need to add in the type support once we have an idea of how to handle input data and are able to find fp6-quantized ONNX models to parse.