Clarify that FLOAT4E2M1 can be in int32_data #6640

justinchuby · 2025-01-13T20:16:51Z

Description

Clarify in spec proto that FLOAT4E2M1 can be in int32_data, according to test usage.
Updated text in the spec for int32_data for better readability and accuracy.

Motivation and Context

Previously the spec was incomplete according to the added tests. FLOAT4E2M1 was not yet released so the change should not require a new IR version.

Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com>

justinchuby · 2025-01-13T20:18:33Z

@yuanyao-nv Please review and verify that the changes are desired and accurate. Thanks!

codecov · 2025-01-13T20:23:24Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 57.46%. Comparing base (5ab1e42) to head (00668ee).
Report is 6 commits behind head on main.

✅ All tests successful. No failed tests found.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6640      +/-   ##
==========================================
+ Coverage   57.44%   57.46%   +0.02%     
==========================================
  Files         507      507              
  Lines       31612    31584      -28     
  Branches     3058     3046      -12     
==========================================
- Hits        18158    18151       -7     
+ Misses      12610    12607       -3     
+ Partials      844      826      -18

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com>

gramalingam · 2025-01-14T19:45:35Z

Thanks for the clarification.

The int4 packing description clarifies packing 2 int4 into 1 byte, but it doesn't clarify whether such bytes are then packed into int32 (4 each except for the last one) or each byte encoded as a single int32 or whether both are allowed?

I guess the same question applies to float8 as well, but the previous description seems to suggest that a single float8 is represented as a single int32 ...

justinchuby · 2025-01-14T20:20:39Z

I understand it as each byte encoded as a single int32. Please feel free to suggest wording. @yuanyao-nv do you have comments or further clarifications?

yuanyao-nv · 2025-01-14T21:41:46Z

Thanks for the clarification.

The int4 packing description clarifies packing 2 int4 into 1 byte, but it doesn't clarify whether such bytes are then packed into int32 (4 each except for the last one) or each byte encoded as a single int32 or whether both are allowed?

I guess the same question applies to float8 as well, but the previous description seems to suggest that a single float8 is represented as a single int32 ...

I agree it's not clear from the proto docstring whether bytes are then packed into int32. But looking at the helper function implementations (eg numpy_helper.py::_to_array()) each int32_data only holds one byte's worth of data.

gramalingam · 2025-01-15T01:34:04Z

Thanks for the clarification.
The int4 packing description clarifies packing 2 int4 into 1 byte, but it doesn't clarify whether such bytes are then packed into int32 (4 each except for the last one) or each byte encoded as a single int32 or whether both are allowed?
I guess the same question applies to float8 as well, but the previous description seems to suggest that a single float8 is represented as a single int32 ...

I agree it's not clear from the proto docstring whether bytes are then packed into int32. But looking at the helper function implementations (eg numpy_helper.py::_to_array()) each int32_data only holds one byte's worth of data.

Wondering about its efficiency. Looking at this description of protobuf encoding, I assume that this ends up getting represented as a LEN encoding of varints, is that right? This means that a byte-value <= 127 is encoded using 1 byte, but a value >= 128 will use 2 bytes. Assuming both are equally likely, we could end up using 1.5 bytes for each byte (2 INT4s) on average. But if we pack 4 bytes (each with two INT4s) into an INT32, then that int32 value will likely need 5 bytes (most of the time) or 4 bytes in the varint form.

Ok, so there is a difference (may be 5 bytes vs 6 bytes to represent 8 INT4s). Not huge, but 15% difference could be significant for GB-size tensors. So, maybe we should allow both?

(What would likely be even better is packing 7 INT4s into an int32, which can always be represented using 4 bytes ... but that would impact the packing/unpacking logic more significantly than the above.)

But, looking into it more, I guess raw_data would be the best anyway for representing the packed INT4? If so, maybe we don't have to worry about this, but remind users that raw_data preferable for this case

justinchuby · 2025-01-15T19:44:16Z

From the protobuf docs, https://protobuf.dev/programming-guides/encoding/#packed yes it seems reasonable to say that raw_data is most reasonable, because that gives us 4bytes/8 int4s right? So that should be the most efficient?

yuanyao-nv · 2025-01-15T20:10:13Z

I agree with Rama's assessment as well, that raw should be more efficient since no extra byte is needed.
And it seems like the only way ONNX exposes control over the use of int32_data vs raw_data is by setting the raw=True flag in make_tensor()?

justinchuby · 2025-01-15T20:29:24Z

I agree with Rama's assessment as well, that raw should be more efficient since no extra byte is needed. And it seems like the only way ONNX exposes control over the use of int32_data vs raw_data is by setting the raw=True flag in make_tensor()?

That's reasonable. There's only one toggle in the helper apis. I think simplicity helps here.

justinchuby · 2025-01-16T15:29:20Z

Please feel free to suggest wording. I tried to rephrase but wasn't happy with what I got. I was trying to convey that the all values should be reinterpreted as int32 without further packing.

onnx/onnx-ml.proto

justinchuby · 2025-01-22T19:59:18Z

I used chat gpt's help in rephrasing it

Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com>

justinchuby · 2025-01-22T20:01:51Z

@yuanyao-nv take a final look? Thanks!

yuanyao-nv · 2025-01-22T21:20:43Z

LGTM. Thanks!

### Description Clarify in spec proto that FLOAT4E2M1 can be in int32_data, according to test usage. Updated text in the spec for int32_data for better readability and accuracy. ### Motivation and Context Previously the spec was incomplete according to the added tests. FLOAT4E2M1 was not yet released so the change should not require a new IR version. --------- Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com> Signed-off-by: titaiwangms <titaiwang@microsoft.com>

### Description Clarify in spec proto that FLOAT4E2M1 can be in int32_data, according to test usage. Updated text in the spec for int32_data for better readability and accuracy. ### Motivation and Context Previously the spec was incomplete according to the added tests. FLOAT4E2M1 was not yet released so the change should not require a new IR version. --------- Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com> Signed-off-by: Andreas Fehlner <fehlner@arcor.de>

### Description Clarify in spec proto that FLOAT4E2M1 can be in int32_data, according to test usage. Updated text in the spec for int32_data for better readability and accuracy. ### Motivation and Context Previously the spec was incomplete according to the added tests. FLOAT4E2M1 was not yet released so the change should not require a new IR version. --------- Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com> Signed-off-by: seungwoo-ji <seungwoo.ji@nuvilab.com>

justinchuby requested a review from a team as a code owner January 13, 2025 20:16

Clarify that FLOAT4E2M1 can be in int32_data

bf8266c

Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com>

justinchuby force-pushed the justinchu/int32-data branch from 2275006 to bf8266c Compare January 13, 2025 20:17

justinchuby requested review from gramalingam and yuanyao-nv January 13, 2025 20:18

justinchuby added this to the 1.18 milestone Jan 13, 2025

justinchuby added the topic: spec clarification Clarification of the ONNX spec needed label Jan 13, 2025

justinchuby temporarily deployed to testpypi_onnxweekly January 13, 2025 20:18 — with GitHub Actions Inactive

justinchuby temporarily deployed to pypi-weekly January 13, 2025 20:18 — with GitHub Actions Inactive

justinchuby added the auto update doc Generate md/proto files automatically using the CI pipeline label Jan 13, 2025

justinchuby temporarily deployed to pypi-weekly January 13, 2025 20:25 — with GitHub Actions Inactive

justinchuby temporarily deployed to testpypi_onnxweekly January 13, 2025 20:25 — with GitHub Actions Inactive

justinchuby closed this Jan 13, 2025

justinchuby reopened this Jan 13, 2025

yuanyao-nv approved these changes Jan 13, 2025

View reviewed changes

justinchuby removed the auto update doc Generate md/proto files automatically using the CI pipeline label Jan 13, 2025

update

0333370

Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com>

justinchuby requested a review from xadupre January 14, 2025 01:04

yuanyao-nv reviewed Jan 17, 2025

View reviewed changes

onnx/onnx-ml.proto Outdated Show resolved Hide resolved

gramalingam approved these changes Jan 22, 2025

View reviewed changes

int32 encoding wording

00668ee

Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com>

justinchuby force-pushed the justinchu/int32-data branch from 587231e to 00668ee Compare January 22, 2025 20:00

justinchuby requested a review from yuanyao-nv January 22, 2025 20:43

yuanyao-nv approved these changes Jan 22, 2025

View reviewed changes

justinchuby enabled auto-merge January 22, 2025 21:23

justinchuby removed the request for review from xadupre January 22, 2025 21:23

justinchuby added this pull request to the merge queue Jan 22, 2025

justinchuby added the release notes Important changes to call out in release notes label Jan 22, 2025

justinchuby temporarily deployed to testpypi_onnxweekly January 22, 2025 21:28 — with GitHub Actions Inactive

justinchuby temporarily deployed to pypi-weekly January 22, 2025 21:28 — with GitHub Actions Inactive

Merged via the queue into main with commit e292b4a Jan 22, 2025
43 checks passed

justinchuby deleted the justinchu/int32-data branch January 22, 2025 21:44

botantony mentioned this pull request May 14, 2025

onnx 1.18.0 Homebrew/homebrew-core#223395

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Clarify that FLOAT4E2M1 can be in int32_data #6640

Clarify that FLOAT4E2M1 can be in int32_data #6640

Uh oh!

justinchuby commented Jan 13, 2025

Uh oh!

justinchuby commented Jan 13, 2025

Uh oh!

codecov bot commented Jan 13, 2025 •

edited

Loading

Uh oh!

gramalingam commented Jan 14, 2025

Uh oh!

justinchuby commented Jan 14, 2025 •

edited

Loading

Uh oh!

yuanyao-nv commented Jan 14, 2025

Uh oh!

gramalingam commented Jan 15, 2025 •

edited

Loading

Uh oh!

justinchuby commented Jan 15, 2025

Uh oh!

yuanyao-nv commented Jan 15, 2025

Uh oh!

justinchuby commented Jan 15, 2025 •

edited

Loading

Uh oh!

justinchuby commented Jan 16, 2025

Uh oh!

Uh oh!

justinchuby commented Jan 22, 2025

Uh oh!

justinchuby commented Jan 22, 2025

Uh oh!

yuanyao-nv commented Jan 22, 2025

Uh oh!

Uh oh!

Uh oh!

Clarify that FLOAT4E2M1 can be in int32_data #6640

Clarify that FLOAT4E2M1 can be in int32_data #6640

Uh oh!

Conversation

justinchuby commented Jan 13, 2025

Description

Motivation and Context

Uh oh!

justinchuby commented Jan 13, 2025

Uh oh!

codecov bot commented Jan 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

gramalingam commented Jan 14, 2025

Uh oh!

justinchuby commented Jan 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yuanyao-nv commented Jan 14, 2025

Uh oh!

gramalingam commented Jan 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

justinchuby commented Jan 15, 2025

Uh oh!

yuanyao-nv commented Jan 15, 2025

Uh oh!

justinchuby commented Jan 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

justinchuby commented Jan 16, 2025

Uh oh!

Uh oh!

justinchuby commented Jan 22, 2025

Uh oh!

justinchuby commented Jan 22, 2025

Uh oh!

yuanyao-nv commented Jan 22, 2025

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Jan 13, 2025 •

edited

Loading

justinchuby commented Jan 14, 2025 •

edited

Loading

gramalingam commented Jan 15, 2025 •

edited

Loading

justinchuby commented Jan 15, 2025 •

edited

Loading