Skip to content

Conversation

xenova
Copy link
Contributor

@xenova xenova commented Nov 26, 2024

Description

While saving a model that was just barely under the 2GiB limit (using Optimum), I noticed that it was saved with the external data format, even though it should be able to be saved in a single file.

After further investigation, it was related to these lines, where the value of onnx.MAXIMUM_PROTOBUF is used to determine whether to use the external data format. This value is hardcoded to 2GB (=2*10^9 bytes), even though it should be 2GiB (=2*1024^3), according to the protobuf documentation:
image.

Unless there is a specific onnx specification which limits this to 2GB (and not 2GiB), this PR may potentially fix a problem which has existed for a long time.

Motivation and Context

Allows certain models to be saved as a single .onnx file without the external data format, which were previously saved separately due to this misconfiguration.

While saving a model that was just barely under the 2GiB limit (using Optimum), I noticed that it was saved with the external data format, even though it should be able to be .

After further investigation, it was related to https://github.com/huggingface/optimum/blob/65a8a94adaf136dd677d28cfc837c0acfe993031/optimum/onnx/graph_transformations.py#L135-L138, where the value of `onnx.MAXIMUM_PROTOBUF` is used to determine whether to use the external data format. This value is hardcoded to 2GB (=2*10^9 bytes), even though it should be 2GiB (=2*1024^3), according to the https://protobuf.dev/programming-guides/proto-limits/#:~:text=form%20must%20be-,%3C2GiB%2C,-as%20that%20is:


Signed-off-by: Joshua Lochner <admin@xenova.com>
@xenova xenova requested a review from a team as a code owner November 26, 2024 00:33
Copy link
Member

@justinchuby justinchuby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is reasonable. Thanks

Copy link

codecov bot commented Nov 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 57.39%. Comparing base (b90b98e) to head (607cba1).
Report is 240 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6556   +/-   ##
=======================================
  Coverage   57.39%   57.39%           
=======================================
  Files         507      507           
  Lines       31528    31528           
  Branches     3549     3549           
=======================================
  Hits        18096    18096           
  Misses      12583    12583           
  Partials      849      849           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@justinchuby justinchuby added this pull request to the merge queue Nov 26, 2024
Merged via the queue into onnx:main with commit 4e29344 Nov 26, 2024
38 checks passed
@ramkrishna2910 ramkrishna2910 added the module: checker onnx.checker label May 12, 2025
@ramkrishna2910 ramkrishna2910 temporarily deployed to testpypi_onnxweekly May 12, 2025 22:52 — with GitHub Actions Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: checker onnx.checker
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants