Create in-memory large models without serializing large initializers through protobuf #5685

Merged: 38 commits, Nov 9, 2023

Conversation

@xadupre (Contributor) commented Oct 18, 2023

Description

This PR proposes a way to build a large model (> 2 GB) in memory.

Motivation and Context

The API for external data requires creating the whole model and then calling a function to move the big initializers to external files. This PR adds the class LargeModelContainer, which holds a ModelProto with no big initializers; the big initializers live in a separate dictionary outside any protobuf structure, so the model as a whole can exceed 2 GB.

As an example, the creation of a large model:

```python
import numpy as np
from onnx import TensorProto
from onnx.helper import (
    make_graph,
    make_model,
    make_node,
    make_tensor_value_info,
)
from onnx.numpy_helper import from_array

# make_large_model and make_large_tensor_proto are introduced by this PR;
# depending on the onnx version they may be exposed from onnx.helper or
# onnx.model_container.
from onnx.helper import make_large_model, make_large_tensor_proto

X = make_tensor_value_info("X", TensorProto.FLOAT, [None, None])
Y = make_tensor_value_info("Y", TensorProto.FLOAT, [None])
graph = make_graph(
    [
        make_node("MatMul", ["X", "A"], ["XA"]),
        make_node("MatMul", ["XA", "B"], ["XB"]),
        make_node("MatMul", ["XB", "C"], ["Y"]),
    ],
    "mm",
    [X],
    [Y],
    [
        # first large tensor, only the type and shape are used;
        # the location must start with '#' and be unique.
        make_large_tensor_proto("#loc0", "A", TensorProto.FLOAT, (3, 3)),
        from_array(np.arange(9).astype(np.float32).reshape((-1, 3)), name="B"),
        # second large tensor, only the type and shape are used
        make_large_tensor_proto("#loc1", "C", TensorProto.FLOAT, (3, 3)),
    ],
)
onnx_model = make_model(graph)

# The second parameter is a dictionary mapping the locations (the unique names
# declared above) to numpy arrays; it could easily be extended to support
# torch tensors.
large_model = make_large_model(
    onnx_model.graph,
    {
        "#loc0": (np.arange(9) * 100).astype(np.float32).reshape((-1, 3)),
        "#loc1": (np.arange(9) + 10).astype(np.float32).reshape((-1, 3)),
    },
)
large_model.check_model()
```
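The idea behind the container can be sketched in a few lines. The class below is illustrative only (the merged class is named `LargeModelContainer`, but its fields and methods here are assumptions, not the actual API): a ModelProto keeps only small initializers plus '#'-prefixed placeholders, while the large weights live in a plain Python dict, outside protobuf's 2 GB limit.

```python
import numpy as np


class InMemoryLargeModel:
    """Illustrative stand-in for the PR's LargeModelContainer (not the real API)."""

    def __init__(self, model_proto, large_initializers):
        # ModelProto holding only small initializers and '#'-prefixed placeholders.
        self.model_proto = model_proto
        # Mapping: '#'-prefixed location -> numpy array (no protobuf size limit).
        self.large_initializers = large_initializers

    def check_model(self):
        # Every declared location must follow the '#' naming convention.
        for loc in self.large_initializers:
            if not loc.startswith("#"):
                raise ValueError(f"location {loc!r} must start with '#'")


weights = {"#loc0": np.zeros((3, 3), dtype=np.float32)}
container = InMemoryLargeModel(model_proto=None, large_initializers=weights)
container.check_model()  # passes: the location respects the convention
```

Because the arrays never pass through protobuf, only the (small) graph structure is bound by the 2 GB serialization limit.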

Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
@codecov bot commented Oct 18, 2023

Codecov Report

Attention: 39 lines in your changes are missing coverage. Please review.

| File | Coverage | Δ |
| --- | --- | --- |
| onnx/helper.py | 64.32% <ø> | (ø) |
| onnx/test/model_container_test.py | 87.50% <87.50%> | (ø) |
| onnx/model_container.py | 81.01% <81.01%> | (ø) |
@github-advanced-security bot left a comment:
lintrunner found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

@xadupre xadupre changed the title [WIP] Proposal to store large models into a single file Create in-memory large models without serializing large initializers through protobuf Oct 25, 2023
@gramalingam (Contributor) commented:

> Out of curiosity, does the ONNX spec allow a ModelProto/GraphProto to reference a non-proto file as external weight data? Would endianness cause any compatibility issue if the model is, say, serialized on a big-endian system and loaded on a little-endian platform?

Yes and no. The ONNX spec fixes the raw data format for external data (little-endian), so there is no endianness ambiguity in the serialized form.
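The point can be illustrated with plain numpy (this is a sketch of the fixed-endianness convention, not the onnx serializer itself): writing with an explicit little-endian byte order makes the bytes host-independent, and any reader that uses the same explicit dtype recovers the values.

```python
import numpy as np

arr = np.arange(4, dtype=np.float32)

# '<f4' forces little-endian float32, the layout ONNX fixes for raw tensor data.
little_endian = arr.astype("<f4").tobytes()
big_endian = arr.astype(">f4").tobytes()
assert little_endian != big_endian  # byte order matters for multi-byte types

# Any consumer reading with '<f4' recovers the values, regardless of host endianness.
roundtrip = np.frombuffer(little_endian, dtype="<f4")
assert (roundtrip == arr).all()
```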

@xadupre xadupre enabled auto-merge November 9, 2023 11:30
@xadupre xadupre added this pull request to the merge queue Nov 9, 2023
Merged via the queue into onnx:main with commit 6fa70c5 Nov 9, 2023
@xadupre xadupre deleted the lonnx branch November 9, 2023 17:05
github-merge-queue bot pushed a commit that referenced this pull request Nov 16, 2023
### Description
ReferenceEvaluator can take any proto as an input. This PR extends the
support to ModelContainer introduced in PR #5685.

### Motivation and Context
This makes it easier to test.

---------

Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
@thiagocrepaldi left a comment:
Are the model initializers renamed?

```python
continue

info = ext_data.ExternalDataInfo(tensor)
file_location = ext_data._sanitize_path(info.location)
```

Is sanitization needed? I assumed the location was sanitized before saving, so loading should be fine. Or does the sanitization remove the leading '#' from the location for filesystem loading purposes?
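Based on the convention described in this PR ('#'-prefixed locations denote in-memory tensors, anything else a relative file path), a loader could distinguish the two cases before doing any path sanitization. The helper below is hypothetical, not code from the PR:

```python
def is_in_memory_location(location: str) -> bool:
    # '#'-prefixed locations name in-memory large tensors (this PR's convention);
    # anything else is treated as a relative file path to on-disk external data.
    return location.startswith("#")


assert is_in_memory_location("#loc0")
assert not is_in_memory_location("weights_part0.bin")
```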

linshokaku pushed a commit to linshokaku/onnx that referenced this pull request Oct 2, 2024
…through protobuf (onnx#5685)

(commit message duplicates the PR description above)
linshokaku pushed a commit to linshokaku/onnx that referenced this pull request Oct 2, 2024
(commit message duplicates the ReferenceEvaluator follow-up above)