
Fix shape inference for DequantizeLinear #5709


Merged
merged 2 commits into onnx:main from dqshape
Oct 30, 2023

Conversation

xadupre (Contributor) commented Oct 27, 2023

Description

Fix shape inference for the float16 and bfloat16 types for the DequantizeLinear operator.

Motivation and Context

Shape inference is wrong for DequantizeLinear-19, as reported in issue #5704.
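
A minimal sketch (not part of the original PR) of the behavior being fixed: with a float16 `scale` input, shape inference for a DequantizeLinear-19 node should infer a float16 output type.

```python
import onnx
import onnx.helper as oh
from onnx import TensorProto

# Minimal DequantizeLinear-19 model with a float16 scale; the output
# element type is left undefined so shape inference has to fill it in.
model = oh.make_model(
    oh.make_graph(
        [oh.make_node("DequantizeLinear", ["x", "scale"], ["y"])],
        "dq",
        [
            oh.make_tensor_value_info("x", TensorProto.INT8, [4]),
            oh.make_tensor_value_info("scale", TensorProto.FLOAT16, []),
        ],
        [oh.make_tensor_value_info("y", TensorProto.UNDEFINED, [4])],
    ),
    opset_imports=[oh.make_opsetid("", 19)],
)
inferred = onnx.shape_inference.infer_shapes(model)
# With the fix, the inferred output type follows the scale type
# (TensorProto.FLOAT16 == 10) instead of defaulting to float32.
print(inferred.graph.output[0].type.tensor_type.elem_type)
```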

Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
codecov bot commented Oct 27, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

| Files | Coverage Δ |
| --- | --- |
| onnx/test/shape_inference_test.py | 99.49% <100.00%> (+<0.01%) ⬆️ |


Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
@xadupre xadupre marked this pull request as ready for review October 27, 2023 14:25
@xadupre xadupre requested review from a team as code owners October 27, 2023 14:25
@xadupre xadupre added this pull request to the merge queue Oct 30, 2023
Merged via the queue into onnx:main with commit b2ee94b Oct 30, 2023
@xadupre xadupre deleted the dqshape branch October 30, 2023 17:47
isdanni pushed a commit to isdanni/onnx that referenced this pull request Nov 2, 2023
xadupre added a commit to microsoft/onnxruntime that referenced this pull request Jan 12, 2024
…hts (#18043)

### Description

Whenever a node is QuantizeLinear or DequantizeLinear, the type of the
weights before quantization must be known to create the scale with the
expected type. Another option would be to add many CastLike operators, but
that would push the burden onto the onnxruntime optimizer.

The PR tries to avoid changing the signature. To do so, it modifies the
scale computation to store the result in a numpy array rather than a
Python float. The numpy array must have the same type as the weights to
quantize.

The PR adds many `assert` statements to check that the scale is neither a
Python float nor a float64. This was added to make sure all the code
follows the same logic. These lines were kept for the first review.
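
A hypothetical sketch of the idea (the function name and rounding scheme are illustrative, not the PR's actual code): the scale is stored in a numpy array with the weights' dtype rather than in a Python float, with an assert of the kind described above guarding the type.

```python
import numpy as np

def compute_symmetric_scale(weights: np.ndarray, qmax: int = 127) -> np.ndarray:
    # Store the scale in a numpy array of the weights' dtype instead of a
    # Python float, so float16 weights produce a float16 scale.
    scale = np.array(np.abs(weights).max() / qmax, dtype=weights.dtype)
    # Guard of the kind the PR adds: the scale must not silently become a
    # Python float or a float64.
    assert isinstance(scale, np.ndarray) and scale.dtype != np.float64
    return scale

w = np.random.randn(8).astype(np.float16)
print(compute_symmetric_scale(w).dtype)  # float16
```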

DequantizeLinear and QuantizeLinear cannot be tested with onnx==1.15. PR
onnx/onnx#5709 is needed to fix shape inference, and PR onnx/onnx#5473 is
needed to support QLinearMatMul with float 16. That explains why some tests
are disabled with float 16.

### Motivation and Context

The current quantization tool assumes every weight is float 32. For large
models such as LLAMA, weights are usually float 16, and the quantization
tool needs to be able to quantize them.
mszhanyi pushed a commit to microsoft/onnxruntime that referenced this pull request Jan 15, 2024
…hts (#18043)

rohan11235813 pushed a commit to quadric-io/onnxruntime that referenced this pull request Aug 19, 2025
…hts (#18043)
