Add attribute output_dtype to QuantizeLinear #5956
Conversation
@gramalingam following up on our discussion in the Operators SIG meeting yesterday, here are the changes for #5943.
Codecov Report

```
@@           Coverage Diff           @@
##             main    #5956   +/-   ##
=======================================
  Coverage   56.79%   56.79%
=======================================
  Files         506      506
  Lines       30308    30349     +41
  Branches     4580     4589      +9
=======================================
+ Hits        17214    17238     +24
- Misses      12267    12283     +16
- Partials      827      828      +1
```
```cpp
ONNX_ASSERTM(
    false,
    "Attribute output_dtype is not supported for Opset Version %d, supply a zero-point tensor instead",
    target_version().version());
```
Check notice
Code scanning / CodeQL
Too many arguments to formatting function
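The version-converter error above reflects the equivalence the new attribute relies on: for symmetric quantization, `output_dtype` and an explicit all-zeros zero-point tensor describe the same computation, so older opsets need the zero-point spelled out. A minimal NumPy sketch of that equivalence (the `quantize_linear` helper here is illustrative, not the ONNX reference implementation):

```python
import numpy as np

def quantize_linear(x, scale, zero_point=None, output_dtype=np.int8):
    # Symmetric case: no zero-point input; the target type comes from
    # output_dtype and the zero point is implicitly all zeros.
    info = np.iinfo(output_dtype)
    zp = np.zeros_like(scale, dtype=output_dtype) if zero_point is None else zero_point
    y = np.round(x / scale) + zp.astype(np.int64)
    return np.clip(y, info.min, info.max).astype(output_dtype)

x = np.array([1.0, -2.5, 3.0], dtype=np.float32)
scale = np.float32(0.5)

# output_dtype form and explicit all-zeros zero-point form agree.
a = quantize_linear(x, scale, output_dtype=np.int8)
b = quantize_linear(x, scale, zero_point=np.zeros((), dtype=np.int8))
```

For asymmetric quantization the two forms are not interchangeable, which is why the converter asks for an explicit zero-point tensor when targeting opsets that lack the attribute.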
LGTM, thanks for the quick PR, greatly appreciate it!
The purpose of this change is to allow setting the quantized type without providing the zero-point tensor. This reduces model size, most importantly for block quantization where the zero-point tensor dimensions are large. It also simplifies the creation of symmetric quantization nodes. Signed-off-by: Gal Hubara Agam <ghubaraagam@nvidia.com>
Force-pushed from 1e26f28 to 05b222a
The purpose of this change is to allow setting the quantized type without providing the zero-point tensor for symmetric quantization.
This reduces model size, most importantly for block quantization where the zero-point tensor dimensions are large, and reduces backend runtime.
This implements issue #5943.