Fix pooling pads issues #6650

titaiwangms · 2025-01-22T23:15:30Z

Description

Pooling in reference currently has two bugs:

(1) it has "pads required" to make sure sliding window does not go out-of-bound, but it does not exclude pads required from pooling caculations. This causes microsoft/onnxruntime#16203

(2) In ceil_mode, the reference pooling pads the input image, but does not check if sliding window starts on pads. When we reach the end of the image, the window should stop working. This causes pytorch/pytorch#131272. Not directly, but related fix on MaxPooling: #5741. Detail: pytorch/pytorch#116420 (comment)

Motivation and Context

This PR fixes the two bugs and update their tests accordingly. I also drafted a PR to onnxruntime to have this update in CPU provider: microsoft/onnxruntime#16752

justinchuby · 2025-01-23T17:19:28Z

Is there any wording to need to update in the spec? Just checking

titaiwangms · 2025-01-23T17:23:11Z

Is there any wording to need to update in the spec? Just checking

There was a spec ask: #1922

codecov · 2025-01-23T17:32:48Z

Codecov Report

Attention: Patch coverage is 14.70588% with 29 lines in your changes missing coverage. Please review.

Project coverage is 57.49%. Comparing base (9683661) to head (4a3a166).
Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
onnx/backend/test/case/node/averagepool.py	0.00%	13 Missing ⚠️
onnx/reference/ops/op_pool_common.py	41.66%	5 Missing and 2 partials ⚠️
onnx/backend/test/case/node/lppool.py	0.00%	4 Missing ⚠️
onnx/backend/test/case/node/maxpool.py	0.00%	4 Missing ⚠️
onnx/test/test_backend_onnxruntime.py	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6650      +/-   ##
==========================================
- Coverage   57.50%   57.49%   -0.01%     
==========================================
  Files         507      507              
  Lines       31624    31632       +8     
  Branches     3046     3048       +2     
==========================================
+ Hits        18185    18187       +2     
- Misses      12613    12618       +5     
- Partials      826      827       +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

titaiwangms · 2025-01-23T23:00:33Z

@liqunfu @gramalingam @xadupre @justinchuby PTAL

I am not sure if I will have to update the spec or something?

titaiwangms · 2025-01-24T00:21:30Z

My newly added test is failing with onnx/test/test_backend_onnxruntime.py, which is expected because ORT also needs an update: microsoft/onnxruntime#16752. So should I disable the test for the time being?

justinchuby · 2025-01-24T00:25:33Z

Yes. Please feel free to disable

Signed-off-by: titaiwangms <titaiwang@microsoft.com>

Deprecate all type casting functions because they are mere utilities or reference implementations that should not be directly exposed to users. Users should leverage libraries like `ml_dtypes` for type conversion. - Additionally turned convert_endian and split_complex_to_pairs into private. This is seen as low risk. - Added typing_extensions as a dependency. This is a common dependency used by python packages as a polyfill for the typing module. --------- Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com> Signed-off-by: titaiwangms <titaiwang@microsoft.com>

### Description Clarify in spec proto that FLOAT4E2M1 can be in int32_data, according to test usage. Updated text in the spec for int32_data for better readability and accuracy. ### Motivation and Context Previously the spec was incomplete according to the added tests. FLOAT4E2M1 was not yet released so the change should not require a new IR version. --------- Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com> Signed-off-by: titaiwangms <titaiwang@microsoft.com>

Signed-off-by: titaiwangms <titaiwang@microsoft.com>

### Description Unify names of github environments I already updated the github environments. Signed-off-by: Andreas Fehlner <fehlner@arcor.de> Signed-off-by: titaiwangms <titaiwang@microsoft.com>

Signed-off-by: titaiwangms <titaiwang@microsoft.com>

Fix #16203 Previous to this PR, if `ceil_mode` is on, the calculation of a value would divide the kernel size, even if remaining pixels is less than the kernel size, which causes the difference in this operator between ORT and torch. However, this fix only applies to the change in #15597, which only supports AvgPool since 19. The older opset version is remain the same, as it's using mlas files. Also, the PR fixes the shape mismatch caused by sliding window starting from padding. More detail: onnx/onnx#6650 (And this PR is also validated with the tests added in onnx/onnx#6650)

justinchuby

Need another eye on the logic

### Description Pooling in reference currently has two bugs: (1) it has ["pads required"](https://github.com/onnx/onnx/blob/e292b4ae6d016c3231a801bfeb26f802ba95d82a/onnx/reference/ops/op_pool_common.py#L53) to make sure sliding window does not go out-of-bound, but it does not exclude pads required from pooling caculations. This causes microsoft/onnxruntime#16203 (2) In `ceil_mode`, the reference pooling pads the input image, but does not check if sliding window starts on pads. When we reach the end of the image, the window should stop working. This causes pytorch/pytorch#131272. Not directly, but related fix on MaxPooling: #5741. Detail: pytorch/pytorch#116420 (comment) ### Motivation and Context This PR fixes the two bugs and update their tests accordingly. I also drafted a PR to onnxruntime to have this update in CPU provider: microsoft/onnxruntime#16752 --------- Signed-off-by: titaiwangms <titaiwang@microsoft.com> Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com> Signed-off-by: Andreas Fehlner <fehlner@arcor.de> Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com> Co-authored-by: Andreas Fehlner <fehlner@arcor.de> Signed-off-by: Andreas Fehlner <fehlner@arcor.de>

Fix #16203 Previous to this PR, if `ceil_mode` is on, the calculation of a value would divide the kernel size, even if remaining pixels is less than the kernel size, which causes the difference in this operator between ORT and torch. However, this fix only applies to the change in #15597, which only supports AvgPool since 19. The older opset version is remain the same, as it's using mlas files. Also, the PR fixes the shape mismatch caused by sliding window starting from padding. More detail: onnx/onnx#6650 (And this PR is also validated with the tests added in onnx/onnx#6650)

### Description Pooling in reference currently has two bugs: (1) it has ["pads required"](https://github.com/onnx/onnx/blob/e292b4ae6d016c3231a801bfeb26f802ba95d82a/onnx/reference/ops/op_pool_common.py#L53) to make sure sliding window does not go out-of-bound, but it does not exclude pads required from pooling caculations. This causes microsoft/onnxruntime#16203 (2) In `ceil_mode`, the reference pooling pads the input image, but does not check if sliding window starts on pads. When we reach the end of the image, the window should stop working. This causes pytorch/pytorch#131272. Not directly, but related fix on MaxPooling: onnx#5741. Detail: pytorch/pytorch#116420 (comment) ### Motivation and Context This PR fixes the two bugs and update their tests accordingly. I also drafted a PR to onnxruntime to have this update in CPU provider: microsoft/onnxruntime#16752 --------- Signed-off-by: titaiwangms <titaiwang@microsoft.com> Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com> Signed-off-by: Andreas Fehlner <fehlner@arcor.de> Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com> Co-authored-by: Andreas Fehlner <fehlner@arcor.de> Signed-off-by: seungwoo-ji <seungwoo.ji@nuvilab.com>

Fix #16203 Previous to this PR, if `ceil_mode` is on, the calculation of a value would divide the kernel size, even if remaining pixels is less than the kernel size, which causes the difference in this operator between ORT and torch. However, this fix only applies to the change in #15597, which only supports AvgPool since 19. The older opset version is remain the same, as it's using mlas files. Also, the PR fixes the shape mismatch caused by sliding window starting from padding. More detail: onnx/onnx#6650 (And this PR is also validated with the tests added in onnx/onnx#6650)

titaiwangms force-pushed the titaiwang/fix_avgpool_output_shape branch from 2b69b98 to 71d6065 Compare January 22, 2025 23:56

titaiwangms marked this pull request as ready for review January 23, 2025 17:17

titaiwangms requested a review from a team as a code owner January 23, 2025 17:17

titaiwangms force-pushed the titaiwang/fix_avgpool_output_shape branch from ed32535 to 280cd3c Compare January 23, 2025 17:19

titaiwangms requested a review from a team as a code owner January 23, 2025 17:19

justinchuby added the module: reference implementation label Jan 23, 2025

titaiwangms force-pushed the titaiwang/fix_avgpool_output_shape branch from 18bcb1b to a790113 Compare January 23, 2025 20:04

titaiwangms mentioned this pull request Jan 23, 2025

Align AvgPool ceil_mode on last value to torch microsoft/onnxruntime#16752

Merged

titaiwangms force-pushed the titaiwang/fix_avgpool_output_shape branch from e47ddaa to 231f353 Compare January 23, 2025 23:03

titaiwangms and others added 12 commits January 24, 2025 01:33

draft

6b8c1bc

Signed-off-by: titaiwangms <titaiwang@microsoft.com>

op_common

d9e1f04

Signed-off-by: titaiwangms <titaiwang@microsoft.com>

add comment

53f68a8

Signed-off-by: titaiwangms <titaiwang@microsoft.com>

keep condition

47177b1

Signed-off-by: titaiwangms <titaiwang@microsoft.com>

update pool tests

42f3a65

Signed-off-by: titaiwangms <titaiwang@microsoft.com>

add a new test

0595b75

Signed-off-by: titaiwangms <titaiwang@microsoft.com>

Unify names of github environments (onnx#6651)

e7bb8db

### Description Unify names of github environments I already updated the github environments. Signed-off-by: Andreas Fehlner <fehlner@arcor.de> Signed-off-by: titaiwangms <titaiwang@microsoft.com>

update model to use count_include_pad

c5ff759

Signed-off-by: titaiwangms <titaiwang@microsoft.com>

update mds

0913218

Signed-off-by: titaiwangms <titaiwang@microsoft.com>

disable ort test for now

1ba218d

Signed-off-by: titaiwangms <titaiwang@microsoft.com>

titaiwangms force-pushed the titaiwang/fix_avgpool_output_shape branch from f9765fc to 1ba218d Compare January 24, 2025 01:33

Merge branch 'main' into titaiwang/fix_avgpool_output_shape

4a3a166

justinchuby approved these changes Jan 24, 2025

View reviewed changes

xadupre approved these changes Jan 24, 2025

View reviewed changes

xadupre added this pull request to the merge queue Jan 24, 2025

Merged via the queue into onnx:main with commit dcd14ba Jan 24, 2025
34 checks passed

botantony mentioned this pull request May 14, 2025

onnx 1.18.0 Homebrew/homebrew-core#223395

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix pooling pads issues #6650

Fix pooling pads issues #6650

Uh oh!

titaiwangms commented Jan 22, 2025 •

edited

Loading

Uh oh!

justinchuby commented Jan 23, 2025

Uh oh!

titaiwangms commented Jan 23, 2025

Uh oh!

codecov bot commented Jan 23, 2025 •

edited

Loading

Uh oh!

titaiwangms commented Jan 23, 2025 •

edited

Loading

Uh oh!

titaiwangms commented Jan 24, 2025

Uh oh!

justinchuby commented Jan 24, 2025

Uh oh!

justinchuby left a comment

Uh oh!

Uh oh!

Uh oh!

Fix pooling pads issues #6650

Fix pooling pads issues #6650

Uh oh!

Conversation

titaiwangms commented Jan 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Motivation and Context

Uh oh!

justinchuby commented Jan 23, 2025

Uh oh!

titaiwangms commented Jan 23, 2025

Uh oh!

codecov bot commented Jan 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

titaiwangms commented Jan 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

titaiwangms commented Jan 24, 2025

Uh oh!

justinchuby commented Jan 24, 2025

Uh oh!

justinchuby left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

titaiwangms commented Jan 22, 2025 •

edited

Loading

codecov bot commented Jan 23, 2025 •

edited

Loading

titaiwangms commented Jan 23, 2025 •

edited

Loading