-
Notifications
You must be signed in to change notification settings - Fork 72
[rocm7.1_internal_testing] fix large tensor sort on ROCm #2543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[rocm7.1_internal_testing] fix large tensor sort on ROCm #2543
Conversation
! cherry-pick --onto release/2.6 release/2.7 release/2.8 |
The cherry-pick keyword can be used only with the merged PRs Comment processed by Build |
The cherry-pick keyword can be used only with the merged PRs Comment processed by Build |
Currently std::min -> ::min did not work as expected on ROCm when input values >= 2147483648
8c7a369
to
e78c36c
Compare
Jenkins build for e78c36cc293d66b50ecfa35db17e984b31e7cba4 commit finished as FAILURE |
@@ -238,7 +239,8 @@ void launch_stable_sort_kernel( | |||
scalar_t* values_ptr = values.mutable_data_ptr<scalar_t>(); | |||
int64_t remaining = numel; | |||
while (remaining > 0) { | |||
int64_t n = std::min(remaining, nbatch); | |||
// On ROCm, std::min -> ::min did not work as expected on when input values >= 2147483648 | |||
int64_t n = remaining < nbatch ? remaining : nbatch; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In an offline discussion, we discussed why use ternary operator instead of the std::min<int64_t> templated version.
@dnikolaev-amd verified that std::min<int64_t> does not get hipified to ::min<int64_t> it remains std::min<int64_t>.
@jeffdaily said "The only argument I can think of against using std::min<int64_t> in device code is generally not recommended. I remember Andy raised an issue about it. pytorch#128253"
However, the usages being addressed in this PR are in host code IIUC. So is there a reason we should still use ternary operator instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for this branch since it's just for unit tests, but we should align on whether we want the same approach for upstream.
1455054
into
rocm7.1_internal_testing
! cherry-pick --onto release/2.7 release/2.8 |
Currently std::min -> ::min did not work as expected on ROCm when input values >= 2147483648 Replace std::min to ternary statement Also std::min can be replaced by explicit typing std::min<int64_t> fixes on ROCm: test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_large_cuda_float16 error: RuntimeError: Cannot sort dimension of length 8192 Combines upstream PRs: - pytorch#161054 to fix std::min on ROCm - pytorch#155546 fix python test - pytorch#159939 change test dtype from int8 to float16 Fixes: SWDEV-526432
Currently std::min -> ::min did not work as expected on ROCm when input values >= 2147483648 Replace std::min to ternary statement Also std::min can be replaced by explicit typing std::min<int64_t> fixes on ROCm: test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_large_cuda_float16 error: RuntimeError: Cannot sort dimension of length 8192 Combines upstream PRs: - pytorch#161054 to fix std::min on ROCm - pytorch#155546 fix python test - pytorch#159939 change test dtype from int8 to float16 Fixes: SWDEV-526432
Created branch autogenerated/release/2.7_cherry-pick_pr-2543 and #2570 Created branch autogenerated/release/2.8_cherry-pick_pr-2543 and #2571 Comment processed by Build |
Cherry-pick of #2543 Co-authored-by: Dmitry Nikolaev <139769634+dnikolaev-amd@users.noreply.github.com>
Cherry-pick of #2543 Co-authored-by: Dmitry Nikolaev <139769634+dnikolaev-amd@users.noreply.github.com>
Currently std::min -> ::min did not work as expected on ROCm when input values >= 2147483648
Replace std::min to ternary statement
Also std::min can be replaced by explicit typing std::min<int64_t>
fixes on ROCm:
test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_large_cuda_float16
error:
RuntimeError: Cannot sort dimension of length 8192
Combines upstream PRs:
Fixes: SWDEV-526432
Cherry-picked to release/2.7 branch via #2570
Cherry-picked to release/2.8 branch via #2571