Skip to content

Conversation

dnikolaev-amd
Copy link

@dnikolaev-amd dnikolaev-amd commented Aug 20, 2025

Currently std::min -> ::min did not work as expected on ROCm when input values >= 2147483648

Replace std::min to ternary statement
Also std::min can be replaced by explicit typing std::min<int64_t>

fixes on ROCm:
test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_large_cuda_float16
error:
RuntimeError: Cannot sort dimension of length 8192

Combines upstream PRs:

Fixes: SWDEV-526432

Cherry-picked to release/2.7 branch via #2570

Cherry-picked to release/2.8 branch via #2571

@dnikolaev-amd
Copy link
Author

! cherry-pick --onto release/2.6 release/2.7 release/2.8

@dhonnappa-amd
Copy link

The cherry-pick keyword can be used only with the merged PRs

Comment processed by Build

@dhonnappa-amd
Copy link

The cherry-pick keyword can be used only with the merged PRs

Comment processed by Build

Currently std::min -> ::min did not work as expected on ROCm when input values >= 2147483648
@dnikolaev-amd dnikolaev-amd force-pushed the dnikolaev/fix_large_tensor_sort_on_rocm branch from 8c7a369 to e78c36c Compare August 20, 2025 15:26
@rocm-repo-management-api
Copy link

rocm-repo-management-api bot commented Aug 20, 2025

Jenkins build for e78c36cc293d66b50ecfa35db17e984b31e7cba4 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@@ -238,7 +239,8 @@ void launch_stable_sort_kernel(
scalar_t* values_ptr = values.mutable_data_ptr<scalar_t>();
int64_t remaining = numel;
while (remaining > 0) {
int64_t n = std::min(remaining, nbatch);
// On ROCm, std::min -> ::min did not work as expected on when input values >= 2147483648
int64_t n = remaining < nbatch ? remaining : nbatch;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In an offline discussion, we discussed why use ternary operator instead of the std::min<int64_t> templated version.

@dnikolaev-amd verified that std::min<int64_t> does not get hipified to ::min<int64_t> it remains std::min<int64_t>.

@jeffdaily said "The only argument I can think of against using std::min<int64_t> in device code is generally not recommended. I remember Andy raised an issue about it. pytorch#128253"

However, the usages being addressed in this PR are in host code IIUC. So is there a reason we should still use ternary operator instead?

Copy link
Collaborator

@jithunnair-amd jithunnair-amd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for this branch since it's just for unit tests, but we should align on whether we want the same approach for upstream.

@jithunnair-amd jithunnair-amd merged commit 1455054 into rocm7.1_internal_testing Aug 25, 2025
1 of 3 checks passed
@jithunnair-amd jithunnair-amd deleted the dnikolaev/fix_large_tensor_sort_on_rocm branch August 25, 2025 16:49
@dnikolaev-amd
Copy link
Author

! cherry-pick --onto release/2.7 release/2.8

dhonnappa-amd pushed a commit that referenced this pull request Aug 25, 2025
Currently std::min -> ::min did not work as expected on ROCm when input
values >= 2147483648

Replace std::min to ternary statement
Also std::min can be replaced by explicit typing std::min<int64_t>

fixes on ROCm:

test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_large_cuda_float16
error:
RuntimeError: Cannot sort dimension of length 8192

Combines upstream PRs:
- pytorch#161054 to fix std::min on ROCm
- pytorch#155546 fix python test
- pytorch#159939 change test dtype from
int8 to float16

Fixes: SWDEV-526432
dhonnappa-amd pushed a commit that referenced this pull request Aug 25, 2025
Currently std::min -> ::min did not work as expected on ROCm when input
values >= 2147483648

Replace std::min to ternary statement
Also std::min can be replaced by explicit typing std::min<int64_t>

fixes on ROCm:

test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_large_cuda_float16
error:
RuntimeError: Cannot sort dimension of length 8192

Combines upstream PRs:
- pytorch#161054 to fix std::min on ROCm
- pytorch#155546 fix python test
- pytorch#159939 change test dtype from
int8 to float16

Fixes: SWDEV-526432
@dhonnappa-amd
Copy link

Created branch autogenerated/release/2.7_cherry-pick_pr-2543 and #2570

Created branch autogenerated/release/2.8_cherry-pick_pr-2543 and #2571

Comment processed by Build

pruthvistony pushed a commit that referenced this pull request Sep 2, 2025
Cherry-pick of #2543

Co-authored-by: Dmitry Nikolaev <139769634+dnikolaev-amd@users.noreply.github.com>
pruthvistony pushed a commit that referenced this pull request Sep 2, 2025
Cherry-pick of #2543

Co-authored-by: Dmitry Nikolaev <139769634+dnikolaev-amd@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants