chuanqi129
Collaborator

Fixes #149422

@chuanqi129 chuanqi129 requested a review from a team as a code owner March 27, 2025 04:17

pytorch-bot bot commented Mar 27, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/150084

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

❌ 1 New Failure

As of commit 153ea95 with merge base 7243c69:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@chuanqi129 chuanqi129 added topic: not user facing topic category ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR labels Mar 27, 2025
@chuanqi129 chuanqi129 requested review from malfet and atalman March 27, 2025 14:15
@chuanqi129 chuanqi129 added the ciflow/trunk Trigger trunk jobs on your pull request label Mar 27, 2025
@chuanqi129
Collaborator Author

Hi @malfet, could you please help review this PR as well? It fixes a high-priority issue with the bundled gomp in CPU wheels. The failures don't seem to be related to the PR changes.

@malfet
Contributor

malfet commented Mar 27, 2025

Hi @chuanqi129, what's the test plan for this PR? Or how does one expect to verify that something is really fixed?

@chuanqi129
Collaborator Author

chuanqi129 commented Mar 27, 2025

> Hi @chuanqi129 what's the test plan for this PR? Or how one expects to verify that something is really fixed?

Good question. @yuchengliu1 verified it locally with the case from issue #149422, but I haven't figured out a good way to test it in the CD wheel smoke test script. @malfet, any suggestions?

@mikaylagawarecki mikaylagawarecki added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Mar 31, 2025
@atalman
Contributor

atalman commented Mar 31, 2025

Hi @chuanqi129, could you please post the log from running the test in #149422 (comment) on the binary downloaded from https://github.com/pytorch/pytorch/actions/runs/14098800119?pr=150084 ?

Contributor

@malfet malfet left a comment

From the PR description (and the lack of tests), it's unclear whether this change really fixes the reported issue.

@pytorch-bot pytorch-bot bot removed ciflow/trunk Trigger trunk jobs on your pull request ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR labels Apr 2, 2025
@LifengWang
Contributor

LifengWang commented Apr 2, 2025

Here are the updated test results with the latest changes.

Test results with the nightly wheel:

[root@icx-6 ~]# python check_gomp.py
omp_max_threads after loading libgomp.so and libtorch_cpu.so: 1
Traceback (most recent call last):
  File "/root/check_gomp.py", line 74, in <module>
    main()
    ~~~~^^
  File "/root/check_gomp.py", line 68, in main
    raise RuntimeError(
        "omp_max_threads is 1. Check whether libgomp.so is loaded twice."
    )
RuntimeError: omp_max_threads is 1. Check whether libgomp.so is loaded twice.

Test results with the fixed wheel:

[root@icx-6 ~]# python check_gomp.py
omp_max_threads after loading libgomp.so and libtorch_cpu.so: 112
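
The `check_gomp.py` script itself is not included in this thread, so the following is not that script. It is only a rough sketch of one way to look for the condition the error message describes (two copies of libgomp in one process), assuming Linux and the `/proc/<pid>/maps` format. The helper names and the example file names in the comments are hypothetical.

```python
import re

# Matches mapped files such as /usr/lib64/libgomp.so.1.0.0 or a wheel-bundled,
# hash-suffixed copy such as .../torch/lib/libgomp-a34b3233.so.1.
# These file names are illustrative assumptions, not taken from the PR.
_LIBGOMP_RE = re.compile(r"(/\S*libgomp[^/\s]*\.so[^/\s]*)\s*$")


def libgomp_copies(maps_text):
    """Return the sorted distinct libgomp paths found in /proc/<pid>/maps text."""
    paths = set()
    for line in maps_text.splitlines():
        match = _LIBGOMP_RE.search(line)
        if match:
            paths.add(match.group(1))
    return sorted(paths)


def current_process_libgomp_copies():
    """Linux only: inspect the libraries mapped into the current process."""
    with open("/proc/self/maps") as f:
        return libgomp_copies(f.read())
```

Calling `current_process_libgomp_copies()` after importing torch and seeing more than one path would suggest that both a system libgomp and a bundled copy are loaded. Unlike the `omp_max_threads` check in the logs above, it says nothing about the resulting thread count.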

@atalman atalman modified the milestones: 2.7.0, 2.7.1 Apr 2, 2025
@chuanqi129 chuanqi129 requested a review from malfet April 7, 2025 04:08
@chuanqi129 chuanqi129 added the ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR label Apr 7, 2025
@pytorch-bot pytorch-bot bot removed the ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR label Apr 8, 2025
@chuanqi129 chuanqi129 added the ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR label Apr 8, 2025
@pytorch-bot pytorch-bot bot removed the ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR label Apr 9, 2025
@chuanqi129 chuanqi129 added the ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR label May 9, 2025
@chuanqi129
Collaborator Author

@pytorchbot rebase -b viable/strict

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased cd_fix_gomp onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout cd_fix_gomp && git pull --rebase)

@pytorch-bot pytorch-bot bot removed ciflow/trunk Trigger trunk jobs on your pull request ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR labels May 13, 2025
@chuanqi129 chuanqi129 added ciflow/trunk Trigger trunk jobs on your pull request ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR labels May 13, 2025
@atalman
Contributor

atalman commented May 14, 2025

@pytorchmergebot merge -f "all required tests are passing"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@chuanqi129
Collaborator Author

@pytorchbot cherry-pick --onto release/2.7 -c critical

@pytorchbot
Collaborator

Cherry picking #150084

Command git -C /home/runner/work/pytorch/pytorch cherry-pick -x 20dbe644c702c1a8a908b3c6cb5397c60f541d72 returned non-zero exit code 1

Auto-merging .ci/manywheel/build_common.sh
Auto-merging .circleci/scripts/binary_linux_test.sh
CONFLICT (content): Merge conflict in .circleci/scripts/binary_linux_test.sh
error: could not apply 20dbe644c70... [CD] Fix the libgomp twice load issue (#150084)
hint: After resolving the conflicts, mark them with
hint: "git add/rm <pathspec>", then run
hint: "git cherry-pick --continue".
hint: You can instead skip this commit with "git cherry-pick --skip".
hint: To abort and get back to the state before "git cherry-pick",
hint: run "git cherry-pick --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Details for Dev Infra team Raised by workflow job

chuanqi129 added a commit to chuanqi129/pytorch that referenced this pull request May 14, 2025
pytorchmergebot pushed a commit to chuanqi129/pytorch that referenced this pull request May 15, 2025
atalman pushed a commit that referenced this pull request May 20, 2025
Labels
ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR ciflow/trunk Trigger trunk jobs on your pull request Merged open source topic: not user facing topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Development

Successfully merging this pull request may close these issues.

Pip-installed pytorch limits threads to 1 when setting GOMP_CPU_AFFINITY (likely due to bundled GOMP)
8 participants