@fegin fegin commented May 24, 2024

Cherry-pick #127069 for 2.3.1

Summary:
Distributed state_dict should not error out when FSDP has not been initialized yet, because calling `model.state_dict()` triggers FSDP initialization.

Pull Request resolved: #127069
Approved by: https://github.com/wz337
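
To illustrate the reasoning in the summary, here is a minimal sketch in plain Python. It does not use the real FSDP or distributed state_dict APIs: `LazyWrapper`, `strict_get_state_dict`, and `relaxed_get_state_dict` are hypothetical names invented for this example. The sketch only assumes a wrapper that initializes lazily and whose `state_dict()` triggers that initialization, so an up-front "is it initialized?" check rejects models that would have worked.

```python
class LazyWrapper:
    """Toy stand-in for a lazily initialized wrapper (hypothetical, not the FSDP API)."""

    def __init__(self, params):
        self._params = dict(params)
        self.initialized = False

    def _lazy_init(self):
        # One-time setup that runs on first use.
        if not self.initialized:
            self.initialized = True

    def state_dict(self):
        # Calling state_dict() triggers initialization as a side effect.
        self._lazy_init()
        return dict(self._params)


def strict_get_state_dict(model):
    # Hypothetical up-front check: rejects a wrapper that has not initialized yet.
    if not model.initialized:
        raise RuntimeError("wrapper is not initialized")
    return model.state_dict()


def relaxed_get_state_dict(model):
    # No up-front check: state_dict() performs the initialization itself.
    return model.state_dict()
```

With a fresh `LazyWrapper`, the strict helper raises while the relaxed helper returns the parameters and leaves the wrapper initialized, which is the behavior the summary argues for.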

cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @XilunWu @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang @d4l3k @LucasLLC

pytorch-bot bot commented May 24, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/127130

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 645a929 with merge base 86a2d67 (image):

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the `module: distributed_checkpoint`, `oncall: distributed`, and `topic: not user facing` labels May 24, 2024
atalman commented May 27, 2024

@pytorchbot rebase -b release/2.3

@pytorchmergebot

@pytorchbot started a rebase job onto refs/remotes/origin/release/2.3. Check the current status here

fegin and others added 2 commits May 27, 2024 12:38
Summary:
Distributed state_dict should not error out when FSDP has not been initialized yet, because calling `model.state_dict()` triggers FSDP initialization.

Pull Request resolved: #127069
Approved by: https://github.com/wz337
@pytorchmergebot

Successfully rebased `chienchin/cherry-pick-pr-127069` onto `refs/remotes/origin/release/2.3`. Please pull locally before adding more changes (for example, via `git checkout chienchin/cherry-pick-pr-127069 && git pull --rebase`).

@pytorchmergebot pytorchmergebot force-pushed the chienchin/cherry-pick-pr-127069 branch from a9b5371 to 645a929 Compare May 27, 2024 12:38
@atalman atalman merged commit 81b8854 into release/2.3 May 27, 2024
@atalman atalman deleted the chienchin/cherry-pick-pr-127069 branch May 27, 2024 16:41