@KShivendu (Member) commented Jul 16, 2025

Improvements to the existing PR #6800, as discussed with @timvisee. I created a separate PR so I can test the images in my CM test.

Note: These changes also highlight the larger effect of the PR, since abort_shard_transfer_and_resharding is called in multiple places. We should be careful and consider whether this can break anything.

Update: CI has two failures. The first is caused by #6859, and the second by a failure to fetch the web UI. The first needs a rebase, and the second should be resolved by re-triggering CI. Will do!

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

@KShivendu force-pushed the fix-resharding-dead-replicas-improvements branch from 349eec4 to 07d6bfb on July 16, 2025 10:08
@KShivendu changed the base branch from fix-resharding-dead-replicas to dev on July 16, 2025 10:08
@KShivendu force-pushed the fix-resharding-dead-replicas-improvements branch from 07d6bfb to a9f4869 on July 16, 2025 10:09
@KShivendu changed the base branch from dev to fix-resharding-dead-replicas on July 16, 2025 10:09
@KShivendu merged commit e53e354 into fix-resharding-dead-replicas on Jul 16, 2025
18 checks passed
@KShivendu deleted the fix-resharding-dead-replicas-improvements branch on July 16, 2025 12:21
KShivendu added a commit that referenced this pull request Jul 16, 2025
…esharding (#6800)

* Fix bug that causes all replicas to die if node is restarted during resharding

* fix recursive async problem

* Avoid write lock unless required

* avoid using &mut when & is sufficient

* Remove is_in_progress since check_abort_resharding exists (#6806)

* On resharding abort, only abort transfers related to current operation

* Fix resharding dead replicas improvements (#6881)

* Apply suggestions to fix resharding dead replicas

* fmt

---------

Co-authored-by: timvisee <tim@visee.me>
generall pushed a commit that referenced this pull request Jul 17, 2025
…esharding (#6800)
