Skip to content

Shard region registration to more potential oldest, #28416 #28470

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 22, 2020

Conversation

patriknw
Copy link
Contributor

  • Keep track of Leaving and Exiting members in ShardRegion and
    attempt to register to coordinator at several of the oldest if
    they have status Leaving and Exiting. Include all up to and
    including the first member with status Up.
  • Sending to wrong node doesn't matter, will be suppressed deadLetter.
  • Same for the GracefulShutdownReq which already had that intention by
    sending to 2 oldest.

Refs #28416

* Keep track of Leaving and Exiting members in ShardRegion and
  attempt to register to coordinator at several of the oldest if
  they have status Leaving and Exiting. Include all up to and
  including the first member with status Up.
* Sending to wrong node doesn't matter, will be suppressed deadLetter.
* Same for the GracefulShutdownReq which already had that intention by
  sending to 2 oldest.
@akka-ci akka-ci added the validating PR is currently being validated by Jenkins label Jan 10, 2020
/**
* Test for issue #28416
*/
object ClusterShardingRegistrationCoordinatedShutdownSpec extends MultiNodeConfig {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't able to reproduce the originally reported scenario, and this test doesn't fail without the changes. I'm out of ideas for how to reproduce the problem. It was support case 13482.

@akka-ci akka-ci added tested PR that was successfully built and tested by Jenkins and removed validating PR is currently being validated by Jenkins labels Jan 10, 2020
@akka-ci
Copy link

akka-ci commented Jan 10, 2020

Test PASSed.

case MemberLeft(m) =>
addMember(m)
case MemberExited(m) =>
addMember(m)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so it would go directly from Joining/WeaklyUp to Left or Exited and therefore we need to add it for those as well? Why do we need the memberStatusOfInterest if we only ever add those statuses anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for MemberLeft and MemberExited they will already be in the set, but we want to replace them since the status changed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memberStatusOfInterest is also used when receiving the CurrentClusterState. it's true that it wouldn't be necessary to check memberStatusOfInterest in addMember but I thought that would be more future proof

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for clarification.

}
}

select(Nil, membersByAge).map(m => context.actorSelection(RootActorPath(m.address).toString + coordinatorPath))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Um, so always all nodes that are up with oldest first? Shouldn't it be capped for larger clusters?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, in that case this could take a while.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

always all nodes that are up with oldest first

no, that's not what this does. it collects nodes until it detects one that is Up and the result includes that single Up.
example

(Up, Up, Up) => (Up)
(leaving, Up, Up) => (Leaving, Up)
(leaving, Exiting, Up, Up) => (Leaving, Exiting, Up)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed the early return and thought the isEmpty was the single termination. Gotcha.

Copy link
Contributor

@helena helena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@patriknw patriknw merged commit b30f874 into master Jan 22, 2020
@patriknw patriknw deleted the wip-28416-region-reg-patriknw branch January 22, 2020 07:40
zbynek001 added a commit to zbynek001/akka.net that referenced this pull request Mar 24, 2020
* Keep track of Leaving and Exiting members in ShardRegion and
  attempt to register to coordinator at several of the oldest if
  they have status Leaving and Exiting. Include all up to and
  including the first member with status Up.
* Sending to wrong node doesn't matter, will be suppressed deadLetter.
* Same for the GracefulShutdownReq which already had that intention by
  sending to 2 oldest.

migrated from akka/akka#28470
zbynek001 added a commit to zbynek001/akka.net that referenced this pull request Mar 26, 2020
* Keep track of Leaving and Exiting members in ShardRegion and
  attempt to register to coordinator at several of the oldest if
  they have status Leaving and Exiting. Include all up to and
  including the first member with status Up.
* Sending to wrong node doesn't matter, will be suppressed deadLetter.
* Same for the GracefulShutdownReq which already had that intention by
  sending to 2 oldest.

migrated from akka/akka#28470
Aaronontheweb added a commit to akkadotnet/akka.net that referenced this pull request Mar 27, 2020
* Avoid dead letter for rebalance timeout msg

migrated from akka/akka#28274

* Reconsider cluster.role.<role-name>.min-nr-of-members fallback
migrated from akka/akka#28203

* Use dedicated message for PersistentShardCoodinator termination instead of PoisonPill
migrated from akka/akka#28104

* Don't initialize durable keys when rememberEntities not used
migrated from akka/akka#28568

* Improve error message when coordinator not found
migrated from akka/akka#28576

* Shard region registration to more potential oldest

* Keep track of Leaving and Exiting members in ShardRegion and
  attempt to register to coordinator at several of the oldest if
  they have status Leaving and Exiting. Include all up to and
  including the first member with status Up.
* Sending to wrong node doesn't matter, will be suppressed deadLetter.
* Same for the GracefulShutdownReq which already had that intention by
  sending to 2 oldest.

migrated from akka/akka#28470

* Log non-completed rebalance at warning level
migrated from akka/akka#28335

* api approval fix

* review fixes

Co-authored-by: Aaron Stannard <aaron@petabridge.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tested PR that was successfully built and tested by Jenkins
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants