Skip to content

Conversation

choffmeister
Copy link
Contributor

This ensures that gracefully leaving nodes (which would terminate after
their own removed event) are already unregistered on all other pubsub
nodes, before termination.

See #20826

@akka-ci
Copy link

akka-ci commented Jun 24, 2016

Can one of the repo owners verify this patch?

nodes -= m.address
registry -= m.address
}

case MemberRemoved(m, _) if m.address == selfAddress ⇒
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the if should be after => otherwise it will be "logged" as unhandled

It should also do the same thing as before on MemberRemoved, because a member might be removed without leaving.

Another thing we should consider is if it would still be interesting to use the leaving/exiting nodes for the broadcast Publish messages, but not for the single destination Send messages.

@choffmeister
Copy link
Contributor Author

choffmeister commented Jun 24, 2016

True. Will elaborate on that.

In the meanwhile I found a nifty quickfix. One just has to add the following actor (on each pubsub member node):

final class PubSubLeftNodesCleaner extends Actor with ActorLogging {
  val cluster = Cluster(context.system)
  val mediator = DistributedPubSub(context.system).mediator

  override def preStart() = cluster.subscribe(self, ClusterEvent.initialStateAsEvents, classOf[MemberLeft])
  override def postStop() = cluster.unsubscribe(self)

  override def receive = {
    case MemberLeft(member) =>
      // fake member removed event
      mediator ! MemberRemoved(member.copy(status = Removed), member.status)
  }
}

@choffmeister
Copy link
Contributor Author

Having this possibility one could argue, that this PR is not even needed. Although I still think, cleaning nodes on MemberLeft should still be the default behaviour.

@choffmeister
Copy link
Contributor Author

I think there are two reasons to also not broadcast to leaving nodes:

  1. A node that requested to leave should be treated as if it is already gone. The leaving node could as well just kill itself, but is so kind to wait for everyone to acknowledge. The other nodes should return the favor and not disturb the leaving node further :)
  2. In consistency with cluster aware routers which also ignore leaving nodes

@drewhk
Copy link
Contributor

drewhk commented Jun 27, 2016

I agree, the whole point of graceful leave is that everyone is notified in advance and starts to re-route traffic before the node is fully gone, resulting in less lost messages.

@patriknw
Copy link
Contributor

Sounds good to me

@patriknw
Copy link
Contributor

LGTM

@ktoso
Copy link
Contributor

ktoso commented Jun 29, 2016

OK TO TEST

@ktoso
Copy link
Contributor

ktoso commented Jun 29, 2016

Just noticed tests didn't run yet

@akka-ci akka-ci added validating PR is currently being validated by Jenkins needs-attention Indicates a PR validation failure (set by CI infrastructure) and removed validating PR is currently being validated by Jenkins labels Jun 29, 2016
@akka-ci
Copy link

akka-ci commented Jun 29, 2016

Test FAILed.

This ensures that gracefully leaving nodes (which would terminate after
their own removed event) are already unregistered on all other pubsub
nodes, before termination.
@choffmeister
Copy link
Contributor Author

Ups my bad. Now test should run through

@akka-ci akka-ci added validating PR is currently being validated by Jenkins tested PR that was successfully built and tested by Jenkins and removed needs-attention Indicates a PR validation failure (set by CI infrastructure) validating PR is currently being validated by Jenkins labels Jun 30, 2016
@akka-ci
Copy link

akka-ci commented Jun 30, 2016

Test PASSed.

@patriknw patriknw merged commit f0f755b into akka:master Jul 5, 2016
@patriknw
Copy link
Contributor

patriknw commented Jul 5, 2016

Thanks for contributing, @choffmeister

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tested PR that was successfully built and tested by Jenkins
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants