Cluster.MemberRemoved does not always fire #2492

@Aaronontheweb

Looks like #2455 is a symptom of this issue.

Under some circumstances, MemberRemoved is not correctly propagated from the leader to its children.

Some context from an end-user in our Gitter chat earlier today:

When a node gets into this state it doesn't leave cleanly. It tries to. I have a monitor running in all service discovery nodes (2 of them), and they both report the cluster status that they see. When this problem happens, the cluster status is everything UP and everything Seen. The leader gets the request that the node is exiting, and it logs every second that it is moving the node to exiting, but the node never exits. After 15 seconds my Windows service will kill the service and failure detection will kick in. I manually down the node, although just starting it again causes the cluster to see the new node and remove the old one. The cluster monitors both report the node is removed. The node rejoins and gets stuck.

Not entirely clear what the exact issue is, but this should give us enough information to go looking for it.
