This is not my idea, but I'm documenting my understanding of it here for discussion so that we can come up with an action plan.
First, a primer on how Orleans finds where a grain activation lives - correct me if I'm wrong:
- Orleans supports a flexible placement model where the placement of an activation is dynamic and guided by its placement policy rather than being restricted to a fixed calculation.
- To support this model, Orleans maintains a distributed lookup table of GrainId -> ActivationId, where the ActivationId points to the silo which hosts the activation.
- This distributed table is partitioned across all silos in the cluster, where each silo holds one partition of the directory.
- Silos are placed around a conceptual ring based upon a hash of their `SiloAddress` (similar to nodes in a Chord DHT).
- In order to find where a grain activation exists in the cluster, its `GrainId` is mapped to a silo on that ring by finding the silo with the largest `hash(SiloAddress)` less than the `hash(GrainId)`. This is the Primary Directory Partition for this grain.
- When a silo is added to the cluster, the directory on each silo is notified and they each perform a hand-off for any grains which have a new primary directory partition (based on the above algorithm).
- When a silo is removed, similar rebalancing happens so that new activations can be placed on a surviving silo.
- Each silo maintains a local cache of parts of the distributed table as an optimization, similar to a DNS cache.
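To make the ring lookup above concrete, here is a minimal sketch of how I understand it. The hash function, the string forms of the silo addresses and grain ids, and the names `ring_hash` and `primary_partition` are all assumptions for illustration, not Orleans' actual implementation. I'm also assuming the lookup wraps around to the silo with the largest hash when no silo hashes below the grain.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Stable 32-bit hash; a stand-in for whatever hash Orleans really uses."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:4], "big")


def primary_partition(grain_id: str, silos: list[str]) -> str:
    """Return the silo with the largest hash(silo) less than hash(grain_id).

    Assumption: if no silo hashes below the grain, wrap around to the silo
    with the largest hash overall, since the ring is circular.
    """
    if not silos:
        raise ValueError("cluster has no silos")
    ring = sorted((ring_hash(s), s) for s in silos)
    hashes = [h for h, _ in ring]
    # bisect_left counts the silo hashes strictly less than the grain hash.
    idx = bisect.bisect_left(hashes, ring_hash(grain_id))
    # idx - 1 is the predecessor; idx == 0 gives -1, which wraps to the last silo.
    return ring[idx - 1][1]
```

One property of this scheme is that removing a silo only changes the primary partition for grains that were mapped to the removed silo; every other grain keeps its existing predecessor on the ring.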
That's how a directory partition handles cluster membership changes, but what happens to the actual activations (`Grain` instances) during a membership change? Currently, if a silo dies, every activation whose primary directory partition was on that silo is eagerly deactivated (see `Catalog.SiloStatusChangeNotification`). That is, if SiloA has an activation whose primary directory partition is on SiloB and SiloB dies, then SiloA will kill that activation. This can cause a large amount of activation churn, particularly in small clusters.
The proposal is to register those activations with the correct, surviving directory partition instead of deactivating them.
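A rough sketch of the difference, as I understand it. The function name `plan_on_silo_death` and the `old_owner` / `new_owner` callables are hypothetical; they stand in for "primary directory partition before and after the membership change" and are not Orleans APIs.

```python
from typing import Callable, Iterable, List, Tuple


def plan_on_silo_death(
    local_grains: Iterable[str],
    dead_silo: str,
    old_owner: Callable[[str], str],
    new_owner: Callable[[str], str],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Contrast today's behaviour with the proposal for one surviving silo.

    Returns a pair:
      - grains that would be eagerly deactivated today, and
      - (grain, surviving partition) pairs to re-register under the proposal.
    """
    # Only activations whose primary partition lived on the dead silo are affected.
    affected = [g for g in local_grains if old_owner(g) == dead_silo]
    deactivate_today = list(affected)
    reregister_proposed = [(g, new_owner(g)) for g in affected]
    return deactivate_today, reregister_proposed
```

The set of affected activations is the same in both cases; the only change is that each one gets a registration call against a surviving partition instead of being killed.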
We must maintain a few invariants while implementing this optimization:
- Activations must eventually converge to at most one per grain.
- Activations which are tracked using the directory cannot be allowed to exist without being registered in the directory - they cannot be orphaned.
- Activations must eventually be registered in the correct directory partition.
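One way the first two invariants could be satisfied during re-registration is a "first registration wins" rule, sketched below. This is only an assumption about how conflicts might be resolved, and `DirectoryPartition`, `register`, and `reregister_or_deactivate` are hypothetical names on a toy in-memory model, not the real directory interface.

```python
from typing import Dict


class DirectoryPartition:
    """Toy in-memory partition mapping grain id -> registered activation."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    def register(self, grain_id: str, activation: str) -> str:
        """Register the activation if the grain has no entry yet.

        Always returns the winning activation, which may be a different one
        that registered earlier.
        """
        return self._entries.setdefault(grain_id, activation)


def reregister_or_deactivate(
    partition: DirectoryPartition, grain_id: str, activation: str
) -> bool:
    """Return True if the activation stays alive, False if it must deactivate.

    An activation that loses the registration race is deactivated, so it is
    never left running without a directory entry (no orphans), and at most
    one activation per grain remains registered.
    """
    winner = partition.register(grain_id, activation)
    return winner == activation
```

For example, if two silos both hold an activation of the same grain after a membership change, the first to register survives and the second is told to deactivate; repeating the call for the winner is harmless.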
Are there nuances here which I've missed or is this too vague?