-
Notifications
You must be signed in to change notification settings - Fork 3.6k
useRole ignored on local, because routers unaware of roles locally #15822
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Pull request validation: SUCCESS 👍 |
I might be missing something but why does the ClusterRouterGroup class need to make the decision? ClusterRouterSettingsBase already contains the |
By the time we come to the Pool spinning up actors it's too-late* as in "changes behaviour slightly". Leaving it completely up to the Pool is equal to my Now: case pool: Pool ⇒
if (pool.nrOfInstances > 0) // the impl of this is not aware of cluster roles, returns maxNr!
addRoutees(Vector.fill(pool.nrOfInstances)(pool.newRoutee(routeeProps, this))) which starts So the simple fix ( In summary and when thinking about it... |
What I mean is that instead of returning 0 from nrOfInstances the calling code could just come to the same conclusion by inspecting the settings. And that code very likely does have an ActorSystem at its disposal. |
But I’m fine with your conclusion as well; I don’t think any expectations are violated by not creating local routees when they are not called for ;-) And if it is easy to create the local routees eagerly nevertheless then we might do it just to be nice, but that is not required. |
@rkuhn I think you are right that the calling code could figure it out, but the problem is that right now the calling code is from the general router code (not aware of such things). Not impossible to change though. However, we should not rush this, we should think one more time about these kind of things related to the cluster aware routers (and I have swapped focus away this right now). |
agreed |
Totally agreed, don't want to rush this one, as there seems to be confusion around cluster-routers anyway (the #13802). I'll try to dig up some time to hack on this more soon. |
Pull request validation: SUCCESS 👍 |
@ktoso I think that sounds good. Overall I think it would be good if the cluster aware pool could create the local routee early if allowLocal=true and the role is ok, because then you can start using the ActorRef immediately without loosing messages. However, I think we should check the real cluster membership status and only do this if the node is Up. Currently that is not the case, it happily creates the routee assuming that the local node will become Up soon.
|
8fe31b9
to
e870cbd
Compare
@patriknw good hint about starting only when UP, incorporated this into the method ( ktoso@e870cbd#diff-5a96a7bdef4c972bff8f8f825b6640acR162 impl of it stays pretty much the same, if we go with 2nd proposal or 3rd proposal ). I pushed another idea, which would boil down to having "SimplePool" and "ExtendedPool", with the idea that extended pool "can do whatever it needs", currently that's only during So far I think the simplest would be the 2nd proposal, that is adding ActorSystem to Would you guys mind having a look again, so we can sattle on 2 or 3 and then I'll squash and do a migration guide? |
I like the 2nd proposal. |
Pull request validation: FAILED 👎 |
/note Didn't get enough time to revisit and clean it up properly today. Will re-visit it once back I guess. |
e870cbd
to
62bd7f0
Compare
Went with the above discussed 2nd approach (as in |
25109a4
to
82e179c
Compare
PLS BUILD |
Test PASSed. |
Test FAILed. |
|
||
In order to make cluster routers smarter about when they can start local routees, | ||
``nrOfInstances`` defined on ``Pool`` now takes ``ActorSystem`` as an argument. | ||
You will in case you have implemented a custom Pool you will have to update the method's signature, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case you ....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks :)
LGTM!!!!!!! |
Test PASSed. |
This is an API breaking change if someone implemented their own Routers. The change is required because the router must know if the local routees should be started or not so it has to check the roles of the cluster member (the local one). We could delay this decision of starting local routees, but that would allow messages to be dead-letter-ed (bad).
82e179c
to
3f12ef2
Compare
Fixed typo in migration guide and... |
useRole ignored on local, because routers unaware of roles locally
FOR DISCUSSION, see discussion in related topic: #15042
In general the question is
if we want to make cluster routers take an actor system / cluster or not. There's also a broader question, as this issue is somewhat related to Cluster Aware Routers: nr-of-instances parameter usage #13802 (
nr-of-instances
being confusing), which may point to that cluster routers should become more specialised?The issue itself can be "solved" easily by returning
0
from thenrOfInstances
, but that makes it possible to deadLetter some initial messages sent to the router if normally (when usingallowLocal
), that would not be the case. This safety though is a bit accidental, and maybe that's fine for fixing this issue?If we don't want to break APIs, this could still be solved, but by setting the actor system "later" from "the inside", but's going to endup pretty ugly I think...
May eventually resolve #15042