(2.12) Stream leader should propose consumer remaps once assignment processed #7071
Conversation
Force-pushed from d13b298 to 15f4c05
Force-pushed from 5399212 to ee5277d
LGTM
Force-pushed from 735ec1d to 9034b73
LGTM - one minor nit.
@@ -29,6 +29,8 @@ import (
 	"testing"
 	"time"
+
+	"slices"
Place with all the other single-name imports.
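For illustration, the nit amounts to grouping `"slices"` with the other standard-library imports rather than leaving it in its own trailing group, roughly along these lines (a sketch of the import block, not the exact file contents):

```go
import (
	"slices"
	"testing"
	"time"
)
```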
Signed-off-by: Neil Twigg <neil@nats.io>
Force-pushed from 9034b73 to 7928ef6
The way that consumer assignments were proposed was changed in #7071 so that the stream leader is responsible for those updates instead of the metaleader. This subtly changes the timing involved in stream moves and cancels, as it is no longer a given that the consumer assignments are serialised with stream assignments in the same order. If a consumer assignment fails due to a cancelled stream move, for example, we should not propose deleting the consumer, as the new stream leader will shortly come along and update the assignment correctly for the new placement.

Signed-off-by: Neil Twigg <neil@nats.io>
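As an illustration of that guard only (the package, function, and parameter names below are hypothetical, not the actual server code):

```go
package sketch

// onConsumerAssignmentFailed reacts to a consumer assignment that failed to apply.
// If the failure came from a cancelled stream move, the new stream leader will
// shortly re-propose the assignment for the new placement, so we must not
// propose deleting the consumer here.
func onConsumerAssignmentFailed(dueToCancelledMove bool, proposeConsumerDelete func()) {
	if dueToCancelledMove {
		return // leave the consumer alone; a corrected assignment is on its way
	}
	proposeConsumerDelete()
}
```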
Revert #7071, since it could result in losing consumer state when scaling down a replicated stream in the case where the consumer state needs to be moved but the current stream leader doesn't host that consumer. `TestJetStreamClusterConsumerReplicasAfterScale` started flaking due to this bug. `TestJetStreamClusterConsumerReplicasAfterScaleMoveConsumer` has been added in this PR to test specifically for this situation (the former test would only hit it randomly, hence the flaking). We'll need to revisit this later when implementing stream-managed consumers, which isn't going to be for 2.12.

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Right now, in a stream scale-up/scale-down scenario where there are consumers that also need to be scaled, the metaleader makes a proposal to update the stream and then immediately makes proposals to update the consumers. However, this does not account for the fact that the stream proposal could fail to be applied, which is where we have sometimes seen peer set drift.
This change moves that logic to take place on the stream leader once the stream update proposal has been applied instead.
This will also be necessary in a world where the consumer assignments are managed by the stream leader, as the metaleader will not know anything about the stream NRG.
Signed-off-by: Neil Twigg <neil@nats.io>
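As a rough sketch of the before/after flow described above (all type and function names here are illustrative assumptions, not the server's actual internals):

```go
package sketch

// Illustrative types only; the real server uses its own assignment structures.
type streamAssignment struct {
	name  string
	peers []string
}

type consumerAssignment struct {
	stream, name string
	peers        []string
}

// proposer stands in for proposing an entry to the metadata Raft group.
type proposer interface {
	propose(entry any) error
}

// Before: the metaleader proposed the stream update and the consumer remaps
// back to back, so the consumer proposals could still go through even when
// the stream proposal failed to apply, leaving the peer sets drifting apart.
func metaleaderScale(p proposer, sa streamAssignment, cas []consumerAssignment) error {
	if err := p.propose(sa); err != nil {
		return err
	}
	for _, ca := range cas {
		if err := p.propose(ca); err != nil {
			return err
		}
	}
	return nil
}

// After: the stream leader proposes the consumer remaps only once it has
// actually applied the stream assignment, so the remaps always follow a
// stream update that is known to have gone through.
func streamLeaderOnAssignmentApplied(p proposer, applied streamAssignment, cas []consumerAssignment) error {
	for _, ca := range cas {
		ca.peers = applied.peers // remap each consumer onto the applied peer set
		if err := p.propose(ca); err != nil {
			return err
		}
	}
	return nil
}
```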