Improve the CiliumNode to KVStore synchronization logic of the Cilium operator #35840
Conversation
Force-pushed from 20d8250 to be62e77.
/test

CI is reasonably green, besides unrelated flakes. Dropping the test commits.
Force-pushed from be62e77 to d8bcfc9.
/test
Nice! 💯
Currently, the operator includes logic (enabled by default) to synchronize CiliumNode objects into the kvstore whenever a change is detected. Let's always skip it when support for running the kvstore in pod network is disabled, as it is not required in that case: each agent is assumed to be able to connect to the kvstore and keep it up-to-date, otherwise connectivity to that node is broken anyway. Meanwhile, the fallback synchronization logic causes unnecessary churn and load on both etcd and all watching agents due to the extra events, especially upon operator restart.

Conversely, let's preserve the logic that deletes stale node entries, as the corresponding Cilium agent cannot take care of that, and we would otherwise rely only on lease expiration (which might take a while, depending on its duration).

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
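For illustration only, a minimal Go sketch of the gating described above; the names (`Config`, `KVStorePodNetworkSupport`, `syncToKVStore`, `handleCiliumNodeUpsert`) are hypothetical and do not correspond to the actual Cilium code:

```go
package main

import "log"

// Config models the relevant operator setting; the field name is
// hypothetical and does not match the real Cilium configuration.
type Config struct {
	// KVStorePodNetworkSupport indicates whether the kvstore may run in
	// pod network, in which case the operator must keep node entries in
	// sync because agents may temporarily lose kvstore connectivity.
	KVStorePodNetworkSupport bool
}

// CiliumNode is a stand-in for the real CRD type.
type CiliumNode struct{ Name string }

// syncToKVStore is a placeholder for the actual upsert logic.
func syncToKVStore(node *CiliumNode) error {
	log.Printf("upserting node %s into the kvstore", node.Name)
	return nil
}

// handleCiliumNodeUpsert skips the fallback synchronization entirely when
// kvstore-in-pod-network support is disabled: each agent keeps its own
// entry up-to-date, so the operator only needs to handle deletions.
func handleCiliumNodeUpsert(cfg Config, node *CiliumNode) error {
	if !cfg.KVStorePodNetworkSupport {
		// Nothing to do: the agent owns its kvstore entry.
		return nil
	}
	return syncToKVStore(node)
}

func main() {
	cfg := Config{KVStorePodNetworkSupport: false}
	_ = handleCiliumNodeUpsert(cfg, &CiliumNode{Name: "node-1"})
}
```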
In preparation for the subsequent changes, let's modify the CiliumNode synchronization handlers so that they can return an error, eventually causing the operation to be retried via the workqueue. All consumers are amended to return a nil error, except for the kvstore one, which propagates the possible error returned by the kvstore.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
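As a rough sketch of the retry pattern this enables, using the standard client-go workqueue (the handler signature and names below are illustrative, not the operator's actual ones):

```go
package main

import (
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

// nodeHandler is a simplified CiliumNode handler that can now return an
// error; the signature is illustrative only.
type nodeHandler func(nodeName string) error

// processNextItem pops one key from the queue, invokes the handler, and
// re-queues the key with backoff when the handler fails.
func processNextItem(queue workqueue.RateLimitingInterface, handle nodeHandler) bool {
	key, quit := queue.Get()
	if quit {
		return false
	}
	defer queue.Done(key)

	if err := handle(key.(string)); err != nil {
		// Retry later; the rate limiter decides the backoff delay.
		queue.AddRateLimited(key)
		return true
	}
	// Success: reset the failure history for this key.
	queue.Forget(key)
	return true
}

func main() {
	queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
	queue.Add("node-1")
	processNextItem(queue, func(name string) error {
		fmt.Println("syncing", name, "into the kvstore")
		return nil
	})
}
```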
The operator already includes reconciliation logic to delete node entries from the kvstore when the corresponding CiliumNode gets deleted. However, it is currently affected by a race condition: the CiliumNode is typically deleted as a consequence of the Node object being deleted (as they are bound via an owner reference), while at that point the Cilium agent on that node may still be running. If that's the case, the agent detects the removal of the kvstore entry and reacts by recreating it, defeating the whole purpose of the operator logic and leaving the node entry to be deleted only upon lease expiration. The occurrence of this race condition is typically signaled by the "Received delete event for key which re-appeared within delay time window" log message.

Let's address this by extending the GC logic to wait until the Cilium agent running on that node has been stopped before deleting the node entry, to make sure it will not be recreated right away. The rate limiter of the associated workqueue is also customized to increase the base delay (5ms by default), considering that it is pointless to retry with high frequency in this case.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
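A hedged Go sketch of the two ideas above, namely delaying deletion until the agent is gone and increasing the base delay of the exponential rate limiter; the helper callbacks and the chosen delays are hypothetical, not the operator's actual values:

```go
package main

import (
	"errors"
	"log"
	"time"

	"k8s.io/client-go/util/workqueue"
)

// errAgentStillRunning signals that the deletion must be retried later.
var errAgentStillRunning = errors.New("cilium agent still running on node")

// newGCQueue builds a workqueue whose exponential-backoff rate limiter uses
// a larger base delay than the 5ms default, since retrying the deletion at
// high frequency is pointless while waiting for the agent to terminate.
func newGCQueue() workqueue.RateLimitingInterface {
	limiter := workqueue.NewItemExponentialFailureRateLimiter(500*time.Millisecond, 2*time.Minute)
	return workqueue.NewRateLimitingQueue(limiter)
}

// gcNodeEntry deletes the kvstore entry only once the Cilium agent on the
// node is gone; otherwise it returns an error so that the queue retries
// with backoff. Both callbacks are hypothetical helpers.
func gcNodeEntry(nodeName string, agentRunning func(string) bool, deleteEntry func(string) error) error {
	if agentRunning(nodeName) {
		// The agent would immediately recreate the entry: retry later.
		return errAgentStillRunning
	}
	return deleteEntry(nodeName)
}

func main() {
	queue := newGCQueue()
	queue.Add("node-1")
	err := gcNodeEntry("node-1",
		func(string) bool { return false }, // pretend the agent is already gone
		func(name string) error {
			log.Printf("deleting kvstore entry for %s", name)
			return nil
		},
	)
	if err != nil {
		queue.AddRateLimited("node-1")
	}
}
```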
Force-pushed from d8bcfc9 to 99cabf9.
Rebased onto main, and removed the reviewers that were requested due to the automatic switch of the base branch.
Thanks, lgtm!
/test
Extend the operator logic in charge of synchronizing the CiliumNodes into the corresponding KVStore representation to:
- skip the fallback synchronization when support for running the kvstore in pod network is disabled, as each agent keeps its own entry up-to-date in that case;
- let the synchronization handlers return an error, so that failed operations are retried via the workqueue;
- avoid the race condition in the garbage collection of stale node entries, by waiting for the Cilium agent on the given node to be stopped before deleting the entry.
Please review commit by commit, and refer to the individual commit messages for additional details.