daemon: Make cilium status independent from k8s status #32724
Conversation
@tkna I'm going to mark this for more discussion. While I see your issue, I'm not sure if disabling liveness probes is ideal from Cilium's point of view. I'm going to pull in some folks via your linked CFP for further discussion. Let's go over this in that issue.
I think this solution makes sense. We discussed this issue in the APAC community meeting and concluded that killing the whole Cilium agent on k8s disconnection is too aggressive. We might want to kill a certain subsystem, but maybe not the entire agent. For example, the real impact of k8s disconnection on the BGP Control Plane is that the user's configuration changes may not be propagated in a timely manner; BGP works without any problem otherwise. I'd assume that most of the time, users want to avoid connectivity loss rather than configuration delay. There might be subsystems for which this behavior is desirable, so opt-in might be a safer option. However, in my personal opinion, we can just completely remove k8s connectivity from the health criteria and fix such subsystems to stop individually. Let me CC @joestringer as a person who was involved in the original discussion and also as a core maintainer.
I've been thinking for a while that the cilium-agent connectivity probably shouldn't depend on k8s-apiserver availability, so I think the idea here seems reasonable. I'm not sure if there is a specific argument to keep kube-apiserver connectivity as part of the base health check.
If we are concerned that we may need to retain the previous behavior, then it makes sense to keep it optional. I'm not sure whether we need this as part of the API; if so, we should add a flag to the cilium-dbg status CLI in order to control this API parameter. Alternatively, we could just create a cilium-agent daemon flag to control this optional behaviour.
Another question I would have is: with this implementation, how can a user detect that Cilium is not connecting to the kube-apiserver? Previously it would be obvious at some point because the cilium-agent would crash. I suppose there may be metrics like cilium_event_ts for the most recent event from the control plane... but should we add another metric to make this more obvious?
I've been thinking about this PR too 😅 AFAIK the controller that updates connectivity status to the apiserver is still running here: Line 804 in 0001f4b
So I would at least expect the reported status to stay up to date. FYI, there is also a controller that is supposed to close all connections if the health check of the k8s apiserver fails: Line 290 in c329806
so if this part works, there is also that fail-safe mechanism.
There was some bug in the past with connectivity to the kube-apiserver that was only resolved by restarting the agent: #20915. CC @hemanthmalla, we talked about this a little bit before; I'm guessing you might be interested in this too and probably have some thoughts as well. I guess you might want something similar for the kvstore, so potentially we might want to make it a bit more generic.
Isn't one of the assumptions users can make about the restart behavior of the cilium-agent that connectivity will continue to work even as the cilium-agent is killed or restarted? I'm confused by the comments here because they assume there is a coupling between cilium-agent availability and Pod network connectivity. Are we referring to "connectivity loss" as connectivity that is out of sync with user-configured state in the k8s apiserver?
The issue is BGP advertisement. If the cilium agent goes offline when the kube-api is unavailable, cilium also can't advertise routes to BGP peers, so this effectively means kube-api availability now has an impact on load balancer availability, for which there is no good reason.
The other aspect I'll point out is that this PR suggests that if the kube-apiserver goes down, that will force every cilium-agent instance in the cluster to also go down. Similar to how you're asserting that the dataplane should continue without the local control plane, I think it makes sense for the local control plane to continue without the central control plane.
What's the next step on this PR? I see a few comments suggesting to change some things. @tkna are you still working on this? Do you have any questions for reviewers?
@joestringer @marseel @YutaroHayakawa @ysksuzuki @ldelossa @learnitall I made some minor corrections.
In my understanding, in @marseel's opinion
force-pushed from 9ae8b67 to 44c8957
force-pushed from e0fd664 to 52d8b96
force-pushed from 52d8b96 to 8efc0c8
/test
LGTM, thanks!
@ldelossa Is there anything else I should do about this PR?
Sorry for the delay on this. I think it's almost ready to merge.
Paul was assigned for CLI but he has moved on to other areas, so I gave feedback from a @cilium/cli perspective.
force-pushed from 8efc0c8 to ec24d5a
This PR adds a `requireK8sConnectivity` flag (default: true) to exclude the k8s probe from the result of the health probe. This flag can be set via a newly added HTTP header `require-k8s-connectivity` on the Cilium `/healthz` API and the pod's `/healthz` endpoint. Signed-off-by: tkna <naoki-take@cybozu.co.jp>
force-pushed from ec24d5a to 2947e19
/test
Thanks!
/ci-ginkgo
/ci-runtime
/ci-integration
Thanks for the contribution @tkna!
@joestringer
@tkna yes, that would be helpful. When this PR becomes part of a release, the release notes will point back to this PR. If a user is interested in the feature, they may open this PR and read the description.
@joestringer
This PR adds a `requireK8sConnectivity` flag (default: true) to exclude the k8s probe from the result of the health probe. This flag can be set by a newly added HTTP header `require-k8s-connectivity` on the Cilium `/healthz` API and the pod's `/healthz` endpoint.

Fixes: #31702
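A rough sketch of how such a header-controlled probe could behave, based on the description above. This is a hypothetical standalone handler, not the actual Cilium implementation; the wiring, port, and internal state tracking are omitted:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// healthzHandler returns a /healthz handler where k8s connectivity only
// fails the probe when the caller requires it via the
// require-k8s-connectivity header (default: true, preserving old behavior).
func healthzHandler(k8sConnected bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		requireK8s := true // default: k8s connectivity is part of health
		if v := r.Header.Get("require-k8s-connectivity"); v != "" {
			if parsed, err := strconv.ParseBool(v); err == nil {
				requireK8s = parsed
			}
		}
		if requireK8s && !k8sConnected {
			w.WriteHeader(http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// probe issues a GET /healthz with an optional header value and returns
// the resulting status code (test helper using httptest).
func probe(k8sConnected bool, header string) int {
	req := httptest.NewRequest("GET", "/healthz", nil)
	if header != "" {
		req.Header.Set("require-k8s-connectivity", header)
	}
	rec := httptest.NewRecorder()
	healthzHandler(k8sConnected)(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(probe(false, ""))      // 500: default still requires k8s
	fmt.Println(probe(false, "false")) // 200: probe ignores the k8s outage
	fmt.Println(probe(true, ""))       // 200: healthy either way
}
```

A Kubernetes liveness probe could then opt out of the k8s check through the standard `httpGet.httpHeaders` field, e.g. a header named `require-k8s-connectivity` with value `"false"`.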