Revert "fix: Adding ipv6 check on the node init function" #40143
Conversation
cc @vipul-21
/test
Curious what kind of behavior we're aiming to support in Cilium if Cilium is configured to enable IPv6 support but the underlying nodes don't have IPv6 addresses? Is the idea that we can provide pod-to-pod and pod-to-node connectivity via IPv6 but route traffic between the nodes and outside the cluster using IPv4? If Cilium is setting up a ULA in this case, then it seems like the daemon should hopefully at least handle this situation and not crash due to lack of IPv6 configuration, which is good... but I also wonder if it's meaningful for Cilium to succeed in such a configuration in the first place?
I see #34861 brings up an ordering issue where the IPv6 configuration is initially nonsensical, but if Cilium tolerates the bad configuration then cloud bootstrapping can complete (presumably using IPv4) and then assign an IPv6 address, which Cilium can maybe detect and reconcile?
I guess technically with this patch you could deploy Cilium using IPv4 with no IPv6 address assigned to the node, then configure Cilium for single-stack IPv6-only, then Cilium will assign an IPv6 ULA and then maybe Cilium can provide overlay connectivity for an IPv6-only cluster using an IPv4 underlay? I guess that's plausible, even if it feels like the user would face other problems due to this configuration 🤔
My understanding is that #34861 brought up a legitimate case for starting up without an IPv6 address even though IPv6 is enabled on the node. This was possible before the patch in 2fc52eb, which ultimately seems a bit too aggressive in failing the daemon startup when no IPv6 address is configured. I think we should probably follow up with a clear agreement about the expectations for this case, but in the meantime I still think the best course of action is to revert 2fc52eb, given that the code behind the original error has changed and, in the current context, it no longer seems to fulfill its original intent. Alternatively, what about logging a warning without returning an error?
I think that a well-worded warning or error message could communicate the risk clearly enough. Probably something along the lines of "IPv6 is enabled, but Cilium cannot find the IPv6 address for this node. This may cause connectivity disruption for endpoints that attempt to communicate using IPv6."?
Log a warning when the node IPv6 address cannot be found at startup and IPv6 support is enabled. The agent is not stopped, since an IPv6 address can still be assigned to the node later (see cilium#34861 for an example of such a scenario).
Suggested-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Force-pushed 8712049 to fe6755b
/test
This reverts commit 2fc52eb.
The commit was added with the intent of having uniform behavior when Cilium is started on a single-stack cluster with IPv6 support enabled, regardless of whether KPR is enabled or disabled. Specifically, the added check stopped the agent startup in case of a missing IPv6 address on the node, even when KPR was disabled.
However, the IPv6 detection for NodePort, responsible for the startup error in the case of a missing IPv6 address with KPR enabled, was removed in #29033. Therefore, there is no longer any reason to stop the agent at startup when IPv6 support is enabled and the node is missing an IPv6 address. Moreover, the commit above led to some unintended consequences in environments where KPR is disabled and the IPv6 address is set on the node after the Cilium agent is started (see #34861).
The same unintended consequence became more frequent after #32381, which moved the first call to SetDefaultPrefix later in the daemon startup process. That function, among other things, was in charge of setting up a temporary IPv6 Unique Local Address in case the node does not have one yet. Since the agent now does this later (specifically, while configuring IPAM), the check introduced in 2fc52eb sees the IPv6 node address as nil and stops the agent.
Therefore, revert the check to avoid the unintended regressions.
Related: #28953
Related: #34861
Related: #29033
Related: #32381
Fixes: #37817