-
Notifications
You must be signed in to change notification settings - Fork 3.4k
fix: Adding the fatal error for ipv6 cilium config on a single stack node #28953
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f30f597
to
718c26f
Compare
Code itself LGTM, but not sure if continuing silently is a good idea? There's a bunch of downstream logic that runs if Also cc @cilium/sig-foundations since related to |
@ti-mo So if we want to enable/disable ipv6 earlier would that be under |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about "The expectation is that ipv4 should be working even if there are errors in getting ipv6 ips". Do you mean we expect cilium to work without IPv6 even if IPv6 is enabled by user when DirectRoutingDevice for v6 can't be found?
@jschwinger233 What I meant is that even if we don't have a node that supports ipv6 and the user has enabled ipv6 on that node, the cilium pods should not go into crashloopback. |
Seeing the code, I see the usage of both
In the current implementation, the node ip is fetched for each
|
718c26f
to
aa24d40
Compare
Moved the check in the |
aa24d40
to
a9537ea
Compare
a9537ea
to
72dbd32
Compare
Based on the conversation in the issue: #28909 (comment) the check is added to make sure that the uniform behavior is provided irrespective of the KPR value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
I think the approach is coherent with #28909 (comment). I left a couple of suggestions inline.
3f4dae2
to
380ba0c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! 🚀
/test |
Cilium pods on a single stack cluster were failing when ipv6 was enabled. The change would make sure that if the underlying node does not have ipv6 configuration enabled it would fail even if KPR is disabled.Earlier, if KPR is enabled, the node would not see a fatal error. Signed-off-by: Vipul Singh <singhvipul@microsoft.com>
380ba0c
to
d1ad232
Compare
@pippolo84 can we merge this ? |
/test |
E2E flake tracked here, rerunning. |
This PR introduced a regression in our environment. #34861 for a detailed description of the problem. |
This reverts commit 2fc52eb. The commit was added with the intent to have an uniform behavior when Cilium is started on a single stack cluster with ipv6 support enabled, despite the KPR setting being enabled or disabled. Specifically, the added check was stopping the agent startup in case of a missing IPv6 address on the node, even when KPR was disabled. However, the IPv6 detection for NodePort, responsible for the startup error in case of missing IPv6 address with KPR enabled, has been removed in cilium#29033. Therefore, there is no reason anymore to stop the agent at startup when the IPv6 support is enabled and the node is missing an IPv6 address. Moreover, the commit above led to some unintended consequences in environments where KPR is disabled and the IPv6 address is set on the node after the Cilium agent is started (see cilium#34861). The same unintended consequence became more frequent after cilium#32381, where we moved later in the daemon startup process the first call to SetDefaultPrefix. That function, among other things, was in charge of setting up a temporary IPv6 Unique Local Address in case the node does not have one yet. Since now the agent does this later (specifically, while configuring IPAM), the check introduced in 2fc52eb sees the IPv6 node address as nil and stops the agent. Therefore, revert the check to avoid the unintended regressions. Related: cilium#28953 Related: cilium#34861 Related: cilium#29033 Related: cilium#32381 Fixes: cilium#37817 Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
This reverts commit 2fc52eb. The commit was added with the intent to have an uniform behavior when Cilium is started on a single stack cluster with ipv6 support enabled, despite the KPR setting being enabled or disabled. Specifically, the added check was stopping the agent startup in case of a missing IPv6 address on the node, even when KPR was disabled. However, the IPv6 detection for NodePort, responsible for the startup error in case of missing IPv6 address with KPR enabled, has been removed in cilium#29033. Therefore, there is no reason anymore to stop the agent at startup when the IPv6 support is enabled and the node is missing an IPv6 address. Moreover, the commit above led to some unintended consequences in environments where KPR is disabled and the IPv6 address is set on the node after the Cilium agent is started (see cilium#34861). The same unintended consequence became more frequent after cilium#32381, where we moved later in the daemon startup process the first call to SetDefaultPrefix. That function, among other things, was in charge of setting up a temporary IPv6 Unique Local Address in case the node does not have one yet. Since now the agent does this later (specifically, while configuring IPAM), the check introduced in 2fc52eb sees the IPv6 node address as nil and stops the agent. Therefore, revert the check to avoid the unintended regressions. Related: cilium#28953 Related: cilium#34861 Related: cilium#29033 Related: cilium#32381 Fixes: cilium#37817 Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
This reverts commit 2fc52eb. The commit was added with the intent to have an uniform behavior when Cilium is started on a single stack cluster with ipv6 support enabled, despite the KPR setting being enabled or disabled. Specifically, the added check was stopping the agent startup in case of a missing IPv6 address on the node, even when KPR was disabled. However, the IPv6 detection for NodePort, responsible for the startup error in case of missing IPv6 address with KPR enabled, has been removed in #29033. Therefore, there is no reason anymore to stop the agent at startup when the IPv6 support is enabled and the node is missing an IPv6 address. Moreover, the commit above led to some unintended consequences in environments where KPR is disabled and the IPv6 address is set on the node after the Cilium agent is started (see #34861). The same unintended consequence became more frequent after #32381, where we moved later in the daemon startup process the first call to SetDefaultPrefix. That function, among other things, was in charge of setting up a temporary IPv6 Unique Local Address in case the node does not have one yet. Since now the agent does this later (specifically, while configuring IPAM), the check introduced in 2fc52eb sees the IPv6 node address as nil and stops the agent. Therefore, revert the check to avoid the unintended regressions. Related: #28953 Related: #34861 Related: #29033 Related: #32381 Fixes: #37817 Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
This reverts commit 2fc52eb. The commit was added with the intent to have an uniform behavior when Cilium is started on a single stack cluster with ipv6 support enabled, despite the KPR setting being enabled or disabled. Specifically, the added check was stopping the agent startup in case of a missing IPv6 address on the node, even when KPR was disabled. However, the IPv6 detection for NodePort, responsible for the startup error in case of missing IPv6 address with KPR enabled, has been removed in #29033. Therefore, there is no reason anymore to stop the agent at startup when the IPv6 support is enabled and the node is missing an IPv6 address. Moreover, the commit above led to some unintended consequences in environments where KPR is disabled and the IPv6 address is set on the node after the Cilium agent is started (see #34861). The same unintended consequence became more frequent after #32381, where we moved later in the daemon startup process the first call to SetDefaultPrefix. That function, among other things, was in charge of setting up a temporary IPv6 Unique Local Address in case the node does not have one yet. Since now the agent does this later (specifically, while configuring IPAM), the check introduced in 2fc52eb sees the IPv6 node address as nil and stops the agent. Therefore, revert the check to avoid the unintended regressions. Related: #28953 Related: #34861 Related: #29033 Related: #32381 Fixes: #37817 Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit dfd8f93 ] This reverts commit 2fc52eb. The commit was added with the intent to have an uniform behavior when Cilium is started on a single stack cluster with ipv6 support enabled, despite the KPR setting being enabled or disabled. Specifically, the added check was stopping the agent startup in case of a missing IPv6 address on the node, even when KPR was disabled. However, the IPv6 detection for NodePort, responsible for the startup error in case of missing IPv6 address with KPR enabled, has been removed in #29033. Therefore, there is no reason anymore to stop the agent at startup when the IPv6 support is enabled and the node is missing an IPv6 address. Moreover, the commit above led to some unintended consequences in environments where KPR is disabled and the IPv6 address is set on the node after the Cilium agent is started (see #34861). The same unintended consequence became more frequent after #32381, where we moved later in the daemon startup process the first call to SetDefaultPrefix. That function, among other things, was in charge of setting up a temporary IPv6 Unique Local Address in case the node does not have one yet. Since now the agent does this later (specifically, while configuring IPAM), the check introduced in 2fc52eb sees the IPv6 node address as nil and stops the agent. Therefore, revert the check to avoid the unintended regressions. Related: #28953 Related: #34861 Related: #29033 Related: #32381 Fixes: #37817 [ Backporter's notes: - resolved conflict in: daemon/k8s/init.go ] Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
This reverts commit 2fc52eb. The commit was added with the intent to have an uniform behavior when Cilium is started on a single stack cluster with ipv6 support enabled, despite the KPR setting being enabled or disabled. Specifically, the added check was stopping the agent startup in case of a missing IPv6 address on the node, even when KPR was disabled. However, the IPv6 detection for NodePort, responsible for the startup error in case of missing IPv6 address with KPR enabled, has been removed in cilium#29033. Therefore, there is no reason anymore to stop the agent at startup when the IPv6 support is enabled and the node is missing an IPv6 address. Moreover, the commit above led to some unintended consequences in environments where KPR is disabled and the IPv6 address is set on the node after the Cilium agent is started (see cilium#34861). The same unintended consequence became more frequent after cilium#32381, where we moved later in the daemon startup process the first call to SetDefaultPrefix. That function, among other things, was in charge of setting up a temporary IPv6 Unique Local Address in case the node does not have one yet. Since now the agent does this later (specifically, while configuring IPAM), the check introduced in 2fc52eb sees the IPv6 node address as nil and stops the agent. Therefore, revert the check to avoid the unintended regressions. Related: cilium#28953 Related: cilium#34861 Related: cilium#29033 Related: cilium#32381 Fixes: cilium#37817 Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit dfd8f93 ] This reverts commit 2fc52eb. The commit was added with the intent to have an uniform behavior when Cilium is started on a single stack cluster with ipv6 support enabled, despite the KPR setting being enabled or disabled. Specifically, the added check was stopping the agent startup in case of a missing IPv6 address on the node, even when KPR was disabled. However, the IPv6 detection for NodePort, responsible for the startup error in case of missing IPv6 address with KPR enabled, has been removed in #29033. Therefore, there is no reason anymore to stop the agent at startup when the IPv6 support is enabled and the node is missing an IPv6 address. Moreover, the commit above led to some unintended consequences in environments where KPR is disabled and the IPv6 address is set on the node after the Cilium agent is started (see #34861). The same unintended consequence became more frequent after #32381, where we moved later in the daemon startup process the first call to SetDefaultPrefix. That function, among other things, was in charge of setting up a temporary IPv6 Unique Local Address in case the node does not have one yet. Since now the agent does this later (specifically, while configuring IPAM), the check introduced in 2fc52eb sees the IPv6 node address as nil and stops the agent. Therefore, revert the check to avoid the unintended regressions. Related: #28953 Related: #34861 Related: #29033 Related: #32381 Fixes: #37817 [ Backporter's notes: - resolved conflict in: daemon/k8s/init.go ] Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
[ upstream commit dfd8f93 ] This reverts commit 2fc52eb. The commit was added with the intent to have an uniform behavior when Cilium is started on a single stack cluster with ipv6 support enabled, despite the KPR setting being enabled or disabled. Specifically, the added check was stopping the agent startup in case of a missing IPv6 address on the node, even when KPR was disabled. However, the IPv6 detection for NodePort, responsible for the startup error in case of missing IPv6 address with KPR enabled, has been removed in #29033. Therefore, there is no reason anymore to stop the agent at startup when the IPv6 support is enabled and the node is missing an IPv6 address. Moreover, the commit above led to some unintended consequences in environments where KPR is disabled and the IPv6 address is set on the node after the Cilium agent is started (see #34861). The same unintended consequence became more frequent after #32381, where we moved later in the daemon startup process the first call to SetDefaultPrefix. That function, among other things, was in charge of setting up a temporary IPv6 Unique Local Address in case the node does not have one yet. Since now the agent does this later (specifically, while configuring IPAM), the check introduced in 2fc52eb sees the IPv6 node address as nil and stops the agent. Therefore, revert the check to avoid the unintended regressions. Related: #28953 Related: #34861 Related: #29033 Related: #32381 Fixes: #37817 [ Backporter's notes: - resolved conflict in: daemon/k8s/init.go ] Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
Please ensure your pull request adheres to the following guidelines:
description and a
Fixes: #XXX
line if the commit addresses a particularGitHub issue.
Fixes: <commit-id>
tag, thenplease add the commit author[s] as reviewer[s] to this issue.
Cilium pods on a single stack cluster were failing when ipv6 was enabled. The expectation is that ipv4 should be working even if there are error in ipv6.
Fixes: #28909