doc: Added hostLegacyRouting limitation for Talos #36852
When using tunneling mode and enabling `Kube-Proxy replacement<kubeproxy-free>`
together with Talos' `Forwarding kube-dns to Host DNS`_ (enabled by default
since Talos 1.8+), you need to set ``bpf.hostLegacyRouting`` to ``true`` as
DNS won't work otherwise. This behaviour was introduced in Cilium 1.16.5+.
See `cilium/cilium#35098`_ for more details.
Did we break this configuration in a newer release? Trying to understand whether this is something we should be documenting or fixing. Maybe @jschwinger233 has more context.
#35098 fixed a bug that had been there since 1.14. It was never meant to break anything, unless something was counting on that bug 😬
I don't have much knowledge about Cilium with Talos, but I have a feeling that what we really want to address is supporting Cilium on Talos with BPF host routing. They worked together in the past accidentally, based on a bug. Now that the bug is gone, we should figure out what's missing.
I get the impression this is more about the "node-local DNS" use case than Talos in general.
I'm getting the feeling from some of the reports around this issue that users were relying on the bug fixed by #35098, and the new bug now revealed is worse than the one you had fixed. Even if ultimately the decision is "users shouldn't be running bpf host routing yet because we're still working on stability in some cases", as a user it doesn't feel good if the project asks you to change your configuration in a bugfix release because one of the other bugfixes actually broke your environment.
One good piece of news is that with this increased feedback, we are more likely to be able to create a reproducer test case to detect the new bug, and avoid regression in this area in future.
That said, if we think the current state is more severe than what we had before, it may be worth reverting #35098 to bring back the previous behavior and only roll out that fix in v1.17 along with changes to the recommended configuration to ensure that users don't trigger the newly discovered bug during upgrade (+ release note).
Of course if we think we have a good understanding of the latest report and can come up with a fix for that, then an alternative option is just to work towards that and apply that fix everywhere that we applied #35098.
This issue is not limited to Talos: #36761
Force-pushed from bfa40e6 to add9762.
Thanks all for the input! @joestringer, @julianwiedmann: WDYT about the newest version where I only address the Talos-specific limitation of using "Forwarding kube-dns to Host DNS" together with Cilium's BPF host routing?
Talos Linux's 'Forwarding kube-dns to Host DNS' functionality doesn't work together with Cilium's BPF host routing. Signed-off-by: Philip Schmid <phisch@cisco.com>
Force-pushed from add9762 to 5db0821.
lgtm, thank you!
/test
Talos Linux's Forwarding kube-dns to Host DNS functionality doesn't work together with Cilium's BPF host routing (Linux kernel 5.10+ and `kubeProxyReplacement=true` and `bpf.masquerade=true`). It might have worked unintentionally in some scenarios before Cilium 1.16.5, but since the fix from #35098, it no longer works unless `bpf.hostLegacyRouting=true` is also configured.
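For illustration, the workaround described above could look roughly like the following Helm values fragment. This is a minimal sketch, not part of the PR: it assumes Cilium is installed through its Helm chart and shows only the three settings named in this description. Everything else in a real values file (routing mode, IPAM, Talos-specific settings) is omitted.

```yaml
# Sketch of a Cilium Helm values fragment (assumed deployment method: Helm).
# Only the keys named in this PR description are shown.
kubeProxyReplacement: true   # kube-proxy replacement enabled
bpf:
  masquerade: true           # with kube-proxy replacement and kernel 5.10+, this enables BPF host routing
  hostLegacyRouting: true    # fall back to legacy host routing so Talos' host DNS forwarding keeps working
```

Setting `bpf.hostLegacyRouting` to `true` gives up BPF host routing in exchange for working DNS, so it is a workaround rather than a fix for the underlying incompatibility.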