-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Fix dropped NodePort traffic to hostNetwork backends with Geneve+DSR #36978
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix dropped NodePort traffic to hostNetwork backends with Geneve+DSR #36978
Conversation
Thank you! I'll have a closer look later. Also queued up #36982. Once that's merged, we can rebase this PR so that we get immediate feedback on the connectivity test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Merged, I'll auto-rebase the branch... |
ce585d8
to
aa75d28
Compare
/ci-e2e-upgrade |
Thanks for the quick look! |
@julianwiedmann fixed the conflict. Any tips on next steps to get this in? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments inline, before I forgot about the CLI compatibility aspect again ...
In general - please rebase on-top of
IIRC the CLI tests for cluster-internal connectivity to all the nodeport services on startup, before running the connectivity tests. Is that what's failing? Let's re-run the workflow and find out. |
@dylandreimerink Thanks! Done! |
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but asking peers from cilium/ipcache to have a second look regarding the addition of a new field to IPCache
. cc @squeed
@squeed mentioned that's fine to add a new field to |
Thanks for the fix! Merging. |
…onfig With cilium#36978 the IPcache entry for a remote node now contains a tunnel_endpoint, typically in combination with the `skip_tunnel` flag set. But when dealing with replies for a connection that was established via the overlay network, we want to continue routing those replies via the overlay. Also fix up the corresponding BPF test. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
With cilium#36978 the IPcache entry for a remote node now contains a tunnel_endpoint, typically in combination with the `skip_tunnel` flag set. But when dealing with replies for a connection that was established via the overlay network, we want to continue routing those replies via the overlay. Also fix up the corresponding BPF test. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
This reproduces disruption for an established pod-to-remote-node connection, caused by the changes in cilium#36978. In old code such a connection would be masqueraded, but in new code the masquerading is skipped. Thus the destination will see requests with different source IPs. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Sorry, I think we'll have to revert this from |
I've queued up the revert here. I actually like the approach of always setting the But the way we currently handle pod-to-remote-node traffic (SNAT instead of tunnel), along with how the |
@julianwiedmann sorry to hear that! |
I’m also curious if you could expand (or point to docs) on the original intent of that flag |
…nfig With cilium#36978 the IPcache entry for a remote node now contains a tunnel_endpoint, typically in combination with the `skip_tunnel` flag set. But when dealing with replies for a connection that was established via the overlay network, we want to continue routing those replies via the overlay. Also fix up the corresponding BPF test. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
…unnel With cilium#36978 the IPcache entry for a remote node now contains a tunnel_endpoint, typically in combination with the `skip_tunnel` flag set. But when dealing with replies for a connection that was established via the overlay network, we want to continue routing those replies via the overlay. Also fix up the corresponding BPF test for inter-cluster-SNAT. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
…unnel With #36978 the IPcache entry for a remote node now contains a tunnel_endpoint, typically in combination with the `skip_tunnel` flag set. But when dealing with replies for a connection that was established via the overlay network, we want to continue routing those replies via the overlay. Also fix up the corresponding BPF test for inter-cluster-SNAT. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
This PR fixes DSR dispatch with Geneve to hostNetworked pods by ensuring tunnel_endpoint is always populated in ipcache, even for directly reachable endpoints. Unneeded encapsulation is avoided by passing the skiptunnel flag for such endpoints.
Fixes: #36901