
Conversation

brb
Member

@brb brb commented Mar 7, 2022

See commit msgs.

Fix #19010

For the v1.10 backport: feel free to drop the tests, as many merge conflicts are expected and we will have good coverage in v1.11/main.

Previously, when running in tunneling mode, the NodePort BPF used
SECLABEL (= HOST_ID) in the header of the encapsulated request
forwarded to the node running the selected service endpoint. This made
that node believe that the request originated from an in-cluster node.
As a side effect, any netpol denying access from outside the cluster
to the service endpoint could be bypassed.

Fix this by propagating the WORLD_ID instead of the HOST_ID in the
encapsulated request.

Reported-by: Rahul Sharma
Signed-off-by: Martynas Pumputis <m@lambda.lt>
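In essence, the fix changes which identity gets stamped on the encapsulated request. Below is a minimal Go sketch of that decision; hostID, worldID, and srcIdentityForTunnel are hypothetical stand-ins for illustration only, not the actual bpf_host/nodeport datapath symbols (the real implementation is BPF C).

```go
package main

import "fmt"

// Hypothetical stand-ins for Cilium's reserved identities; the numeric values
// mirror the reserved identities (host = 1, world = 2), but this is a
// simplified illustration, not the actual datapath code.
const (
	hostID  uint32 = 1 // identity of the node itself (SECLABEL on the host)
	worldID uint32 = 2 // identity of traffic originating outside the cluster
)

// srcIdentityForTunnel models the identity written into the encapsulation
// header when the NodePort load balancer forwards a request to the node
// running the selected backend.
func srcIdentityForTunnel(fromOutsideCluster bool) uint32 {
	// Before the fix the LB node always stamped its own identity (HOST_ID),
	// so the backend node treated external requests as in-cluster traffic
	// and policies denying access from outside could be bypassed. With the
	// fix, external requests keep WORLD_ID across the tunnel.
	if fromOutsideCluster {
		return worldID
	}
	return hostID
}

func main() {
	fmt.Println("external NodePort request ->", srcIdentityForTunnel(true))  // 2 (world)
	fmt.Println("in-cluster request        ->", srcIdentityForTunnel(false)) // 1 (host)
}
```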
@brb brb added the kind/bug, area/datapath, and needs-backport/1.10 labels Mar 7, 2022
@brb brb requested review from borkmann and a team March 7, 2022 11:13
@brb brb requested a review from a team as a code owner March 7, 2022 11:13
@brb brb requested review from a team and tklauser March 7, 2022 11:13
@maintainer-s-little-helper bot added the dont-merge/needs-release-note-label label Mar 7, 2022
Signed-off-by: Martynas Pumputis <m@lambda.lt>
@brb
Member Author

brb commented Mar 7, 2022

/test

Job 'Cilium-PR-K8s-1.21-kernel-5.4' has 1 failure, but it might be a new flake since it also hit 1 known flake: #18873 (96.81)

@brb brb added the release-note/bug label Mar 7, 2022
@maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label label Mar 7, 2022
@borkmann
Member

borkmann commented Mar 7, 2022

Short q: If you use kube-proxy, does the remote backend node see WORLD_ID as well for this traffic (or HOST_ID)?

@brb
Member Author

brb commented Mar 7, 2022

Short q: If you use kube-proxy, does the remote backend node see WORLD_ID as well for this traffic (or HOST_ID)?

I validated it manually: when running with kube-proxy and without kube-proxy-replacement, bpf_host at cilium_host sets the correct security ID (i.e., WORLD_ID).

@brb
Member Author

brb commented Mar 7, 2022

/test-1.23-net-next

@@ -574,6 +574,57 @@ Secondary Interface %s :: IPv4: (%s, %s), IPv6: (%s, %s)`,
testCurlFromOutsideWithLocalPort(kubectl, ni, svc2URL, 1, false, 64002)
})

It("Tests security id propagation in N/S LB requests fwd-ed over tunnel", func() {
// This test case checks whether the "wold" identity is passed in

typo: wold

Member

@joestringer joestringer left a comment


I think that this makes the behaviour more consistent with tunneling mode with kube-proxy, so I like this proposal for that reason. At the same time, I believe that this makes an explicit divide between how tunnel mode handles this case vs. how direct-routing mode handles the case, where direct-routing mode cannot know/transfer this information and hence will always consider such nodeport traffic from other nodes to be 'remote-node', whereas in tunneling environments Cilium would know better.

Any time we change a security identity for something, we need to be a bit careful about how we handle the communication / upgrade story. Even if users today are not consciously allowing traffic from outside the cluster using a fromEntities: - cluster / remote-node policy, it's possible that they are relying on that policy behaviour. Changing this may force them to re-evaluate which policies they have installed. This could cause production impact upon upgrade.

For this reason, I'm a bit nervous - even wondering whether it's something we need a (hidden) flag for, to mitigate upgrade issues.

One concrete step we should at least do is to release-note this clearly, something like Do not allow nodeport traffic via "cluster" fromEntities policy in tunneling mode with KPR. We probably want to expand on this a bit more to help to highlight that users will need to be paying close attention to their policies for nodeports when they upgrade. What kinds of policies would be affected? What is the recommended way to solve policy drops if you upgrade & find that nodeports don't work any more? etc.

I even wonder whether this should not be backported that far back, in order to reduce the impact for users on earlier Cilium versions, in case they pick up the new micro/patch releases and do not realise it has this impact.
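To make the upgrade impact concrete, here is a rough Go sketch (not the real policy engine; clusterEntityMatches and the identity constants are simplified stand-ins) of how the verdict of a fromEntities: cluster ingress rule on external NodePort traffic flips once that traffic carries the world identity instead of the host identity.

```go
package main

import "fmt"

// Simplified reserved identities (host = 1, world = 2) used only for this
// illustration; the real policy engine is considerably more involved.
const (
	hostID  uint32 = 1
	worldID uint32 = 2
)

// clusterEntityMatches approximates a "fromEntities: [cluster]" ingress rule:
// the cluster entity covers the hosts and endpoints of the cluster, but not
// traffic classified as world.
func clusterEntityMatches(srcIdentity uint32) bool {
	return srcIdentity != worldID
}

func main() {
	// Before the fix: external NodePort requests arrived at the backend node
	// with the host identity and were allowed by a cluster-only policy.
	fmt.Println("pre-fix  (HOST_ID): ", clusterEntityMatches(hostID)) // true  -> allowed
	// After the fix: the same requests carry WORLD_ID and are dropped unless
	// the policy also allows the world entity (or all entities).
	fmt.Println("post-fix (WORLD_ID):", clusterEntityMatches(worldID)) // false -> dropped
}
```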

@joestringer joestringer added the upgrade-impact label Mar 7, 2022
@brb
Member Author

brb commented Mar 8, 2022

At the same time, I believe that this makes an explicit divide between how tunnel mode handles this case vs. how direct-routing mode handles the case, where direct-routing mode cannot know/transfer this information and hence will always consider such nodeport traffic from other nodes to be 'remote-node', whereas in tunneling environments Cilium would know better.

@joestringer Good points. My next step is to better inform users in the documentation (on the KPR and netpol pages) about the difference, and also to present how the identity can be propagated in direct-routing mode (i.e., with DSR and externalTrafficPolicy=Local).

Any time we change a security identity for something, we need to be a bit careful about how we handle the communication / upgrade story. Even if users today are not consciously allowing traffic from outside the cluster using a fromEntities: - cluster / remote-node policy, it's possible that they are relying on that policy behaviour. Changing this may force them to re-evaluate which policies they have installed. This could cause production impact upon upgrade.

Maybe we should indeed drop the needs-backport label, make the change only in v1.12, and then clearly communicate the change in the upgrade guide?

For this reason, I'm a bit nervous - even wondering whether it's something we need a (hidden) flag for, to mitigate upgrade issues.

I'd consider adding a new flag only as a last resort :-)

One concrete step we should at least do is to release-note this clearly, something like Do not allow nodeport traffic via "cluster" fromEntities policy in tunneling mode with KPR. We probably want to expand on this a bit more to help to highlight that users will need to be paying close attention to their policies for nodeports when they upgrade. What kinds of policies would be affected? What is the recommended way to solve policy drops if you upgrade & find that nodeports don't work any more? etc.

Yeah, ACK. The documentation is planned as a follow-up. I just wanted to get this merged quickly so that users could start testing. My gut feeling is that the doc changes might require a few iterations.

@brb
Member Author

brb commented Mar 8, 2022

/test-1.21-5.4

@maintainer-s-little-helper bot added the ready-to-merge label Mar 8, 2022
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
backport-done/1.11 The backport for Cilium 1.11.x for this PR is done.
kind/bug This is a bug in the Cilium logic.
ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
release-note/bug This PR fixes an issue in a previous release of Cilium.
upgrade-impact This PR has potential upgrade or downgrade impact.
Successfully merging this pull request may close these issues.

inconsistent result with cilium network policies when kube-proxy is enabled or disabled