-
Notifications
You must be signed in to change notification settings - Fork 3.4k
cilium: encryption, fixes for ENI & Azure mode with shared podIPs and networkIPs #15048
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This comment has been minimized.
This comment has been minimized.
I'll pop patch1 off in a bit it fixes a build env issue on my system |
This comment has been minimized.
This comment has been minimized.
pchaigno
reviewed
Feb 22, 2021
@joestringer We need to do this in two steps. First get the extra option in the netlink code. Here is the PR cilium/netlink#1 |
0cfaf5a
to
e3734eb
Compare
011c9b9
to
740c783
Compare
test-me-please |
This was referenced Feb 24, 2021
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area/encryption
Impacts encryption support such as IPSec, WireGuard, or kTLS.
release-note/bug
This PR fixes an issue in a previous release of Cilium.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Subnet mode came with some baked in assumptions that are not met in all ENI use cases. First we assumed the networkIPs would not overlap with the podIPs. This was important because we use the same routing table for both encrypt and decrypt with the above assumption this works because encrypted packets and decrypted packets will match different routes and never collide. But, once this assumption is broke depending on how subnets and IPs are chosen its possible they collide. The result is a encrypted packet may be routed as if it has already been decrypted or vice versa. We've seen some cases where the overlap happens in EKS and AKS.
Next we made an assumption that cilium_host interface would always understand how to handle packets. In ENI mode with endpoint routes enabled we optimized cilium_host recently so it would not handle ingress packets. However, we missed the subnet encryption case combination of features. Fix it here by also enabling subnet encryption to work with endpoint routes. We took this opportunity to make a small optimization and cut out an extra pass through bpf_network as it was no longer needed with above fixes.
Finally, feature detection incorrectly detected the ingress interface in some cases.
Quick note, I broke the patches down into small chunks. Other than the refactor patches they are hopefully small and straightforward. Patch List
patches 1- 2 refactor code so we can push optional flag down to policy rules and program fwd policy directly
patch3: set optional flag on all fwd policies, we don't use fwd polices for policy (to drop packets not encrypted for example) because we can enforce these policies in the BPF programs and drop any non-compliant packets early in the stack at XDP or TC layers. Without this endpoint routes + one CIDR for network and podIPs can cause a drop.
patch4: subnetEncryption was always enabling nodeEncrypt even if it wasn't configured this is wrong.
patch5: ENI needs to use linkAddr for IPSec tunnel IPs
patch6: ENI, redirecting to cilium_host breaks endpoint routes
patch7: simplify stack to avoid extra loop by having xfrm stack clear mark bits.
patch8: remove mark from fwd rule to make it more permissive again we don't use it for policy
patch9: clear mark from xfrm states so we can skip extra pass through eth0
patch10: ENI auto iface detection doesn't in my deployment it picks the wrong interface, this makes it so we use the user supplied interface if its given.
Couple pictures to help with understanding ingress datapath.
Picture of possible failing case if subnet IPs overlap and then collide with interface IPs.
Picture of fixed datapath to support overlapping subnet IPs.