-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Fix CIDR to World Entity Conversion Bug #22625
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM @nathanjsweet just a heads-up that since we have branched v1.13, this PR will not be part of the 1.13 unless it is set with the needs-backport/1.13 label.
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix! LGTM.
Smoke tests are failing with agents reporting Other failures are non-required builds ConformanceGKE and Cilium Datapath. @nathanjsweet can you rebase this PR so we can see that the smoke tests can pass? |
18fa7b3
to
eca53f7
Compare
Ooh, I found a magic "Update with rebase" button on GitHub, so I clicked it. Let's see. |
/test |
@nathanjsweet it looks like the Cilium Datapath tests are failing consistently on a CIDR deny policy issue. Do you think that this may be related to the PR content? I don't see any similar failures in other recent failures for the workflow. |
I checked one of the failures and it does include a policy with the 0.0.0.0/0 CIDR. Since all tests are failing with the exact same test cases, I assume they all have the same policy. Seems fairly likely to be related to the changes here. |
eca53f7
to
c600cc9
Compare
@nathanjsweet were you able to reproduce / track down the datapath failure? |
@joestringer no, not yet. I was waiting to see if this fixed it. |
/test Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blocking on above conversation with Joe and CI failure. I suspect the two are related 🙂
/test |
This pull request has been automatically marked as stale because it |
I meant to update this PR, but never did. This PR is also blocked on #22966 as it significantly alters the policy engine code. Making this change should happen after that PR is merged. |
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM from proxy side
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest changes for encap look good, and there's one decap case handled, then I think there's a couple more remaining decap cases in bpf/bpf_overlay.c:handle_ipv4()
and bpf/bpf_overlay.c:handle_ipv6()
, both for the high scale and regular tunnel decap cases without encryption.
These suggest to me that we don't have policy coverage for an IPv4-only 0.0.0.0/0
allow for pods where the traffic is received from outside the cluster through a nodeport on a different node. That sort of test should be able to highlight a policy deny issue with this latest implementation.
12bb6d6
to
c5928d5
Compare
c5928d5
to
9aaca8b
Compare
/test |
Previously if either the "0.0.0.0/0" CIDR (hereafter "world-ipv4") or a world-ipv6 CIDR was present the selector creation logic for CIDRs would also add the "reserved:world" entity to the selector labels generated by the CIDR. This does not work when Cilium is dual-stack mode as "reserved:world" covers both IPv4 and IPv6, so both CIDRs should be present before such an addition is made. Moreover, dual-stack mode lacked any mechanism to distinguish world-ipv4 and world-ipv6 from another. This bug has prevented the possibility of managing world-ipv4 and world-ipv6 separately from one another in the policy engine. This inccorect assumption goes all the way to the datapath, where SECLABEL and WORLD_ID were part of drop notification, policy logic, and tunnel logic. SECLABLE and WORLD_ID were both set by userspace compilation logic. Separate identities (and labels, etc) for world-ipv4 and world-ipv6 now exist so they can be distinguished in userspace and in the datapath. Some utility methods have been added in userspace and datapath to check for the necessity of using world-ipv4 and world-ipv6 identities. Explicit tests for dual-stack mode have been added. The generic WORLD_ID is still used as the VNI tag for all tunnel logic in the datapath. When tunnel code is encapsulating any WORLD_IPX_ID identity it will convert the VNI to WORLD_ID. When tunnel code is decapsulating any WORLD_ID VNI it will convert the VNI to either WORLD_IPV4_ID or WORLD_IPV6_ID based on the underlying packet protocol. Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
9aaca8b
to
21274a7
Compare
/test |
Previously, we had a single label, `reserved:world`, applied to all CIDR (non-cluster) identities. With cilium#22625, this is no longer the case; CIDR identities get either `reserved:world-ipv4` or `reserved:world-ipv6` labels **in dual stack clusters**. This PR updates the `toEntities: world` selector to select *all* world entities, as it worked previously. It does so by expanding the set of selectors underlying the `world` entity. No other magic in the SelectorCache or policy engine is required. Fixes: cilium#29666 Fixes: a94fa56 Signed-off-by: Casey Callendrello <cdc@isovalent.com>
Previously, we had a single label, `reserved:world`, applied to all CIDR (non-cluster) identities. With #22625, this is no longer the case; CIDR identities get either `reserved:world-ipv4` or `reserved:world-ipv6` labels **in dual stack clusters**. This PR updates the `toEntities: world` selector to select *all* world entities, as it worked previously. It does so by expanding the set of selectors underlying the `world` entity. No other magic in the SelectorCache or policy engine is required. Fixes: #29666 Fixes: a94fa56 Signed-off-by: Casey Callendrello <cdc@isovalent.com>
[ upstream commit 1288473 ] Previously, we had a single label, `reserved:world`, applied to all CIDR (non-cluster) identities. With cilium#22625, this is no longer the case; CIDR identities get either `reserved:world-ipv4` or `reserved:world-ipv6` labels **in dual stack clusters**. This PR updates the `toEntities: world` selector to select *all* world entities, as it worked previously. It does so by expanding the set of selectors underlying the `world` entity. No other magic in the SelectorCache or policy engine is required. Fixes: cilium#29666 Fixes: a94fa56 Signed-off-by: Casey Callendrello <cdc@isovalent.com>
[ upstream commit 1288473 ] Previously, we had a single label, `reserved:world`, applied to all CIDR (non-cluster) identities. With #22625, this is no longer the case; CIDR identities get either `reserved:world-ipv4` or `reserved:world-ipv6` labels **in dual stack clusters**. This PR updates the `toEntities: world` selector to select *all* world entities, as it worked previously. It does so by expanding the set of selectors underlying the `world` entity. No other magic in the SelectorCache or policy engine is required. Fixes: #29666 Fixes: a94fa56 Signed-off-by: Casey Callendrello <cdc@isovalent.com>
if (*identity != WORLD_ID) | ||
if (identity_is_world_ipv4(*identity)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it reversed the logic, breaking VTEP support, see #31023.
[ upstream commit 1288473 ] Previously, we had a single label, `reserved:world`, applied to all CIDR (non-cluster) identities. With #22625, this is no longer the case; CIDR identities get either `reserved:world-ipv4` or `reserved:world-ipv6` labels **in dual stack clusters**. This PR updates the `toEntities: world` selector to select *all* world entities, as it worked previously. It does so by expanding the set of selectors underlying the `world` entity. No other magic in the SelectorCache or policy engine is required. Fixes: #29666 Fixes: a94fa56 Signed-off-by: Casey Callendrello <cdc@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
Previously if either the "0.0.0.0/0" CIDR (hereafter "world-ipv4")
or a world-ipv6 (i.e. "::/0") CIDR was present the selector creation logic
for CIDRs would also add the "reserved:world" entity to the
selector labels generated by the CIDR for the sake of the policy engine.
This does not work when Cilium is dual-stack mode as the "reserved:world"
entity covers both IPv4 and IPv6, so both CIDRs should be present before such
an addition to the policy engine logic is made. Moreover, dual-stack mode lacks
any mechanism to distinguish world-ipv4 and world-ipv6 from one another.
This bug has prevented the possibility of managing world-ipv4 and world-ipv6
separately from one another in network policies.
Separate identities (and labels, etc) for world-ipv4 and world-ipv6 now exist
so they can be distinguished.
Signed-off-by: Nate Sweet nathanjsweet@pm.me
Fixes: #22610