Skip to content

Conversation

jrajahalme
Copy link
Member

@jrajahalme jrajahalme commented Jul 22, 2024

Deny entries covered by another deny CIDR entry must be kept as they have different numerical identities in the datapath. Otherwise a later added allow-all entry could take effect if there is no specific match for the left-off identity. Simplest example of this is with L3-only deny rules and an Allow-all rule that wildcards also the L3. So for the sake if this
example we only need to consider L3-identities of the CIDRs (10/8, and 10.1.1.1/32, first covering the latter), say X and Y (both nonzero) and an allow rule with ID 0 (ANY). This policy should allow all traffic that is not with an IP covered by 10/8. Since 10.1.1.1/32 is covered by 10/8 it would be tempting to not insert the rule for the ID assigned to CIDR 10.1.1.1/32 to the datapath policy map. The identities for both CIDRs are, however, present in the ipcache, so that traffic to (or from)
10.1.1.1 would be associated with ID Y, which would match the wildcard ID (0) allow rule, if a deny rule with the ID Y is not present in the datapath. Hence, we must have both deny rules in to the policy map.

The above logic does not apply when the two rules in question have the same ID (but different coverage on port and protocol), or when one of them is a wildcard L3 rule (with ID 0). In that case we can leave the more specific deny rule covered by another off the policy map, but only if it was not needed to cover for a L4-only allow rule that still needs
to take effect on all L3 identities not explicitly denied. In that case we add new L3/L4 deny entries to carve holes into the broader allow rule, and we must keep the deny rules we added for those holes even when they would otherwise be covered by a broader deny rule.

Note that a presence of a wider allow rule is not required at the time of policy map entry insertion for the above consideration to take effect. In this line of thought we could simply leave off all the deny entries if no allow entries exist, since the datapath is deny by default. This breaks down as soon as an allow rule is inserted to a now empty policy map
(maybe incrementally due to an FQDN policy rule), as all the (non-inserted) deny rules would now be disregarded.

This leaves a possibility for a late-stage policy map optimization, however: If, when syncing the userspace policy map to the bpf datapath policy map, we notice that some deny entries are not covered by any wider allow entries, it would be safe to leave such entries off from the datapath policy map. We'd need to be ready to insert them before any new related allow entries are added, though.

Also, when adding an allow entry that is a subset (in L3-wildcard or CIDR sense) of an existing deny entry and the new allow entry has a less specific port-protocol than the deny entry (e.g., wildcard L4 allow and TCP deny) then an additional deny entry with the identity of the new allow entry must be added. This ensures that the deny rule takes effect also for the subset identity at the port/proto of the deny rule.

Unit tests are adjusted, not only to include the previously missing entries, but also add "allow-all" rules for illustrative purposes so that it is easier to reason about the necessity of the added deny entries.

Fixes: #32675

Note: This is the main version of #33719

Increase test coverage by reversing insertion order of mapstate entries
in some tests.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Refactor denyPreferredInsertWithChanges to check for cases where we can
avoid inserting any changes first. This is marginally faster, and makes
the next commit diff easier to read.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Deny entries covered by another deny CIDR entry must be kept as they have
different numerical identities in the datapath. Otherwise a later added
allow-all entry could take effect if there is no specific match for the
left-off identity. Simplest example of this is with L3-only deny rules
and an Allow-all rule that wildcards also the L3. So for the sake if this
example we only need to consider L3-identitites of the CIDRs (10/8, and
10.1.1.1/32, first covering the latter), say X and Y (both nonzero) and
an allow rule with ID 0 (ANY). This policy should allow all traffic that
is not with an IP covered by 10/8. Since 10.1.1.1/32 is covered by 10/8
it would be tempting to not insert the rule for the ID assigned to CIDR
10.1.1.1/32 to the datapath policy map. The identities for both CIDRs
are, however, present in the ipcache, so that traffic to (or from)
10.1.1.1 would be associated with ID Y, which would match the wildcard ID
(0) allow rule, if a deny rule with the ID Y is not present in the
datapath. Hence, we must have both deny rules in to the policy map.

The above logic does not apply when the two rules in question have the
same ID (but different coverage on port and protocol), or when one of
them is a wildcard L3 rule (with ID 0). In that case we can leave the
more specific deny rule covered by another off the policy map, but only
if it was not needed to cover for a L4-only allow rule that still needs
to take effect on all L3 identities not explicitly denied. In that case
we add new L3/L4 deny entries to carve holes into the broader allow rule,
and we must keep the deny rules we added for those holes even when they
would otherwise be covered by a broader deny rule.

Note that a presence of a wider allow rule is not required at the time of
policy map entry insertion for the above consideration to take effect. In
this line of thought we could simply leave off all the deny entries if no
allow entries exist, since the datapath is deny dy default. This breaks
down as soon as an allow rule is inserted to a now empty policy map
(maybe incrementally due to an FQDN policy rule), as all the
(non-inserted) deny rules would now be disregarded.

This leaves a possibility for a late-stage policy map optimization,
however: If, when syncing the userspace policy map to the bpf datapath
policy map, we notice that some deny entries are not covered by any wider
allow entries, it would be safe to leave such entries off from the
datapath policy map. We'd need to be ready to insert them before any new
related allow entries are added, though.

Also, when adding an allow entry that is a subset (in L3-wildcard or CIDR
sense) of an exising deny entry and the new allow entry has a less
specific port-protocol than the deny entry (e.g., wildcard L4 allow and
TCP deny) then an additional deny entry with the identity of the new
allow entry must be added. This ensures that the deny rule takes effect
also for the subset identity at the port/proto of the deny rule.

Unit tests are adjusted, not only to include the previously missing
entries, but also add "allow-all" rules for illustrative purposes so that
it is easier to reason about the necessity of the added deny entries.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Make use of the allow-all rule in
TestMapState_denyPreferredInsertWithSubnets explicit and test with and
without it to increase test coverage.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
@jrajahalme jrajahalme added kind/bug This is a bug in the Cilium logic. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. release-note/misc This PR makes changes that have no direct user impact. labels Jul 22, 2024
@jrajahalme jrajahalme requested a review from a team as a code owner July 22, 2024 19:18
@jrajahalme jrajahalme requested a review from doniacld July 22, 2024 19:18
@jrajahalme jrajahalme added the backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. label Jul 22, 2024
@jrajahalme
Copy link
Member Author

/test

@jrajahalme jrajahalme added backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. and removed backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. labels Jul 23, 2024
@nathanjsweet nathanjsweet self-requested a review July 24, 2024 15:40
@jrajahalme
Copy link
Member Author

jrajahalme commented Aug 1, 2024

@nathanjsweet Note for review: Due to the way some of the tests are structured, I did not think to consider the fact that when a CIDR covers an another CIDR or a resolved FQDN IP in the policy, the identity allocated for the covered IP is also selected by the selector of the covering CIDR.

For example:

Policy rules with selectors:

allow WORLD -> selector { reserved:world }
deny 10/8 -> selector { cidr:10.0.0.0/8 }
allow www.example.com -> selector { fqdn:www.example.com }

Reserved and allocated numeric identities with labels for CIDRs in the policy:

WORLD -> 2: labels { reserved:world }
10/8 -> Y: labels { reserved:world, cidr:10.0.0.0/8 }

Numeric identities matched by each selector BEFORE DNS resolution for www.example.com:

selector { reserved:world }: [2, X]
selector { cidr:10.0.0.0/8 }: [X]
selector { fqdn:www.example.com }: []

Allocated numeric identities with labels for FQDNs in the policy AFTER DNS resolution via Cilium DNS proxy:

10.1.1.1 -> Y: labels { reserved:world, cidr:10.0.0.0/8, fqdn:www.example.com }

Numeric identities matched AFTER DNS resolution by each selector:

selector { reserved:world }: [2, X, Y]
selector { cidr:10.0.0.0/8 }: [X, Y]
selector { fqdn:www.example.com }: [Y]

This means that when adding the deny rule, keys with both identities [X, Y] will be inserted into the mapstate, and will override the allow keys with the same IDs. Hence it would seem that looking for subset identities for X would be redundant, as Y is going to be added as a deny anyway. Currently this does not work reliably, though, as selector cache is not transactional, meaning that there is no guarantee that selector { cidr:10.0.0.0/8 } in the selector cache actually selects both [X, Y] at the same time (or during the same policy computation) as when selector { fqdn:www.example.com } already selects [Y]. This is true for both full policy computation (which uses GetSelections()), and incremental updates.

Once we make selector cache transactional, i.e., identity Y appears seemingly simultaneously on both selectors, we can remove all of the superset/subset logic between different non-WORLD IDs. But for now it is still needed to avoid transient policy bugs.

@nathanjsweet nathanjsweet added this pull request to the merge queue Aug 2, 2024
Merged via the queue into cilium:main with commit 6186d57 Aug 2, 2024
74 checks passed
aanm pushed a commit that referenced this pull request Oct 8, 2024
With the policy as shown by [1] applied, traffic to 1.1.1.2/32 would be
denied but only because of the default-deny fallback, and not have a
policy correlation.  In Hubble, this would show up as
`policy-verdict:none` in the flow instead of "Policy denied by
denylist".

In the denyPreferredInsertWithChanges() logic for v1.15 and below (v1.16
and main unaffected), the "deny-deny" [2] logic removed what the policy
engine considered a redundant deny entry that leads to the missing
policy correlation described above.

The fix is to minimally cherry-pick 317a8af ("policy: Keep deny
entries when covered by another CIDR deny") [from the v1.16 tree]. This
cherry-picked commit is built off of a2ff24f ("policy: Fix Deny
Insertion Logic"), which was found by bisecting v1.15 and v1.16 to
identify when this bug was "fixed" in the later versions of Cilium.

In addition, this commit contains another minimal cherry-pick from
317a8af ("policy: Keep deny entries when covered by another CIDR
deny") [from the v1.16 tree] for the "deny-allow" [3] and "allow-deny"
[4] case. [3] fixes a transient bug as explained by
#33953 (comment).

For more context on the changes to the policy engine between versions,
tracing the commit history from the commits mentioned above would be the
best way.

[1]:

```
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow
spec:
  egress:
  - toCIDR:
    - 1.1.1.1/32
    - 1.1.1.2/32
  endpointSelector: {}
---
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: deny
spec:
  egressDeny:
  - toCIDRSet:
    - cidr: 1.0.0.0/8
      except:
      - 1.1.1.1/32
  endpointSelector: {}
```

[2]: "Deny-deny" refers to the policy engine code path inside
     denyPreferredInsertWithChanges() under

```
if newEntry.IsDeny {
  ...
  ms.ForEachDeny()
  ...
}
```

[3]: Similar to [2], "deny-allow" refers to

```
if newEntry.IsDeny {
  ...
  ms.ForEachAllow()
  ...
}
```

[4]: Similar to [3], "allow-deny" refers to (note the ! in the
condition)

```
if !newEntry.IsDeny {
  ...
  ms.ForEachDeny()
  ...
}
```

Signed-off-by: Chris Tarazi <chris@isovalent.com>
Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
aanm pushed a commit that referenced this pull request Oct 8, 2024
With the policy as shown by [1] applied, traffic to 1.1.1.2/32 would be
denied but only because of the default-deny fallback, and not have a
policy correlation.  In Hubble, this would show up as
`policy-verdict:none` in the flow instead of "Policy denied by
denylist".

In the denyPreferredInsertWithChanges() logic for v1.15 and below (v1.16
and main unaffected), the "deny-deny" [2] logic removed what the policy
engine considered a redundant deny entry that leads to the missing
policy correlation described above.

The fix is to minimally cherry-pick 317a8af ("policy: Keep deny
entries when covered by another CIDR deny") [from the v1.16 tree]. This
cherry-picked commit is built off of a2ff24f ("policy: Fix Deny
Insertion Logic"), which was found by bisecting v1.15 and v1.16 to
identify when this bug was "fixed" in the later versions of Cilium.

In addition, this commit contains another minimal cherry-pick from
317a8af ("policy: Keep deny entries when covered by another CIDR
deny") [from the v1.16 tree] for the "deny-allow" [3] and "allow-deny"
[4] case. [3] fixes a transient bug as explained by
#33953 (comment).

For more context on the changes to the policy engine between versions,
tracing the commit history from the commits mentioned above would be the
best way.

[1]:

```
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow
spec:
  egress:
  - toCIDR:
    - 1.1.1.1/32
    - 1.1.1.2/32
  endpointSelector: {}
---
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: deny
spec:
  egressDeny:
  - toCIDRSet:
    - cidr: 1.0.0.0/8
      except:
      - 1.1.1.1/32
  endpointSelector: {}
```

[2]: "Deny-deny" refers to the policy engine code path inside
     denyPreferredInsertWithChanges() under

```
if newEntry.IsDeny {
  ...
  ms.ForEachDeny()
  ...
}
```

[3]: Similar to [2], "deny-allow" refers to

```
if newEntry.IsDeny {
  ...
  ms.ForEachAllow()
  ...
}
```

[4]: Similar to [3], "allow-deny" refers to (note the ! in the
condition)

```
if !newEntry.IsDeny {
  ...
  ms.ForEachDeny()
  ...
}
```

Signed-off-by: Chris Tarazi <chris@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. kind/bug This is a bug in the Cilium logic. release-note/misc This PR makes changes that have no direct user impact. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants