Skip to content

Conversation

jrajahalme
Copy link
Member

Once this PR is merged, a GitHub action will update the labels of these PRs:

 35150 35534

[ upstream commit 69a5d95 ]

Consider following two policy map state entries:

1. Allow <ID>/TCP/80/15 -- L3 <ID> specified, TCP port 80 partially wildcarded (ports 80-81)
2. Deny */TCP/80/16 -- L3 wildcarded, TCP port 80 not wildcarded

When the deny entry is added, Cilium adds an additional deny entry:

3. Deny <ID>/TCP/80/16 -- L3 <ID> specified, TCP port 80 not wildcarded

This makes sure that the allow rule is not given precedence. This is
currently needed due to the logic datapath uses to select between the
L3-specific and L4-only entries when both are found.

There is no such additional entry logic in the allow/allow case, so we
rely on the datapath choosing the "right" entry. Eg., if the more
specific L4-only policy entry has L7 policy, and the less specific L3
policy not, then the datapath should select the more specific entry and
redirect the traffic to the L7 proxy for policy enforcement.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
[ upstream commit ebb4c47 ]

Policy map key can have nexthdr and destination port fields
wildcarded. Cilium datapath typically performs two policy map lookups,
one with the L3 identity mapped from the relevant IP address (destination
or source IP for egress or ingress policy, respectively). Datapath
chooses between the two policy entries based on two booleans in the
policy entry: 'wildcarded_protocol' and 'wildcarded_dport'. The problem
is that 'wildcarded_dport' is set only if the destination port is fully
wildcarded, so partially wildcarded port is considered as "not
wildcarded" for this logic.

Fix this for partially wildcarded ports by always choosing between
L4-only and L3 policy matches by the key prefix length so that the most
specific entry is always selected, selecting the entry with a precific
identity if the prefix lengths are the same. That is, choose the policy
with specific L3 if the prefixes are the same length, otherwise choose
the policy with the longer prefix.

LPM map key already contains the prefix length, but unfortunately LPM map
lookup does not return the map key. This is why we store an 8-bit version
of the prefix length in the policy map entry, which is as follows:

0: both nexthdr and destination port fully wildcarded
8: nexthdr specified but port fully wildcarded
24: both nexthdr and destination port specified (no wildcarding)

Values between 8 and 24 represent the less or more wildcarded destination port.

Previously used wildcarded_protocol and wildcarded_dport flags are removed as redundant.

Before this change the precedence between the following two rules:

1. <ID>/TCP/80/15 -- L3 <ID> specified, TCP port 80 partially wildcarded (ports 80-81)
2. */TCP/80/16 -- L3 wildcarded, TCP port 80 not wildcarded

would have resulted in selecting the policy with <ID> specified even when
it's destination port is partially wildcarded, while the L4-only rule has
no wildcarding in it's destination port.

If the entries above were allow (1) and deny (3), Cilium agent masked
this behavior by adding an additional deny entry in this case:

3. <ID>/TCP/80/16 -- L3 <ID> specified, TCP port 80 not wildcarded

This masks the incorrect behavior for allow/deny case, since the
non-wildcarded key (3) is given precedence by the datapath.

There is no such additional entry logic in the allow/allow case, so if
the more specific L4-only policy entry has L7 policy, and the less
specific L3 policy not, then matching traffic would not be sent to the L7
proxy for policy enforcement.

After this change the policy with less L4 wildcarding (the more specific
L4) is given precedence, with L3 rule being selected when prefixes are of
the same length. This allows for simplifying policy map computation by
avoiding the need to add additional entries in cases where deny already
has the longer prefix length.

This change makes no difference for keys which only have the destination
port fully specified or fully wildcarded, so this only affects policies
with port ranges as enabled in Cilium 1.16.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
[ upstream commit 60bc8fa ]

Move the prefix length field in the policy map value to the flags (same
byte as 'deny') to a previously unused space. This cleans up padding
space for future use.

Skip map entries with inconsistent prefix length field in value vs. key
so that the entry will be rewritten with the correct prefix length field
in the value. This fixes a potential issue in upgrade where a policy map
entry may remain with zero valued prefix length in value leading to
incorrect policy enforcement.

Fixes: cilium#35150

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
@jrajahalme jrajahalme added the kind/backports This PR provides functionality previously merged into master. label Oct 29, 2024
@jrajahalme jrajahalme requested a review from a team as a code owner October 29, 2024 08:23
@maintainer-s-little-helper maintainer-s-little-helper bot added the backport/1.16 This PR represents a backport for Cilium 1.16.x of a PR that was merged to main. label Oct 29, 2024
@jrajahalme
Copy link
Member Author

/test-backport-1.16

@jrajahalme jrajahalme added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Oct 29, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 30, 2024
@jrajahalme jrajahalme added this pull request to the merge queue Oct 30, 2024
Merged via the queue into cilium:v1.16 with commit 9055d28 Oct 30, 2024
64 checks passed
@jrajahalme jrajahalme deleted the pr/jrajahalme/bpf-policy-map-precedence-fix-v1.16 branch October 30, 2024 14:31
@ferozsalam ferozsalam added affects/v1.16 This issue affects v1.16 branch and removed affects/v1.16 This issue affects v1.16 branch labels Nov 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.16 This PR represents a backport for Cilium 1.16.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants