Skip to content

Conversation

rastislavs
Copy link
Contributor

@rastislavs rastislavs commented Jan 21, 2025

Once this PR is merged, a GitHub action will update the labels of these PRs:

 36788 35718 36598 36827 36858 36861 36856 36878 36727 36852 36859 36905 36913 36921 36769 36899 36917 36924 36892 36364 36932 36935 36953 36966 36819 36929 36371 36433 36982 36853 36649 36602 37008 36798 37005 37054 36980 37048 36823 37042 37081 37053 37059 36875 37086

gentoo-root and others added 30 commits January 21, 2025 12:55
[ upstream commit cdecbcb ]

GNU make on the host may use --jobserver-style=fifo (default on my
machine). It also implies --jobserver-auth=fifo:/tmp/GMfifo$MAKE_PID,
an undocumented flag, used internally by make and passed to the child
instances of make. This flag appears in $(MAKEFLAGS).

The cilium-build target in Documentation/Makefile passes MAKEFLAGS to
another make instance, called in a docker image. The problem is that
--jobserver-auth passed to make inside docker points to a file that
doesn't exist in the container filesystem namespace, and make fails with
an error like this:

make: *** internal error: invalid --jobserver-auth string 'fifo:/tmp/GMfifo361142'.  Stop.
make: *** [Makefile:48: cilium-build] Error 2
make: Leaving directory '/home/max/.opt/go/src/github.com/cilium/cilium-snat/Documentation'

Fix this by filtering out --jobserver-auth=... from MAKEFLAGS when
passing it to make inside docker.

Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 5a4fdc9 ]

Fixes: #34754

Currently, the cilium-operator xRoute reconcilers reconcile all
(non-GAMMA) xRoutes regardless of their parentage. This can lead to
race conditions where the Cilium operator can clobber metadata updates
made by other operatores (ie, the Kong Kubernetes Ingress Controller).
This PR adjusts the xRoute reconcilers to skip routes associated with the Cilium gateway.

Signed-off-by: Andrew E. Timmes <atimmes@seatgeek.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit ed4772f ]

In preparation of the subsequent commits, rework the errors usage in
rules sanitization. Specifically:

- fix some linting errors (capitalized error strings, punctuation in
  error strings, useless usage of error formatting and so on)
- use errors.Join to propagate errUnsupportedICMPWithToPorts alongside
  other possible validation errors
- fix tests to rely on require.ErrorIs when checking for a specific
  error

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 947b8d3 ]

Extract IngressCommonRule sanitization into its own method. This will be
useful in a subsequent commit, where the sanitization for IngressDeny
rules will be added. Since IngressCommonRule is embedded into both
Ingress and IngressDeny rules, the new method can be used while
sanitizing both rule types.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit bd170ba ]

Extract EgressCommonRule sanitization into its own method. This will be
useful in a subsequent commit, where the sanitization for EgressDeny
rules will be added. Since EgressCommonRule is embedded into both Egress
and EgressDeny rules, the new method can be used while sanitizing both
rule types.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 557498b ]

Add sanitization for IngressDeny rules. The sanitization applies the
same logic used for Ingress rules, dropping the parts that are not
relevant for the IngressDeny ones (e.g: IngressDeny has no L7 rules).

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 91cd89e ]

Add sanitization for EgressDeny rules. The sanitization applies the same
logic used for Egress rules, dropping the parts that are not relevant
for the EgressDeny ones (e.g: EgressDeny has no L7 rules).

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 0e97e01 ]

Add helper methods to {Egress,EgressDeny,EgressCommon}Rule to get the
list of L3 members and the list of L3 members that are not compatible
with ToPorts rules.

This simplifies the extension and reuse of the sanitization logic in
EgressCommonRule to EgressRule and EgressDenyRule.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 1fa46b2 ]

Extend the current unit tests to take into account the sanitization of
both IngressDeny and EgressDeny rules.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 04bd9a1 ]

This was missed in #36534

Signed-off-by: Tam Mach <tam.mach@cilium.io>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 963b5b2 ]

439f558 ("CLI: cilium upgrade preserver previous configurations")
changed the default behavior of `cilium upgrade` to preserve the
previous configuration upon upgrade. While this is the expected behavior
in most situations, it turned out being problematic in the ipsec upgrade
downgrade tests (specifically in case of patch matrix entries). Indeed,
the images are not overwritten for the stable configuration, leading in
turn to not actually performing the downgrade. Let's fix this by explicitly
resetting the values when performing the upgrade and downgrade operations.

Reported-by: Jarno Rajahalme <jarno@isovalent.com>
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit ab10262 ]

When *any* routing mode is supported, it makes no sense to list this as
requirement.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 19909d0 ]

I believe this dependency originated from the need for
`HaveLargeInstructionLimit`. As we now require a 5.4-equivalent kernel
there is no more need to point out this requirement.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit d2c4e0f ]

This is redundant, we now require a 5.4-equivalent kernel.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit f5f831f ]

If KPR is false, setting these to true does not make any sense. Thus, remove
this to avoid confusion.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 5dca7ad ]

hostPort.enabled=true under KPR off is not supported given the former
relies on the latter's infrastructure. Remove this from the docs to
avoid confusion.

Before 1a4e7d4 ("docs: Adjust KPR GSG to --kpr=boolean changes")
this sentence was meant in the context of KPR=partial.

Fixes: 1a4e7d4 ("docs: Adjust KPR GSG to --kpr=boolean changes")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 808b4cf ]

Be consistent with the other features that are described on this page,
and provide a snippet with the needed cilium config.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 3fd2281 ]

Service events are not used when CNPs and CCNPs are disabled. However,
we currently subscribe to the service cache, and enqueue all events in
a local cache, in turn causing a memory leak in that case given that
the events would never be dequeued. Let's get this fixed.

Fixes: 253a05b ("policy: Add config for enabling Cilium NetworkPolicy")
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 34e296e ]

Talos Linux's 'Forwarding kube-dns to Host DNS' functionality doesn't
work together with Cilium's BPF host routing.

Signed-off-by: Philip Schmid <phisch@cisco.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 3fe873c ]

The conformance external workloads workflow recently started failing
consistently on the v1.15 branch with the following error:

> msg="Failed to install iptables rules" error="cannot install static proxy rules:
    unable to run 'ip6tables -t mangle -A CILIUM_PRE_mangle -m socket --transparent
    -m mark ! --mark 0x00000e00/0x00000f00 -m mark ! --mark 0x00000800/0x00000f00
    -m comment --comment cilium: any->pod redirect proxied traffic to host proxy
    -j MARK --set-mark 0x00000200' iptables command: exit status 1 stderr=\"Warning:
    Extension MARK revision 0 not supported, missing kernel module?\\nip6tables:
    No chain/target/match by that name.\\n\"" subsys=iptables

This seems to be the consequence of a regression backported by Canonical
into linux-image-5.15.0-127-generic. Let's circumvent this problem by
upgrading the selected image family, which was now largely out of date.

Related: https://bugs.launchpad.net/ubuntu/+source/linux-meta-oracle-5.15/+bug/2091960
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit c6f84b2 ]

Relates: 9991bfc
Signed-off-by: Tam Mach <tam.mach@cilium.io>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 4638384 ]

When the images are built from stable branches we should also publish
floating images from those branches.

Fixes: 5ce3112 (".github: simplify build-images-ci workflow")
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit dc5037b ]

Documented the known limitation of Cilium's eBPF host routing
when relying on netfilter hooks at the same time.

Signed-off-by: Philip Schmid <phisch@cisco.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 900cb76 ]

This limitation can be removed after #35143.

Relates: #35143
Relates: #24318
Signed-off-by: Tam Mach <tam.mach@cilium.io>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 684a62e ]

The new version of the multicluster plugins needs coredns 1.11.4, so this is
updating the corresponding MCS-API docs on how to recompile CoreDNS to
reflect that.

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 757a56c ]

Fixes: cdecbcb ("build: Filter out --jobserver-auth from MAKEFLAGS")
Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit fceab4f ]

The labels configuration for the agent is quite important to how Cilium
respects the labels of workloads in order to properly enforce network
policy. Through recent review a few ambiguities were pointed out to me,
so this commit attempts to clarify those points.

Thanks to Filip Nikolic and André Martins for the detailed discussion
that motivated this wording change.

Signed-off-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit e53c222 ]

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit c835ca8 ]

The check-ipsec-leak.bt script reports errors for plain-text packets
(non-ESP only, this is not used for Wireguard) captured from bridge when:

1. the packets have pod IP as both src and dst
2. or the packets have either src or dst IP inside pod CIDR, plus the
   packets are from proxy ($trace_override)

Let's focus on 2.

In the normal case of pod->world via L7, proxy emits packets like pod->world,
but masquerading (nf/bpf) snats the packets to node->world, therefore
the script never sees pod IP on bridge, no errors are reported.

In the case of egress gateway via L7, the packet is first forwarded to the
gateway node after L7 policy check with the pod->world notation, and the
gateway node later forwards the request using to gateway->world.
Here the script sees that 1st packet with src pod IP && from_proxy,
triggering the (false) alarm.

This PR is to make sure that these type of packets won't be tracked as
from_proxy ($trace_override). We currently avoid this "false" alarm by
running egress-gateway connectivity tests separately from all the others
and with no leak-detection script running.
As we're going to introduce leak detection in more and more CI tests
hopefully, let's add this check to skip tracking unwanted traffic.

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit a6ea748 ]

In the previous commit, we introduced to the bpftrace the capability to
skip tracking traffic with destination address outside pod CIDRs.
At this point, in conformance-ipsec-e2e tests we do not need to split tests
anymore (egress-gateway vs all the others), previously modified in #35323,
"workflows/ipsec: Cover egress gateway", ea6bdac.

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
squeed and others added 3 commits January 21, 2025 13:48
[ upstream commit e48dda2 ]

Cleans up a few things:
- Documents the ability to select CCG by labels
- Mentions that hubble flows are now labeled
- Links from main policy page.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 45d638a ]

This commit pass down the targetPort in a MCS-API derived Service from the
local Service. We only need targetPort for local Endpoints so we do not
need to globally sync this as ports are synced from endpoints on remote
clusters that already account for targetPort. Thus we just take the ports merged
globally from the ServiceImport and add the context of the targetPorts if
we need to on the local cluster.

This also clarifies the EndpointSlice sync on target port as the targetPort
field  is now available into the slim Service copy done by Cilium be we
shouldn't actually use it as targetPort may be a string and we do not
have this info from remote clusters.

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 6d90e39 ]

Do not explicitly call RunMetadataResolver in case the pod namespace
label has not been retrieved yet (e.g., the pod informer has not yet
received the creation event for the pod), given that the same operation
is always guaranteed to be started immediately afterwards as well.
Moreover, this duplicate call can eventually cause the endpoint
regeneration state machine to go out of sync, leaving the endpoint
stuck in "waiting-to-regenerate" state forever.

Specifically, the above problem occurs due to the following sequence
of events:

1. A new pod gets created and scheduled onto a node, in turn eventually
   leading to an endpoint creation API request.
2. When the request is processed, the pod informer has not yet received
   the pod creation event, hence causing the pod retrieval to fail.
3. At this point, the set of identity labels only contains the init one
   (because we couldn't retrieve the pod labels), hence triggering the
   first call to ep.RunMetadataResolver. In turn, this starts a controller
   in the background (not blocking, as the second parameter is false),
   which attempts to retrieve the labels again, sets the identity (using
   the init one in case of retrieval failure) and triggers the endpoint
   regeneration. The endpoint first transitions into waiting-to-regenerate
   state); eventually, the regeneration starts to be processed, and the
   endpoint moves into regenerating state.
4. ep.RunMetadataResolver is called a second time; this step would
   normally be blocking (the second parameter is true), but in case
   the labels are already correct due to the previous call to the same
   function, no regeneration would be required, eventually returning
   regenTriggered = false;
5. Given that regenTriggered is false, the regeneration is triggered
   again immediately afterwards [1], transitioning the endpoint into
   waiting-to-regenerate state again. It is important to note that in
   this case we propagate the endpoint creation context as part of the
   regeneration event, rather than using the endpoint context. This
   appears correct, but ends up being problematic in this specific case.
6. In background, the initial regeneration triggered in 3. completes
   correctly. The endpoint is attempted to be transitioned into ready
   state, but that's skipped (correctly) because another regeneration
   has been scheduled subsequently (in 5.). At the same time,
   WaitForFirstRegeneration [2] unblocks (given that the regeneration
   completed), letting the endpoint creation API request to terminate.
   This, in turn, causes the associated context to be canceled.
7. At this point, there's still the endpoint regeneration triggered in
   5. that needs to be executed; however, depending on the timing, the
   handling may be aborted in [3], given that the context has just been
   canceled; while this would normally indicate that the endpoint is
   terminating, hence not causing any problems, in this case it leaves
   the endpoint stuck into the waiting-to-regenerate state, from which
   it will never recover (as all subsequent regeneration triggers would
   be treated as duplicates).

Dropping the duplicate call to ep.RunMetadataResolver prevents this
problem, as it ensures that we don't unnecessarily trigger a second
regeneration associated with the API context. The fallback regeneration
should actually never required when operating in Kubernetes mode,
because ep.RunMetadataResolver will always cause a new identity to
be assigned to the endpoint, in turn triggering regeneration.

[1]: https://github.com/cilium/cilium/blob/a9bcb0cb3858749f584c48cf5922f060c5f5871e/daemon/cmd/endpoint.go#L569-L582
[2]: https://github.com/isovalent/cilium/blob/a9bcb0cb3858749f584c48cf5922f060c5f5871e/daemon/cmd/endpoint.go#L573-L575
[3]: https://github.com/cilium/cilium/blob/a9bcb0cb3858749f584c48cf5922f060c5f5871e/pkg/endpoint/events.go#L64-L71

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
@rastislavs rastislavs force-pushed the pr/v1.17-backport-2025-01-21-12-55 branch from 56dd121 to 5cbdc10 Compare January 21, 2025 12:53
@rastislavs rastislavs temporarily deployed to release-base-images January 21, 2025 12:53 — with GitHub Actions Inactive
Copy link
Member

@MrFreezeex MrFreezeex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 🙏! all my changes looks good!

@rastislavs rastislavs temporarily deployed to release-base-images January 21, 2025 13:01 — with GitHub Actions Inactive
@rastislavs
Copy link
Contributor Author

/test

@rastislavs rastislavs marked this pull request as ready for review January 21, 2025 13:05
@rastislavs rastislavs requested review from a team as code owners January 21, 2025 13:05
Copy link
Contributor

@smagnani96 smagnani96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks, my commits looks good 👍

Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My commits look good, thanks!

@nathanjsweet
Copy link
Member

Thanks!

Copy link
Member

@pippolo84 pippolo84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for my PR, thanks! 💯

@rastislavs rastislavs temporarily deployed to release-base-images January 21, 2025 16:41 — with GitHub Actions Inactive
Copy link
Contributor

@rectified95 rectified95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for my change

@michi-covalent michi-covalent added this pull request to the merge queue Jan 22, 2025
Merged via the queue into v1.17 with commit 55a45cb Jan 22, 2025
310 checks passed
@michi-covalent michi-covalent deleted the pr/v1.17-backport-2025-01-21-12-55 branch January 22, 2025 06:09
@pippolo84 pippolo84 mentioned this pull request Jun 26, 2025
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.17 This PR represents a backport for Cilium 1.17.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master.
Projects
None yet
Development

Successfully merging this pull request may close these issues.