v1.17 Backports 2025-01-21 #37126
[ upstream commit cdecbcb ] GNU make on the host may use --jobserver-style=fifo (default on my machine). It also implies --jobserver-auth=fifo:/tmp/GMfifo$MAKE_PID, an undocumented flag, used internally by make and passed to the child instances of make. This flag appears in $(MAKEFLAGS). The cilium-build target in Documentation/Makefile passes MAKEFLAGS to another make instance, called in a docker image. The problem is that --jobserver-auth passed to make inside docker points to a file that doesn't exist in the container filesystem namespace, and make fails with an error like this: make: *** internal error: invalid --jobserver-auth string 'fifo:/tmp/GMfifo361142'. Stop. make: *** [Makefile:48: cilium-build] Error 2 make: Leaving directory '/home/max/.opt/go/src/github.com/cilium/cilium-snat/Documentation' Fix this by filtering out --jobserver-auth=... from MAKEFLAGS when passing it to make inside docker. Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 5a4fdc9 ] Fixes: #34754 Currently, the cilium-operator xRoute reconcilers reconcile all (non-GAMMA) xRoutes regardless of their parentage. This can lead to race conditions where the Cilium operator can clobber metadata updates made by other operators (i.e., the Kong Kubernetes Ingress Controller). This PR adjusts the xRoute reconcilers to skip routes associated with the Cilium gateway. Signed-off-by: Andrew E. Timmes <atimmes@seatgeek.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit ed4772f ] In preparation of the subsequent commits, rework the errors usage in rules sanitization. Specifically: - fix some linting errors (capitalized error strings, punctuation in error strings, useless usage of error formatting and so on) - use errors.Join to propagate errUnsupportedICMPWithToPorts alongside other possible validation errors - fix tests to rely on require.ErrorIs when checking for a specific error Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 947b8d3 ] Extract IngressCommonRule sanitization into its own method. This will be useful in a subsequent commit, where the sanitization for IngressDeny rules will be added. Since IngressCommonRule is embedded into both Ingress and IngressDeny rules, the new method can be used while sanitizing both rule types. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit bd170ba ] Extract EgressCommonRule sanitization into its own method. This will be useful in a subsequent commit, where the sanitization for EgressDeny rules will be added. Since EgressCommonRule is embedded into both Egress and EgressDeny rules, the new method can be used while sanitizing both rule types. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 557498b ] Add sanitization for IngressDeny rules. The sanitization applies the same logic used for Ingress rules, dropping the parts that are not relevant for the IngressDeny ones (e.g: IngressDeny has no L7 rules). Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 91cd89e ] Add sanitization for EgressDeny rules. The sanitization applies the same logic used for Egress rules, dropping the parts that are not relevant for the EgressDeny ones (e.g: EgressDeny has no L7 rules). Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 0e97e01 ] Add helper methods to {Egress,EgressDeny,EgressCommon}Rule to get the list of L3 members and the list of L3 members that are not compatible with ToPorts rules. This simplifies the extension and reuse of the sanitization logic in EgressCommonRule to EgressRule and EgressDenyRule. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 1fa46b2 ] Extend the current unit tests to take into account the sanitization of both IngressDeny and EgressDeny rules. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 963b5b2 ] 439f558 ("CLI: cilium upgrade preserver previous configurations") changed the default behavior of `cilium upgrade` to preserve the previous configuration upon upgrade. While this is the expected behavior in most situations, it turned out to be problematic in the ipsec upgrade/downgrade tests (specifically in case of patch matrix entries). Indeed, the images are not overwritten for the stable configuration, which in turn means the downgrade is never actually performed. Let's fix this by explicitly resetting the values when performing the upgrade and downgrade operations. Reported-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit ab10262 ] When *any* routing mode is supported, it makes no sense to list this as a requirement. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 19909d0 ] I believe this dependency originated from the need for `HaveLargeInstructionLimit`. As we now require a 5.4-equivalent kernel there is no more need to point out this requirement. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit d2c4e0f ] This is redundant, we now require a 5.4-equivalent kernel. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit f5f831f ] If KPR is false, setting these to true does not make any sense. Thus, remove this to avoid confusion. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 5dca7ad ] hostPort.enabled=true under KPR off is not supported given the former relies on the latter's infrastructure. Remove this from the docs to avoid confusion. Before 1a4e7d4 ("docs: Adjust KPR GSG to --kpr=boolean changes") this sentence was meant in the context of KPR=partial. Fixes: 1a4e7d4 ("docs: Adjust KPR GSG to --kpr=boolean changes") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 808b4cf ] Be consistent with the other features that are described on this page, and provide a snippet with the needed cilium config. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 3fd2281 ] Service events are not used when CNPs and CCNPs are disabled. However, we currently subscribe to the service cache, and enqueue all events in a local cache, in turn causing a memory leak in that case given that the events would never be dequeued. Let's get this fixed. Fixes: 253a05b ("policy: Add config for enabling Cilium NetworkPolicy") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 34e296e ] Talos Linux's 'Forwarding kube-dns to Host DNS' functionality doesn't work together with Cilium's BPF host routing. Signed-off-by: Philip Schmid <phisch@cisco.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 3fe873c ] The conformance external workloads workflow recently started failing consistently on the v1.15 branch with the following error: > msg="Failed to install iptables rules" error="cannot install static proxy rules: unable to run 'ip6tables -t mangle -A CILIUM_PRE_mangle -m socket --transparent -m mark ! --mark 0x00000e00/0x00000f00 -m mark ! --mark 0x00000800/0x00000f00 -m comment --comment cilium: any->pod redirect proxied traffic to host proxy -j MARK --set-mark 0x00000200' iptables command: exit status 1 stderr=\"Warning: Extension MARK revision 0 not supported, missing kernel module?\\nip6tables: No chain/target/match by that name.\\n\"" subsys=iptables This seems to be the consequence of a regression backported by Canonical into linux-image-5.15.0-127-generic. Let's circumvent this problem by upgrading the selected image family, which was now largely out of date. Related: https://bugs.launchpad.net/ubuntu/+source/linux-meta-oracle-5.15/+bug/2091960 Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 4638384 ] When the images are built from stable branches we should also publish floating images from those branches. Fixes: 5ce3112 (".github: simplify build-images-ci workflow") Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit dc5037b ] Documented the known limitation of Cilium's eBPF host routing when relying on netfilter hooks at the same time. Signed-off-by: Philip Schmid <phisch@cisco.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 684a62e ] The new version of the multicluster plugins requires CoreDNS 1.11.4, so this updates the corresponding MCS-API docs on how to recompile CoreDNS to reflect that. Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit fceab4f ] The labels configuration for the agent is quite important to how Cilium respects the labels of workloads in order to properly enforce network policy. Through recent review a few ambiguities were pointed out to me, so this commit attempts to clarify those points. Thanks to Filip Nikolic and André Martins for the detailed discussion that motivated this wording change. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit e53c222 ] Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit c835ca8 ] The check-ipsec-leak.bt script reports errors for plain-text packets (non-ESP only, this is not used for Wireguard) captured from the bridge when: 1. the packets have pod IP as both src and dst; 2. or the packets have either src or dst IP inside the pod CIDR, plus the packets are from the proxy ($trace_override). Let's focus on 2. In the normal case of pod->world via L7, the proxy emits packets like pod->world, but masquerading (nf/bpf) snats the packets to node->world, therefore the script never sees a pod IP on the bridge and no errors are reported. In the case of egress gateway via L7, the packet is first forwarded to the gateway node after the L7 policy check with the pod->world notation, and the gateway node later forwards the request as gateway->world. Here the script sees that first packet with src pod IP && from_proxy, triggering the (false) alarm. This PR makes sure that this type of packet won't be tracked as from_proxy ($trace_override). We currently avoid this "false" alarm by running egress-gateway connectivity tests separately from all the others and with no leak-detection script running. As we hope to introduce leak detection in more and more CI tests, let's add this check to skip tracking unwanted traffic. Signed-off-by: Simone Magnani <simone.magnani@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit a6ea748 ] In the previous commit, we introduced to the bpftrace script the capability to skip tracking traffic with a destination address outside the pod CIDRs. At this point, in conformance-ipsec-e2e tests we no longer need to split tests (egress-gateway vs all the others), as previously done in #35323, "workflows/ipsec: Cover egress gateway", ea6bdac. Signed-off-by: Simone Magnani <simone.magnani@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit e48dda2 ] Cleans up a few things: - Documents the ability to select CCG by labels - Mentions that hubble flows are now labeled - Links from main policy page. Signed-off-by: Casey Callendrello <cdc@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 45d638a ] This commit passes down the targetPort in an MCS-API derived Service from the local Service. We only need targetPort for local Endpoints, so we do not need to sync it globally, as ports are synced from endpoints on remote clusters that already account for targetPort. Thus we just take the ports merged globally from the ServiceImport and add the context of the targetPorts if we need to on the local cluster. This also clarifies the EndpointSlice sync regarding target port: the targetPort field is now available in the slim Service copy done by Cilium, but we shouldn't actually use it, as targetPort may be a string and we do not have this info from remote clusters. Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit 6d90e39 ] Do not explicitly call RunMetadataResolver in case the pod namespace label has not been retrieved yet (e.g., the pod informer has not yet received the creation event for the pod), given that the same operation is always guaranteed to be started immediately afterwards as well. Moreover, this duplicate call can eventually cause the endpoint regeneration state machine to go out of sync, leaving the endpoint stuck in the "waiting-to-regenerate" state forever. Specifically, the above problem occurs due to the following sequence of events:
1. A new pod gets created and scheduled onto a node, in turn eventually leading to an endpoint creation API request.
2. When the request is processed, the pod informer has not yet received the pod creation event, hence causing the pod retrieval to fail.
3. At this point, the set of identity labels only contains the init one (because we couldn't retrieve the pod labels), hence triggering the first call to ep.RunMetadataResolver. In turn, this starts a controller in the background (not blocking, as the second parameter is false), which attempts to retrieve the labels again, sets the identity (using the init one in case of retrieval failure) and triggers the endpoint regeneration. The endpoint first transitions into the waiting-to-regenerate state; eventually, the regeneration starts to be processed, and the endpoint moves into the regenerating state.
4. ep.RunMetadataResolver is called a second time; this step would normally be blocking (the second parameter is true), but in case the labels are already correct due to the previous call to the same function, no regeneration is required, eventually returning regenTriggered = false.
5. Given that regenTriggered is false, the regeneration is triggered again immediately afterwards [1], transitioning the endpoint into the waiting-to-regenerate state again. It is important to note that in this case we propagate the endpoint creation context as part of the regeneration event, rather than using the endpoint context. This appears correct, but ends up being problematic in this specific case.
6. In the background, the initial regeneration triggered in 3. completes correctly. The endpoint transition into the ready state is attempted, but skipped (correctly) because another regeneration has been scheduled subsequently (in 5.). At the same time, WaitForFirstRegeneration [2] unblocks (given that the regeneration completed), letting the endpoint creation API request terminate. This, in turn, causes the associated context to be canceled.
7. At this point, there's still the endpoint regeneration triggered in 5. that needs to be executed; however, depending on the timing, the handling may be aborted in [3], given that the context has just been canceled. While this would normally indicate that the endpoint is terminating, hence not causing any problems, in this case it leaves the endpoint stuck in the waiting-to-regenerate state, from which it will never recover (as all subsequent regeneration triggers would be treated as duplicates).
Dropping the duplicate call to ep.RunMetadataResolver prevents this problem, as it ensures that we don't unnecessarily trigger a second regeneration associated with the API context. The fallback regeneration should actually never be required when operating in Kubernetes mode, because ep.RunMetadataResolver will always cause a new identity to be assigned to the endpoint, in turn triggering regeneration.
[1]: https://github.com/cilium/cilium/blob/a9bcb0cb3858749f584c48cf5922f060c5f5871e/daemon/cmd/endpoint.go#L569-L582 [2]: https://github.com/isovalent/cilium/blob/a9bcb0cb3858749f584c48cf5922f060c5f5871e/daemon/cmd/endpoint.go#L573-L575 [3]: https://github.com/cilium/cilium/blob/a9bcb0cb3858749f584c48cf5922f060c5f5871e/pkg/endpoint/events.go#L64-L71 Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
Compare: 56dd121 to 5cbdc10
Thanks 🙏! All my changes look good!
/test
Many thanks, my commits look good 👍
My commits look good, thanks!
Thanks!
Looks good for my PR, thanks! 💯
Looks good for my change, thanks!
Once this PR is merged, a GitHub action will update the labels of these PRs: