-
Notifications
You must be signed in to change notification settings - Fork 3.4k
v1.18 Backports 2025-08-11 #41078
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.18 Backports 2025-08-11 #41078
Conversation
[ upstream commit 0c97c19 ] Previously the ingress annotation code for the external traffic policy was returning Cluster as a default. This would cause a reconciliation error when using the host network and prevent the service from being created. For the service to work in this mode, there should be no external traffic policy. To support this, the ingress service template has to be adjusted to check the host network flag before setting the policy Fixes: #34028 Signed-off-by: Rich Theobald <rich.theobald@gmail.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit d5b0ab2 ] To mimic the same method provided by [netip.Addr]. Additionally, let's reimplement the [AddrCluster.Less] on top of it. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit a36ae92 ] Let's make use of the newly introduced [AddrCluster.Compare] function, to ensure consistent ordering also when there are multiple backends associated with the same IP, but different ClusterID. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit f28cb90 ] Currently, the loadbalancer reconciler is affected by a bug causing the incorrect pruning of backends if associated with a non-zero ClusterID, because the ClusterID read from the map is ignored. Let's get this fixed by unifying the [BackendValue.{GetAddress,GetIPCluster]] methods, to ensure that all callers correctly account for the ClusterID. Additionally, let's also extend the unit tests to cover this use-case and prevent possible regressions. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit d8638b0 ] [ backporter's note: Precalculate the number of backends for all endpoints resources to make newBackends happy. ] Since the ports are the same for everything in the EndpointSlice just prepare the 'Ports' map once. Similarly for Backend as the data is the same for each address allocate it just once. To further reduce allocations pre-allocate the hash maps. Benchmark results with https://github.com/joamaki/cilium-lb-load-test: Current `main` (image tag deaee71): Creating 1000 Services and EndpointSlices (30 backends per service, 30000 in total) in namespace cni-load-test... Finished creating resources. Waiting 10 seconds to settle... Recording memory usage RSS (kB): 257724.0 (diff 55952.0) Reachable kB: 42516.21875 Allocations: 3875906 Allocated kB: 169957.71875 CPU time seconds: 1.5099999999999998 This commit: Creating 1000 Services and EndpointSlices (30 backends per service, 30000 in total) in namespace cni-load-test... Finished creating resources. Waiting 10 seconds to settle... Recording memory usage RSS (kB): 258784.0 (diff 50344.0) Reachable kB: 37336.3984375 Allocations: 3772102 Allocated kB: 162521.7578125 CPU time seconds: 1.4699999999999989 Roughly ~10% less memory allocated and reachable. Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 8e2f55d ] The 50ms minimum retry leads to very high load if there are many entries that do not fit into the LB BPF maps. Since the probability of a BPF syscall failing without maps being full is very unlikely, raise default to 1 second to avoid excessive load in situations where BPF maps overflow. Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 47eda94 ] The location of the extraEnv should be under the 'env:' section. Having it out of place, causes the yaml to be incorrectly formatted preventing the user from using this helm variable. Fixes: a39087e ("helm: add readinessProbe to clustermesh-apiserver") Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 1278c7d ] This test open-coded the flags for connectivity test, which meant that the flags were inconsistent in a few key areas like not gathering sysdump on failure. Fix it by reusing the test flags from earlier in the workflow. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 5198d28 ] [ backporter's note: The signature of the setupIPSecSuitePrivileged is different among upstream and v1.18. Adjust them. ] This commit fixed some test name in the ipsec suite by adding the "TestPrivileged" prefix, to make sure they're executed and not skipped. Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
[ upstream commit 3829092 ] Detect and show if endpoint slice is enabled in the agent. Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 5b0ae4d ] [ backporter's note: Wireguard code got some major change in the main recently, but v1.18 doesn't have the change. Fixed conflicts caused by this. ] Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 1c7b1d0 ] [ backporter's note: Fixed a minor import conflict. ] Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit e574c08 ] In some cases, the kernel version detection was not working. This commit fixes these bugs and add the tests for the missing cases. Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 3a749b4 ] Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 1077bde ] [ backporter's note: cilium-cli/features/status.go doesn't exist in v1.18 branch ] Skip label value checks in checkLabels and checkLabelValues when the allowed values set is empty, allowing unrestricted values for those labels. Fixes: 6659745 ("metric: provide way to declare labels.") Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit c697c7e ] One of these was for _GKE_ v1.15 which is ancient, but all of the other references to v1.15 are for Cilium and are also stale. Remove them. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
fe18dd9
to
f3f6752
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for backports of my PRs (including @cilium/github-sec owners paths) 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks good for my commit
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My commits look good, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks, LGTM!
Sorry for butting in, but what does this mean for the v1.18.1 release? Will this fix still be included? Very much looking forward to having this fixed 🙂 |
I'd say it's at risk of not making the v1.18.1 release as we're about to cut that release, but it could be pulled into a future release. Given the conflicts it's probably best if @joamaki as author prepares the backport. |
@joestringer fair enough, thanks for clarifying |
Backport for topology-aware fix in #41116 |
Just saw that the fix made it into the release pr. 🎉 Thanks for your efforts! |
This commit adds the TestPrivileged prefix to the node and ipsec linux tests. In main this is not needed as it is already running from within a TestPrivileged suite. While PRs related to similar fixes being backported from main #41078 and #41267 do sync up some tests, in v1.18 there were others that needed the TestPrivileged prefix to be added. Here's the fix. Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
This commit adds the TestPrivileged prefix to the node and ipsec linux tests. In main this is not needed as it is already running from within a TestPrivileged suite. While PRs related to similar fixes being backported from main #41078 and #41267 do sync up some tests, in v1.18 there were others that needed the TestPrivileged prefix to be added. Here's the fix. The last remaining bits will be the backport of #41279, which have not been added to this PR for consistency with history. Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
This commit adds the TestPrivileged prefix to the node and ipsec linux tests. In main this is not needed as it is already running from within a TestPrivileged suite. While PRs related to similar fixes being backported from main #41078 and #41267 do sync up some tests, in v1.18 there were others that needed the TestPrivileged prefix to be added. Here's the fix. The last remaining bits will be the backport of #41279, which have not been added to this PR for consistency with history. Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
PRs skipped due to conflicts:
Once this PR is merged, a GitHub action will update the labels of these PRs: