-
Notifications
You must be signed in to change notification settings - Fork 3.4k
v1.16 Backports 2025-06-17 #40093
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
v1.16 Backports 2025-06-17 #40093
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
[ upstream commit 2b24f33 ] If the ciliumbot has approved a PR from renovate, we can safely remove the other reviewers from the list. This will decrease the noise towards those reviewers. Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 6d2000c ] Whenever a pool is deleted, the operator deletes all LB ranges of that pool and then strip off each address allocation related to the removed ranges. While doing so, it attempts to immediately reconcile each service that lost an address, trying to allocating another address from an alive range. The reason to do so instead of running a full reconciliation is to improve efficiency: since each k8s object describing a service that lost an allocation should be patched, a later change to reallocate another address would lead to an additional update immediately after. However, trying to immediately reconcile a service while deleting the ranges might lead to issues in the allocation. In the case of a multi-range IP pool, if we try to reconcile a service while deleting the first range, we might end up allocating an IP from a range that belongs to the same pool, leading to a further removal and reconciliation in the next iteration. Moreover, the internal LBIPAM manager status is not consistent during the loop: while the pool has already been deleted, the ranges from that pool are being deleting one at a time. Therefore, a reconciliation attempt in the middle of the ranges deletion, might allocate an IP from a range with a non-existent parent pool, leading to spuriuos warnings like the following: `Bad state detected, store contains lbRange for pool 'pool-a' but missing the pool` To solve this the commit breaks the current deleteRangeAllocations in two parts. In the first one, we loop over the ranges to remove each of them alongside any allocation for an address contained in the range. Each modified service is just collected and returned. Then, after deleting the removed pool from the LBIPAM internal status, we finally reconcile each modified service, trying to reallocate an address immediately to avoid the double update. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit f4b9b76 ] When handling a pool deletion event, the LBIPAM manager loops over the removed pool ranges to strip off each address allocation with an IP contained in one of the ranges. While doing that, it also updates the poolToRanges internal map removing the range it is iterating on from the slice keyed by the deleted pool name. Removing items from a slice while iterating on them with a for range loop is incorrect, since the length of the slice is evaluated just once at the loop start. In case of a pool with multiple ranges, this may lead to the following panic: ``` $ go test -trimpath -v ./operator/pkg/lbipam/... -run="TestPoolDelete" === RUN TestPoolDelete --- FAIL: TestPoolDelete (0.00s) panic: runtime error: invalid memory address or nil pointer dereference [recovered] panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x2db94b8] goroutine 98 [running]: testing.tRunner.func1.2({0x3203820, 0x61279f0}) testing/testing.go:1734 +0x21c testing.tRunner.func1() testing/testing.go:1737 +0x35e panic({0x3203820?, 0x61279f0?}) runtime/panic.go:792 +0x132 github.com/cilium/cilium/operator/pkg/lbipam.(*rangesStore).Delete(0xc0003b0bf8, 0x0) github.com/cilium/cilium/operator/pkg/lbipam/range_store.go:34 +0x1d8 github.com/cilium/cilium/operator/pkg/lbipam.(*LBIPAM).handlePoolDeleted(0xc0003b0b60, {0x3d66d08, 0xc000653540}, 0xc0006e6b00) github.com/cilium/cilium/operator/pkg/lbipam/lbipam.go:1763 +0x1df github.com/cilium/cilium/operator/pkg/lbipam.(*LBIPAM).poolOnDelete(0xc0003b0b60, {0x3d66d08, 0xc000653540}, 0xc0006e6b00) github.com/cilium/cilium/operator/pkg/lbipam/lbipam.go:345 +0xe7 github.com/cilium/cilium/operator/pkg/lbipam.(*LBIPAM).handlePoolEvent(0xc0003b0b60, {0x3d66d08?, 0xc000653540?}, {{0x38bfe05, 0x6}, {{0xc00051e6d8, 0x6}, {0x0, 0x0}}, 0xc0006e6b00, ...}) github.com/cilium/cilium/operator/pkg/lbipam/lbipam.go:262 +0x87 github.com/cilium/cilium/operator/pkg/lbipam.(*newFixture).DeletePool(0xc00008bef0, 0xc0006836c0, 0xc0006e6b00) github.com/cilium/cilium/operator/pkg/lbipam/lbipam_fixture_test.go:192 +0x1a5 github.com/cilium/cilium/operator/pkg/lbipam.TestPoolDelete(0xc0006836c0) github.com/cilium/cilium/operator/pkg/lbipam/lbipam_test.go:1814 +0x385 testing.tRunner(0xc0006836c0, 0x3a35298) testing/testing.go:1792 +0xf4 created by testing.(*T).Run in goroutine 1 testing/testing.go:1851 +0x413 FAIL github.com/cilium/cilium/operator/pkg/lbipam 0.031s FAIL ``` The commit extends the TestDisablePool unit test to have multiple ranges in the fixture pools and then fixes the issue shallow copying the ranges slice before starting the iteration. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit d56ae81 ] This step can be parallelized so that this step runs faster to remove these directories. Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Jussi Maki <jussi@isovalent.com>
/test |
aanm
approved these changes
Jun 17, 2025
[ upstream commit 1a2faea ] If the command doesn't have the repository as part of the command, then gh will try to derive it from the .git directory. Since this directory doesn't exist, the command fails. Fixes: 2b24f33 (".github/workflows: remove reviewers if ciliumbot approved PR") Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Jussi Maki <jussi@isovalent.com>
477f925
to
cec6b34
Compare
pippolo84
approved these changes
Jun 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
/test |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
backport/1.16
This PR represents a backport for Cilium 1.16.x of a PR that was merged to main.
kind/backports
This PR provides functionality previously merged into master.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Once this PR is merged, a GitHub action will update the labels of these PRs: