giorio94
Member

This ensures that the underlying operations (e.g., kvstore calls) are correctly aborted when the context gets closed, most relevantly during shutdown. In turn, this prevents the operator's shutdown from blocking until the grace period kicks in if the connection to the kvstore has failed.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
@giorio94 giorio94 added release-note/misc This PR makes changes that have no direct user impact. area/kvstore Impacts the KVStore package interactions. labels Nov 19, 2024
@giorio94 giorio94 requested review from a team as code owners November 19, 2024 10:22
@giorio94
Member Author

/test

Contributor

@marseel marseel left a comment

thanks, lgtm

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 19, 2024
@tklauser tklauser added this pull request to the merge queue Nov 19, 2024
Merged via the queue into cilium:main with commit 297a679 Nov 19, 2024
74 checks passed
@HadrienPatte
Member

This appears to solve an issue we are observing on 1.16: the operator sometimes panics on shutdown with this kind of trace:

trace
panic: limiter misuse: Allow / Wait / WaitN called concurrently after Stop
goroutine 499 [running]:
github.com/cilium/cilium/pkg/rate.(*Limiter).assertAlive(...)
/go/src/github.com/cilium/cilium/pkg/rate/limiter.go:68
github.com/cilium/cilium/pkg/rate.(*Limiter).WaitN(0xc00089ff00, {0x62a7cf0, 0xa0dc8e0}, 0x1)
/go/src/github.com/cilium/cilium/pkg/rate/limiter.go:100 +0x110
github.com/cilium/cilium/pkg/rate.(*Limiter).Wait(...)
/go/src/github.com/cilium/cilium/pkg/rate/limiter.go:91
github.com/cilium/cilium/pkg/kvstore/allocator.(*kvstoreBackend).RunGC(0xc00028ed90, {0x62a7cf0, 0xa0dc8e0}, 0xc00089ff00, 0xc009816420, 0x90000, 0x9ffff)
/go/src/github.com/cilium/cilium/pkg/kvstore/allocator/allocator.go:536 +0x14d1
github.com/cilium/cilium/pkg/allocator.(*Allocator).RunGC(0xc0002e5c00?, 0xc005024660?, 0xc007f8ff40?)
/go/src/github.com/cilium/cilium/pkg/allocator/allocator.go:864 +0x44
github.com/cilium/cilium/operator/identitygc.(*GC).runKVStoreModeGC(0xc0003c6620, {0x62a7e10, 0xc0007cb770})
/go/src/github.com/cilium/cilium/operator/identitygc/kvstore_gc.go:51 +0x17d
github.com/cilium/workerpool.(*WorkerPool).run.func1()
/go/src/github.com/cilium/cilium/vendor/github.com/cilium/workerpool/workerpool.go:181 +0x75
created by github.com/cilium/workerpool.(*WorkerPool).run in goroutine 498
/go/src/github.com/cilium/cilium/vendor/github.com/cilium/workerpool/workerpool.go:178 +0x5c

Would you consider backporting this change to 1.16 as a bugfix?

Also happy to open a separate issue if you prefer

@giorio94

@giorio94
Member Author

> Would you consider backporting this change to 1.16 as a bugfix?

Yeah, sounds reasonable to me. The change is trivial enough to make the backport risk very low.

That said, it would not properly fix the issue that you are observing, as that would require making sure that limiter.Stop is only called after runKVStoreModeGC has terminated, which is not the case right now.

@giorio94 giorio94 added the needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch label Feb 18, 2025
@jschwinger233 jschwinger233 mentioned this pull request Feb 19, 2025
3 tasks
@jschwinger233 jschwinger233 added backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. and removed needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Feb 19, 2025
@github-actions github-actions bot added backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. and removed backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. labels Feb 20, 2025