-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Endpoint leave locking 1.15 #35639
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
jrajahalme
merged 7 commits into
cilium:v1.15
from
jrajahalme:endpoint-leave-locking-1.15
Oct 31, 2024
Merged
Endpoint leave locking 1.15 #35639
jrajahalme
merged 7 commits into
cilium:v1.15
from
jrajahalme:endpoint-leave-locking-1.15
Oct 31, 2024
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
[ upstream commit da05633 ] Currently on shutdown we don't remove the health endpoint from the endpoint manager but do remove the corresponding netlink interface. Note that validateEndpoint() special cases the health endpoint and does remove the old state directory if it existed before. But independent of that it would be better to just streamline cleanup via cleanupHealthEndpoint() for all cases. Related: cilium#32689 Signed-off-by: Robin Gögge <r.goegge@isovalent.com> Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 09633b8 ] We log whenever we start the health endpoint. Also add a log entry whenever we attempt to clean up. Also add a small comment in the cleanup routine for device removal. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit f7f455d ] Do not open code it in the health code. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit a90b980 ] Otherwise we might run risk of hitting the timeout too early. Just reping right away upon success. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 60e5080 ] For the health endpoint the agent needs to remove devices since its not managed by CNI. Add a comment explaining this. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
…nish [ upstream commit f59fe4e ] Endpoint deletions can race with pending builds as detailed in cilium#32689. This is due to the fact that the Delete() call does not wait for them to finish. Instead, it tears down the endpoint, downs the device, removes tc(x), etc. This is not correct since the inflight from regenerate() can later still attempt to attach something to the given device. This in itself is not problematic as the CNI will just remove the device and the kernel removes all objects along with it. However, this can result in sporadic agent error logs that a given lxc* device has already disappeared when the agent tries to attach, but Delete() resp the CNI Del finished. We've seen this in particular in case of lxc_health. The health endpoint is not managed by the CNI plugin, but the agent instead. What appears to happen is that when health endpoint is cleaned, it's removed from the ep manager, and subsequently the device is removed.. however any ongoing regenerate() later tries to attach to a non-existing device. Given the Delete() updates the endpoint state, we also need to wrap this with the build mutex lock. This is to wait for any pending regenerate() to finish before setting the new ep state, and any new build requests will bail out early as they see that the endpoint is disconnecting. Kudos to Tom Hadlaw and Robin Goegge! Fixes: cilium#32689 Co-developed-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 2dad63c ] The endpoints buildMutex is commented as intending to be held while doing endpoint build as well as delete. However, this lock is not actually held during (*Endpoint).Delete(). As a result, there is a race condition that can occur where an endpoint in the process of doing `e.owner.GetCompilationLock().RLock()` may be deleted while waiting for this lock. This is made more likely as the write lock on the compilation lock can be held for several minutes at times. When the regenerateBPF finally does acquire the lock, it may find that it's underlying interface has already been deleted. Specifically, in the case of the health endpoints lxc_health interface, this is removed after some ping timeout and endpoint-manager is called to delete the endpoint. However if the endpoint waiting for the compilation lock at the same time it will still attempt to regenerate the endpoint once the rlock becomes available, leading to errors such as: ``` msg="Error while reloading endpoint BPF program" ciliumEndpointName=/ containerID= containerInterface= datapathPolicyRevision=0 desiredPolicyRevision=1 endpointID=62 error="retrieving device lxc_health: Link not found" identity=4 ipv4=10.244.0.18 ipv6="fd00:10:244::9647" k8sPodName=/ subsys=endpoint ``` This adds the build mutex locking to (*Endpoint).Delete(), such that the endpoint cannot deleted while it is being regenerated. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
671e3fb
to
71f3700
Compare
/test-backport-1.15 |
squeed
approved these changes
Oct 30, 2024
/ci-clustermesh |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
backport/1.15
This PR represents a backport for Cilium 1.15.x of a PR that was merged to main.
kind/backports
This PR provides functionality previously merged into master.
ready-to-merge
This PR has passed all tests and received consensus from code owners to merge.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backporting to stable releases as deleting endpoint without taking the locks could lead to redirects being deleted while regeneration is waiting for an ACK from Envoy for a Network Policy update. Depending on how events line up it is possible that the Envoy listener gets deleted before it progresses to send the ACK, which would leave the regeneration to wait until the 5.5 minute timeout and then spam the logs with regeneration failure messages.
Once this PR is merged, a GitHub action will update the labels of these PRs: