-
Notifications
You must be signed in to change notification settings - Fork 3.4k
bgpv2,operator: Fix the race condition in the nodeSelector conflict detection logic #35690
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
julianwiedmann
merged 2 commits into
cilium:main
from
YutaroHayakawa:deflake-bgpv2-conflicting-cluster-config-test
Nov 7, 2024
Merged
bgpv2,operator: Fix the race condition in the nodeSelector conflict detection logic #35690
julianwiedmann
merged 2 commits into
cilium:main
from
YutaroHayakawa:deflake-bgpv2-conflicting-cluster-config-test
Nov 7, 2024
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Currently, we're not tracking the CiliumBGPNodeConfig changes. We must track it to recover from the nodeSelector conflict for example. Without tracking it, the scenario like the following is possible: 1. NodeA is reffered from ClusterConfigA and ClusterConfigB. Currently, ClusterConfigB "owns" the node. 2. ClusterConfigB changed the nodeSelector. NodeA is no longer selected by ClusterConfigB. 3. ClusterConfigA suppose to create a new NodeConfig for NodeA, but the reconciliation was not triggered because we're not watching the NodeConfig changes. Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
4c84f25
to
4a06051
Compare
In the current implementation, we fetch the CiliumBGPNodeConfig for the node twice in the update process. One is in the getMatchingNodes and another is in the upsertNodeConfig. The former checks the owner reference, but the latter does not. As a result, if the NodeConfig cache is updated during those two, the operator may not catch the fact that the NodeConfig is already owned by someone else. And since we haven't watched the NodeConfig (see the previous commit), the operator will never have a chance to update the ConflictingClusterConfig condition. The concrete scenario would be: 1. ClusterConfigA and ClusterConfigB are created. Both selects the NodeA. 2. The operator reconciles ClusterConfigA and create a NodeConfig for the NodeA. 3. The operator reconciles ClusterConfigB next. In the getMatchingNodes, the operator didn't see the NodeConfig created by the reconciliation of the ClusterConfigA. 4. The NodeConfig cache is updated. However, it will not be notified to the operator since we don't watch the NodeConfig. 5. The operator calls upsertNodeConfig. It fetches the NodeConfig from the cache, but don't check the owner, so proceed with the DeepEqual check and since there was no diff, did nothing. 6. The reconciliation of ClusterConfigA and ClusterConfigB is done. The operator didn't see any conflict in this reconciliation cycle. 7. Since the NodeConfig cache update didn't trigger the reconciliation, the operator never notice the conflict until next reconciliation. This issue doesn't appear if the cache update is done before getMatchingNodes. The TestConflictingClusterConfigCondition was flaky due to this undeterministic behavior. This commit fixes the issue by combining the getMatchingNodes and upsertNodeConfig and fetching NodeConfig only once in the upsert logic. We made some cleanup for the existing logic as a part of this fix. Most of them are identical, but we removed the adhoc retry in the upsertNodeConfig and deleteStaleNodeConfig which is not required since we have a retry. Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
4a06051
to
607e918
Compare
rastislavs
approved these changes
Nov 5, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, thanks!
/test |
1 task
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area/bgp
Impacts the Border Gateway Protocol feature.
backport/author
The backport will be carried out by the author of the PR.
backport-done/1.16
The backport for Cilium 1.16.x for this PR is done.
ready-to-merge
This PR has passed all tests and received consensus from code owners to merge.
release-note/bug
This PR fixes an issue in a previous release of Cilium.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Please see the commit description. The first commit should be backported to the v1.16 branch.