Skip to content

bgpv2,operator: Fix the race condition in the nodeSelector conflict detection logic #35690

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

YutaroHayakawa
Copy link
Member

@YutaroHayakawa YutaroHayakawa commented Nov 1, 2024

Please see the commit description. The first commit should be backported to the v1.16 branch.

bgpv2,operator: Fix the race condition in the nodeSelector conflict detection logic

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 1, 2024
@YutaroHayakawa YutaroHayakawa added release-note/bug This PR fixes an issue in a previous release of Cilium. backport/author The backport will be carried out by the author of the PR. needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Nov 1, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Nov 1, 2024
Currently, we're not tracking the CiliumBGPNodeConfig changes. We must
track it to recover from the nodeSelector conflict for example. Without
tracking it, the scenario like the following is possible:

1. NodeA is reffered from ClusterConfigA and ClusterConfigB. Currently,
   ClusterConfigB "owns" the node.
2. ClusterConfigB changed the nodeSelector. NodeA is no longer selected
   by ClusterConfigB.
3. ClusterConfigA suppose to create a new NodeConfig for NodeA, but the
   reconciliation was not triggered because we're not watching the
   NodeConfig changes.

Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
@YutaroHayakawa YutaroHayakawa force-pushed the deflake-bgpv2-conflicting-cluster-config-test branch 2 times, most recently from 4c84f25 to 4a06051 Compare November 5, 2024 06:36
@YutaroHayakawa YutaroHayakawa marked this pull request as ready for review November 5, 2024 06:36
@YutaroHayakawa YutaroHayakawa requested a review from a team as a code owner November 5, 2024 06:36
In the current implementation, we fetch the CiliumBGPNodeConfig for the
node twice in the update process. One is in the getMatchingNodes and
another is in the upsertNodeConfig. The former checks the owner
reference, but the latter does not.

As a result, if the NodeConfig cache is updated during those two, the
operator may not catch the fact that the NodeConfig is already owned by
someone else. And since we haven't watched the NodeConfig (see the
previous commit), the operator will never have a chance to update the
ConflictingClusterConfig condition. The concrete scenario would be:

1. ClusterConfigA and ClusterConfigB are created. Both selects the
   NodeA.
2. The operator reconciles ClusterConfigA and create a NodeConfig for
   the NodeA.
3. The operator reconciles ClusterConfigB next. In the getMatchingNodes,
   the operator didn't see the NodeConfig created by the reconciliation
   of the ClusterConfigA.
4. The NodeConfig cache is updated. However, it will not be notified to
   the operator since we don't watch the NodeConfig.
5. The operator calls upsertNodeConfig. It fetches the NodeConfig from
   the cache, but don't check the owner, so proceed with the DeepEqual
   check and since there was no diff, did nothing.
6. The reconciliation of ClusterConfigA and ClusterConfigB is done. The
   operator didn't see any conflict in this reconciliation cycle.
7. Since the NodeConfig cache update didn't trigger the reconciliation,
   the operator never notice the conflict until next reconciliation.

This issue doesn't appear if the cache update is done before
getMatchingNodes. The TestConflictingClusterConfigCondition was flaky
due to this undeterministic behavior.

This commit fixes the issue by combining the getMatchingNodes and
upsertNodeConfig and fetching NodeConfig only once in the upsert logic.

We made some cleanup for the existing logic as a part of this fix. Most
of them are identical, but we removed the adhoc retry in the
upsertNodeConfig and deleteStaleNodeConfig which is not required since
we have a retry.

Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
@YutaroHayakawa YutaroHayakawa force-pushed the deflake-bgpv2-conflicting-cluster-config-test branch from 4a06051 to 607e918 Compare November 5, 2024 06:51
Copy link
Contributor

@rastislavs rastislavs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks!

@YutaroHayakawa
Copy link
Member Author

/test

@YutaroHayakawa
Copy link
Member Author

Conformance K8s Kind / Installation and Conformance Test: #35779
Cilium E2E Upgrade: #35774
Conformance Ginkgo: #35780

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 5, 2024
@julianwiedmann julianwiedmann added the area/bgp Impacts the Border Gateway Protocol feature. label Nov 6, 2024
@julianwiedmann julianwiedmann added this pull request to the merge queue Nov 7, 2024
Merged via the queue into cilium:main with commit 53a83df Nov 7, 2024
64 checks passed
@YutaroHayakawa YutaroHayakawa mentioned this pull request Nov 8, 2024
1 task
@YutaroHayakawa YutaroHayakawa added backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. and removed needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Nov 8, 2024
@github-actions github-actions bot added backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. and removed backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. labels Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/bgp Impacts the Border Gateway Protocol feature. backport/author The backport will be carried out by the author of the PR. backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants