Skip to content

Conversation

joestringer
Copy link
Member

The locking order for metadata -> IPCache should be first grabbing the
metadata lock then the IPCache lock, according to the documentation in
the metadata structure. Correct this lock ordering to conform to the
documented pattern.

This sort of improper lock ordering could theoretically cause a deadlock
if for instance the label injector ran at the same time as this function
is called.

Found by inspection.

Fixes: 40e13ea ("ipcache: Fix race in identity/ipcache release")
Fixes: #20721

The locking order for metadata -> IPCache should be first grabbing the
metadata lock then the IPCache lock, according to the documentation in
the metadata structure. Correct this lock ordering to conform to the
documented pattern.

This sort of improper lock ordering could theoretically cause a deadlock
if for instance the label injector ran at the same time as this function
is called.

Found by inspection.

Fixes: 40e13ea ("ipcache: Fix race in identity/ipcache release")
Signed-off-by: Joe Stringer <joe@cilium.io>
@joestringer joestringer requested a review from a team as a code owner October 3, 2022 19:48
@joestringer joestringer added affects/v1.12 This issue affects v1.12 branch release-note/bug This PR fixes an issue in a previous release of Cilium. labels Oct 3, 2022
@joestringer joestringer requested a review from aditighag October 3, 2022 19:48
@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. and removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Oct 3, 2022
@joestringer
Copy link
Member Author

/test

@christarazi
Copy link
Member

Marking for v1.11 backport so that we can include it in #21564.

@joestringer
Copy link
Member Author

joestringer commented Oct 3, 2022

/test

Job 'Cilium-PR-K8s-1.25-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathConfig Check BPF masquerading with ip-masq-agent DirectRouting

Failure Output

FAIL: Failed to add ip route

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.25-kernel-net-next so I can create one.

Copy link
Member

@aditighag aditighag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the commit description, I think the changes seem straight forward. Any chance a unit test can be added/extended in order to prevent such regressions in future? I'm looking at ipcache/metadata_test.go: TestRemoveLabelsFromIPs.

@joestringer
Copy link
Member Author

joestringer commented Oct 4, 2022

👍 reasonable ask in general, I'll have to think about it a bit more. The trick is that we're looking for a lock race condition, so depending on when the goroutines are scheduled it may not trigger. This sort of test is maybe more appropriate for a stress test that runs a bunch of logic constantly for some period. AFAIK we don't really have test infrastructure for that sort of issue currently.

For what it's worth, I did try a relatively naive test like this, but even with go test -tags=lockdebug ./pkg/ipcache/..., it would not trigger a failure:

diff --git a/pkg/ipcache/metadata_test.go b/pkg/ipcache/metadata_test.go
index 0bcb290560fe..67ae89a392ea 100644
--- a/pkg/ipcache/metadata_test.go
+++ b/pkg/ipcache/metadata_test.go
@@ -58,6 +58,49 @@ func TestInjectLabels(t *testing.T) {
        assert.False(t, IPIdentityCache.ipToIdentityCache["10.0.0.4/32"].ID.HasLocalScope())
 }
 
+func TestInjectLabelsConflictWithCIDR(t *testing.T) {
+       setupTest(t)
+       ctx := context.Background()
+
+       assert.Len(t, IPIdentityCache.metadata.m, 1)
+       remaining, err := IPIdentityCache.InjectLabels(ctx, []netip.Prefix{worldPrefix})
+       assert.Len(t, remaining, 0)
+       assert.NoError(t, err)
+       assert.Len(t, IPIdentityCache.ipToIdentityCache, 1)
+
+       wg := &sync.WaitGroup{}
+
+       wg.Add(1)
+       go func() {
+               // Insert kube-apiserver IP from outside of the cluster. This should create
+               // a CIDR ID for this IP.
+               for i := 0; i < 100; i++ {
+                       IPIdentityCache.metadata.upsertLocked(inClusterPrefix, source.KubeAPIServer, "kube-uid", labels.LabelKubeAPIServer)
+               }
+               wg.Done()
+       }()
+       wg.Add(1)
+       go func() {
+               for i := 0; i < 100; i++ {
+                       cidrs := []netip.Prefix{inClusterPrefix}
+                       if _, err := IPIdentityCache.AllocateCIDRs(cidrs, nil, nil); err != nil {
+                               break
+                       }
+               }
+               assert.NoError(t, err)
+               wg.Done()
+       }()
+
+       wg.Wait()
+       assert.Len(t, IPIdentityCache.metadata.m, 2)
+       remaining, err = IPIdentityCache.InjectLabels(ctx, []netip.Prefix{inClusterPrefix})
+       assert.NoError(t, err)
+       assert.Len(t, remaining, 0)
+       assert.Len(t, IPIdentityCache.ipToIdentityCache, 2)
+       assert.True(t, IPIdentityCache.ipToIdentityCache["10.0.0.4/32"].ID.HasLocalScope())
+
+}
+
 func TestFilterMetadataByLabels(t *testing.T) {
        setupTest(t)
 

@joestringer
Copy link
Member Author

Hit #21120, otherwise CI is passing. Reviews are in, merging.

@joestringer joestringer merged commit 021ea42 into cilium:master Oct 4, 2022
@joestringer joestringer deleted the submit/ipcache-deadlock-fix branch October 4, 2022 22:40
@gandro
Copy link
Member

gandro commented Oct 5, 2022

Is there any scenario where ipc.metadata.get is actually ever safe? It returns a reference type (map), but does not hold into the lock and the returned reference may still be mutated concurrently (e.g. by UpsertLabels).

@joestringer
Copy link
Member Author

@gandro you're right, it's not safe.

@codablock
Copy link
Contributor

@joestringer shouldn't the same fix be applied to releaseCIDRIdentities as well? I came across this fix by analysing #21596 (I encountered the same deadlock). I'm not sure if applying the metadata lock to releaseCIDRIdentities will also fix #21596, but it looks connected. I'll post a more detailed analysis of the stack traces in the issue later.

gandro added a commit to gandro/cilium that referenced this pull request Oct 6, 2022
As [previously discussed][1], the `ipc.metadata.get` method is not safe,
as it returns a reference to a map which could be mutated, without
holding on to the lock protecting that map. This commit therefore
removes that method and ensures that the caller holds on to the lock
while the data is being read. The `ToLabels` method creates a copy of
the data, so it is safe to be read after the metadata lock has been
released.

[1]: cilium#21565 (comment)

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
@joestringer
Copy link
Member Author

joestringer commented Oct 6, 2022

@codablock are you looking for #20721 or something more than that?

The change here relates to the conflict between the ipcache overall lock and the ipcache metadata lock. The ipcache metadata lock is not used during release.

EDIT: Ah I see more detailed analysis in #21596, we can follow up there.

@codablock
Copy link
Contributor

@joestringer What I wrote was just a wild guess before I did the detailed analysis of the stack traces, so you can ignore my comment in this PR.

sayboras pushed a commit that referenced this pull request Oct 7, 2022
As [previously discussed][1], the `ipc.metadata.get` method is not safe,
as it returns a reference to a map which could be mutated, without
holding on to the lock protecting that map. This commit therefore
removes that method and ensures that the caller holds on to the lock
while the data is being read. The `ToLabels` method creates a copy of
the data, so it is safe to be read after the metadata lock has been
released.

[1]: #21565 (comment)

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
@joestringer
Copy link
Member Author

Given this doesn't backport cleanly, I've removed the labels and assigned myself. When the other subsequent fixes are ready I can prepare the backports (or collaborate with Chris on the 1.11 variation)

@sayboras sayboras mentioned this pull request Oct 8, 2022
8 tasks
@sayboras sayboras added backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. and removed backport-pending/1.12 labels Oct 10, 2022
sayboras pushed a commit to sayboras/cilium that referenced this pull request Oct 10, 2022
[ upstream commit 820e348 ]

As [previously discussed][1], the `ipc.metadata.get` method is not safe,
as it returns a reference to a map which could be mutated, without
holding on to the lock protecting that map. This commit therefore
removes that method and ensures that the caller holds on to the lock
while the data is being read. The `ToLabels` method creates a copy of
the data, so it is safe to be read after the metadata lock has been
released.

[1]: cilium#21565 (comment)

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
@sayboras sayboras added backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. and removed backport-pending/1.11 labels Oct 11, 2022
sayboras pushed a commit that referenced this pull request Oct 11, 2022
[ upstream commit 820e348 ]

As [previously discussed][1], the `ipc.metadata.get` method is not safe,
as it returns a reference to a map which could be mutated, without
holding on to the lock protecting that map. This commit therefore
removes that method and ensures that the caller holds on to the lock
while the data is being read. The `ToLabels` method creates a copy of
the data, so it is safe to be read after the metadata lock has been
released.

[1]: #21565 (comment)

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects/v1.12 This issue affects v1.12 branch backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants