Skip to content

Conversation

giorio94
Copy link
Member

A crash of the agent can occur when processing a newly imported CiliumNetworkPolicy, if the idCache map is accessed without a lock during a concurrent write. This fix adds the acquisition of a read lock to guard that operation.

No backport is required, since it affects only master.

Fixes: #24021

Fix concurrent map read/write race condition after importing CNP

@giorio94 giorio94 added kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. labels Feb 27, 2023
@giorio94 giorio94 requested a review from a team as a code owner February 27, 2023 08:09
@giorio94 giorio94 requested a review from nebril February 27, 2023 08:09
@giorio94
Copy link
Member Author

Closing and reopening to retrigger the tests, since they failed due to an error while building the images, caused by DockerHub connectivity issues:

ERROR: failed to solve: docker.io/library/alpine:3.17.2@sha256:69665d02cb32192e52e07644d76bc6f25abeb5410edc1c7a81a10ba3f0efb90a: failed to do request: Head "https://registry-1.docker.io/v2/library/alpine/manifests/sha256:69665d02cb32192e52e07644d76bc6f25abeb5410edc1c7a81a10ba3f0efb90a": net/http: TLS handshake timeout

@giorio94 giorio94 closed this Feb 27, 2023
@giorio94 giorio94 reopened this Feb 27, 2023
@giorio94 giorio94 marked this pull request as draft February 27, 2023 10:14
A crash of the agent can occur when processing a newly imported
CiliumNetworkPolicy, if the idCache map is accessed without a lock
during a concurrent write. This fix adds the acquisition of a read
lock to guard that operation.

Fixes: c9f0def ("policy: Fix Deny Precedence Bug")
Fixes: cilium#24021

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
@giorio94 giorio94 force-pushed the mio/get-labels-lock branch from 250850f to df15365 Compare March 9, 2023 09:21
Comment on lines -201 to -202
p.selectorPolicy.SelectorCache.mutex.Lock()
defer p.selectorPolicy.SelectorCache.mutex.Unlock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this can be done. It will introduce race conditions with concurrent identity updates as the removed comments states. ConsumeMapChanges doesn't have an explicit dependency on the SelectorCache, but it's implicit. I can understand why someone might think that these statements can be removed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my doubt too. It wasn't clear to me whether it was still necessary after the rework, since there seemed to be no explicitly dependency. Thanks for the clarification

@giorio94
Copy link
Member Author

Superseded by #24322

@giorio94 giorio94 closed this Mar 13, 2023
@giorio94 giorio94 deleted the mio/get-labels-lock branch March 13, 2023 07:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

concurrent map read and map write panic after importing CiliumNetworkPolicy
2 participants