-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Improve dnsproxy ipcache handling, metrics #20721
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve dnsproxy ipcache handling, metrics #20721
Conversation
9a750d3
to
40b55df
Compare
/test |
40b55df
to
b92ce4a
Compare
/test Job 'Cilium-PR-K8s-1.23-kernel-4.19' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment Job 'Cilium-PR-K8s-1.16-kernel-4.9' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me, thanks for the helpful commit msgs!
CI Triage:
|
/mlh new-flake Cilium-PR-K8s-1.23-kernel-4.19 👍 created #20723 |
/mlh new-flake Cilium-PR-K8s-1.16-kernel-4.9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small comment as per below :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just some nitpicking on metric labels, otherwise lgtm!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Observed an asymmetry of the new critical sections, but it seems inconsequential.
@@ -72,6 +73,7 @@ func (ipc *IPCache) AllocateCIDRs( | |||
newlyAllocatedIdentities[prefixStr] = id | |||
} | |||
} | |||
ipc.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be much easier to reason about if the forthcoming insertion to the ipcache was also covered by the lock, so that these new critical sections would be more symmetric. However, it appears that it would be safe if the releaseCIDRIdentities()
critical section executes right after this lock is released, as in this case the identity would not be "released" and hence not removed from the ipcache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, I meant to comment specifically about this. In the if upsert {...}
case this seems reasonable. The problem is, in the target situation I'm trying to fix here the ToFQDNs logic passes upsert=false
here and then does the UpsertGeneratedIdentities()
much later. I don't think it's viable to hold the lock for that entire period, we'd need to do some significant restructuring to really achieve this.
.. it appears that it would be safe if the releaseCIDRIdentities() critical section executes right after this lock is released, as in this case the identity would not be "released" and hence not removed from the ipcache.
That was pretty much my thinking. We'd either get:
- A: Lock
- A: ipc.IdentityAllocator.Release()
- A: ipc.deleteLocked()
- A: Unlock
- B: Lock
- B: ipc.allocate()
- B: Unlock
- B: Lock, Upsert, Unlock
or some variation of the below, potentially with the ipc.Upsert()
from B also pushed earlier before A:
- B: Lock
- B: ipc.allocate()
- B: Unlock
- A: Lock
- A: ipc.IdentityAllocator.Release()
- A: (no ipc.deleteLocked() since the identity has a ref from B)
- A: Unlock
- B: Lock, Upsert, Unlock
For now since I couldn't cover both cases, I just didn't bother to cover the UpsertGeneratedIdentities()
itself with the lock here. I could cover that function with the lock too, but it might just encourage assumptions around the locking consistency on this side which don't hold because the upsert is conditionally done later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^^ Maybe longer term the real solution is to have the dnsproxy subsystem go to the identity ipcache subsystem and state "IP X needs an FQDN identity", then the ipcache can call out to the FQDN policy pieces, ensure that is all populated, then populate the ipcache itself. Then NotifyOnDNSMsg()
won't need to worry about this detail, it would just need to have a waitgroup or callback to ensure the ipcache is plumbed before releasing the DNS responses. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ That sounds like the ideal design we want to strive for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the above explanation also to the commit message.
b92ce4a
to
cebe666
Compare
A user has reported an issue where it appears that FQDN garbage collection
races with DNS proxy handling for IPs (typically with S3), where occasionally
the traffic towards a destination allowed by
ToFQDNs
policy is not correctlyallowing the traffic. In this scenario, the ipcache does not contain the target
IP for the connection, however the IP has a corresponding identity, and that
identity is plumbed into the policy packages and even into the BPF policymaps.
The first commit in this PR is intended to close this race condition to resolve
the underlying cause. The second commit improves the resiliency of the code by
ensuring that any time that a DNS request triggers policy updates, we attempt
to push corresponding identities into the IPcache again in case they were
temporarily removed by other code in the mean time. Finally, add some metrics
to help to debug issues in this area of the codebase.
For more details, see the individual commits.