-
Notifications
You must be signed in to change notification settings - Fork 3.4k
fqdn: Improve performance by using selectors as labels #32769
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
60abc6e
to
396287b
Compare
/test |
Super clean, very impressive. Nice job! Excited for tests and (fingers crossed) performance results. |
c842f90
to
33330ac
Compare
33330ac
to
7400e34
Compare
/test |
7400e34
to
73c7ca1
Compare
73c7ca1
to
88b8823
Compare
This commit modifies the IPCache restoration to restore all local identity entries, not just CIDR identities. This is required because FQDN labels are derived from ToFQDN selectors, which are only available during endpoint regeneration. To ensure that identities of prefixes in IPCache don't change during initial regeneration, we provide the expected `fqdn` labels before regeneration. The real labels are added during regeneration, therefore the restored ones can be safely removed in `releaseRestoredIdentities`. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
This contians no functional changes and is a drive-by cleanup. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
A subsequent commit will change prefix labels upserted by the name manager to use `fqdn`-labels instead of `cidr`-labels. Because a CIDR identity currently always also have the world label, we want to mirror that logic for identities with an `fqdn` label, as such IPs allowed by a ToFQDN policy remains selectable by a `reserved:world` selector. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
This introduces a wait for the initial IPCache revision after K8s caches have synced. This ensures that all prefix labels are injected and available in the new IPCache before restoration starts. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
This commit adds logic to derive identity labels for `(name, IP)` pairs from selectors. The basic idea is that any ToFQDN selector matching the qname of the DNS lookup is added to a label to each IP returned by that DNS lookup. The functions added here will be used in a subsequent commit. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
This commit implements `CFP-28427: ToFQDN policy performance improvements`. It is highly recommended to consult the CFP, as it contains all the high-level design decisions and mechanism found in this commit. The rest of this commit message therefore only explains the "what" and "where", and not the "why". Before this commit, there was circular interaction between the `SelectorCache` and `NameManager`: `SelectorCache` would tell `NameManager` about new `ToFQDN` selectors, and `NameManager` would in turn inform `SelectorCache` about the IPs selected by that `ToFQDN` selector. This commit simplifies this logic by removing the backlink from the `NameManager` to the `SelectorCache`. IPs are instead now labelled with the selector as an `fqdn` identity label in IPCache, thus not requiring any direct changes to the `SelectorCache` when a new IP is discovered that shares the identity with an old IP. If there is identity allocation needed for an observed IP, the `SelectorCache` is still updated, but only via `IPCache`, and no longer directly from `NameManager`. I recommend first looking at the changes to `SelectorCache` in `pkg/policy`. Note the following changes: 1. The `identityNotifier` interface (implemented by `NameManger`) is simplified: We no longer care about IPs selected by a FQDN selector, and we no longer need to care about potential deadlocks, as there are no calls back from `NameManager` to `SelectorCache` in the invoked functions (the indirect backlink from `NameManager` to `SelectorCache` via `IPCache` happens in `NameManager.UpdateGenerateDNS` - but this function is called by the DNS proxy whenever it observes a new DNS lookup and thus is called without the selector cache lock held. 2. `UpdateFQDNSelector` (previously invoked by `NameManager`) is removed - `SelectorCache` no longer directly needs to know the IPs matched by a selector. 3. The `fqdnSelector` type is simplified: Instead of containing the list of CIDR identities (one for each selected IP) and checking for the CIDR identity in `matches`, we now can simply treat the FQDN selector as a label and thus check if the requested identity has the FQDN selector label. 4. All the unit test logic around managing the selected IPs is removed, as all the responsibility for updating IPs now lies in `NameManager`. For the `NameManager` in `pkg/fqdn`, the changes are as follows: 1. Minor changes to for the query functions in `DNSCache`: Instead of just listing or checking the existence of an IP, we now want to know about `(name, IP)` pairs (needed later for updating `IPCache`). 2. Similarly, where before we only cared about the mapping between an `FQDNSelector` and the selected IPs, we now want to know what `(name, IP)` pairs are matched by a particular selector. Thus `mapSelectorsToIPsLocked` is replaced with `mapSelectorsToNamesLocked` and the unit tests are updated as well. 3. `RegisterFQDNSelector` now checks if the new selector needs to be added to any known `(name, IP)` pairs as an `fqdn` label, and `UnregisterFQDNSelector` potentially removes `fqdn` labels from IPs. 4. `UpdateGenerateDNS` (invoked for DNS lookups) determines the labels of any newly discovered IP and now directly spawns the go routine to wait for the new `(IP, identity)` pair to be injected into `IPCache`. Previously, this waiting was done as part of the call to `UpdateSelectors`, previously implemented in `daemon/cmd/fqdn.go` (and now removed). 5. `ForceGenerateDNS` is removed. It was previously called by the `NameManager` GC to remove IPs from the `SelectorCache`, but since the `SelectorCache` no longer knows about IPs, the function is obsolete (note that `IPCache` removals are still performed upon GC) 6. Changes in `CompleteBootstrap` to deal with the upgrade logic when upgrading from Cilium v1.15. See bullet point 9 below for details. 7. `updateDNSIPs` (called from `UpdateGenerateDNS`, i.e. upon new DNS lookups) now determines the labels for every newly observed IP based on the available FQDN selectors, and no longer upserts CIDR identites. Note that we only update the labels matching the looked up `dnsName`. If an IP happens to also map to a different domain name and uses a different set of selectors for the alternative name, those labels in IPCache are unaffected by the call to `updateMetadata`, as every call to IPCache uses the DNS name as the resource owner. 8. The `ipcacheResource`, `updateMetadata`, and `maybeRemoveMetadata` contain the calls to `IPCache` to update labels for a given `(name, IP)` pair. There are two main differences to before: Instead of upserting or removing CIDR prefixes, we now add labels. And instead of having one update per prefix, we now have one update per `(name, IP)` pair, meaning a single prefix (aka "IP") might have multiple IPCache resource owners in the `NameManager` (i.e. one for each `name` mapping to that IP). 9. `RestoreCache` and `CompleteBootstrap` contain the logic to initialize `IPCache` when upgrading from Cilium v1.15. This requires the previous Cilium instance to have checkpointed the known `ToFQDN` selectors, which are read in during upgrade and used to derive and inject the `IPCache` labels we expect to have once endpoint regeneration has finished. After endpoint regeneration, those restored labels are then removed, leaving the real labels in place. In contrast to all other `IPCache` updates (where each update to an IP is "owned" by the DNS name mapping to that IP, and we rely on `IPCache` to merge those labels), the resource owner here is static. This is, because they are all added at once (in `RestoreCache`) and removed at once (in `CompleteBootstrap`), and no per-name tracking is required. 10. Various changes to unit tests. The old unit tests tested the interaction between `NameManager` and `SelectorCache`, where as the new unit tests now test the interaction between `NameManager` and `IPCache`. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
38fd5e4
to
6f892a7
Compare
/test Edit: CI is pretty unstable due to quay throwing |
ci-ipsec-upgrade fails with |
/ci-ipsec-upgrade |
FQDN max ips per host is a circuit-breaker which, in the past, prevented cilium's policy engine from becoming overwhelmed by the number of identities allocated due to IPs learned via toFQDN policies. Thanks to recent performance optimization work[1], the default value for this doesn't represent a sensible value anymore, since it is no longer the case that each IP learned via DNS lookup becomes its own identity. [1] cilium#32769 Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
FQDN max ips per host is a circuit-breaker which, in the past, prevented cilium's policy engine from becoming overwhelmed by the number of identities allocated due to IPs learned via toFQDN policies. Thanks to recent performance optimization work[1], the default value for this doesn't represent a sensible value anymore, since it is no longer the case that each IP learned via DNS lookup becomes its own identity. [1] cilium#32769 Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
FQDN max ips per host is a circuit-breaker which, in the past, prevented cilium's policy engine from becoming overwhelmed by the number of identities allocated due to IPs learned via toFQDN policies. Thanks to recent performance optimization work[1], the default value for this doesn't represent a sensible value anymore, since it is no longer the case that each IP learned via DNS lookup becomes its own identity. [1] cilium#32769 Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
FQDN max ips per host is a circuit-breaker which, in the past, prevented cilium's policy engine from becoming overwhelmed by the number of identities allocated due to IPs learned via toFQDN policies. Thanks to recent performance optimization work[1], the default value for this doesn't represent a sensible value anymore, since it is no longer the case that each IP learned via DNS lookup becomes its own identity. [1] cilium#32769 Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
FQDN max ips per host is a circuit-breaker which, in the past, prevented cilium's policy engine from becoming overwhelmed by the number of identities allocated due to IPs learned via toFQDN policies. Thanks to recent performance optimization work[1], the default value for this doesn't represent a sensible value anymore, since it is no longer the case that each IP learned via DNS lookup becomes its own identity. [1] cilium#32769 Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
FQDN max ips per host is a circuit-breaker which, in the past, prevented cilium's policy engine from becoming overwhelmed by the number of identities allocated due to IPs learned via toFQDN policies. Thanks to recent performance optimization work[1], the default value for this doesn't represent a sensible value anymore, since it is no longer the case that each IP learned via DNS lookup becomes its own identity. [1] #32769 Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
This is an implementation of the following CFP. Please refer to the CFP for details on the approach.
It is highly recommended to consult the CFP, as it contains all the high-level design decisions and mechanism found in these commits. The commit messages themselves only consider low-level details. The main commit is rather large, as I was unable to split it into smaller chunks that would the test suite by themselves. However, it contains a commit message explaining each code chunk in the diff.
Follow-up work not intended to be part of this PR:
Instead of collecting all matching selectors and every one as a label, extract the least generic available selector (e.g.matchName: cilium.io
instead ofmatchPattern: *
) for a given domain and use that one, as it should reduce identity churn. This requires us to teach the selector cache about selector hierarchy.Pre-allocating an identity for a registered selector, rather than waiting for the first IP to be observedfqdn: Pre-allocate ToFQDN selector identities #33068 (comment)