endpoint: don't propagate health/ingress endpoints to kvstore #35997
Health and ingress IPs are propagated to the other Cilium agents via CiliumNodes and the equivalent kvstore representation. However, they are additionally upserted into the kvstore as endpoints, which duplicates information both in the kvstore and in the user-space ipcache representation of all remote nodes (as observed via the `cilium ip list` command). Indeed, they get upserted both as single IPs (i.e., without netmask) when observed from the endpoint prefix, and as prefixes (with a /32 mask) when observed from the node prefix. The same does not happen when operating in CRD mode, because the corresponding CEPs do not get created in these cases.

Let's fix this divergence by no longer upserting these entries in the kvstore case either. In an upgrade scenario, the stale health/ingress entries will be automatically deleted when the corresponding lease expires (by default after 15 minutes). This does not create any problems: all other agents observe the deletion event and clean up the duplicate internal entries, but do not propagate the deletion down to the datapath (and the other subsystems), given that another CIDR entry for the same IP is still present [1], hence preserving correctness.
[1]: cilium/pkg/ipcache/ipcache.go, lines 428 to 431 in 40dde8b
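To illustrate why the lease expiry is harmless, here is a minimal toy model of the behavior described above. It is not Cilium's ipcache code: `toyCache`, `normalize`, `upsert`, and `remove` are hypothetical names, and the model only assumes that a bare IP and its /32 prefix refer to the same datapath entry.

```go
package main

import (
	"fmt"
	"net/netip"
)

// toyCache is a hypothetical, heavily simplified stand-in for a user-space
// IP cache. It is NOT Cilium's ipcache implementation.
type toyCache struct {
	entries  map[string]struct{}       // keys as received: "10.0.0.5" or "10.0.0.5/32"
	datapath map[netip.Prefix]struct{} // what would be programmed downstream
}

func newToyCache() *toyCache {
	return &toyCache{
		entries:  map[string]struct{}{},
		datapath: map[netip.Prefix]struct{}{},
	}
}

// normalize maps either a bare IP or a CIDR string to a prefix, so that
// "10.0.0.5" and "10.0.0.5/32" compare equal.
func normalize(key string) (netip.Prefix, error) {
	if p, err := netip.ParsePrefix(key); err == nil {
		return p.Masked(), nil
	}
	addr, err := netip.ParseAddr(key)
	if err != nil {
		return netip.Prefix{}, err
	}
	return netip.PrefixFrom(addr, addr.BitLen()), nil
}

// upsert records the key as received and ensures a datapath entry exists.
func (c *toyCache) upsert(key string) error {
	p, err := normalize(key)
	if err != nil {
		return err
	}
	c.entries[key] = struct{}{}
	c.datapath[p] = struct{}{}
	return nil
}

// remove deletes the key and reports whether the deletion was propagated
// to the datapath. It is not propagated while another entry still covers
// the same IP.
func (c *toyCache) remove(key string) (bool, error) {
	p, err := normalize(key)
	if err != nil {
		return false, err
	}
	delete(c.entries, key)
	for other := range c.entries {
		if q, err := normalize(other); err == nil && q == p {
			return false, nil // a sibling entry still covers this IP
		}
	}
	delete(c.datapath, p)
	return true, nil
}

func main() {
	c := newToyCache()
	_ = c.upsert("10.0.0.5")    // stale entry, as seen from the endpoint prefix
	_ = c.upsert("10.0.0.5/32") // entry as seen from the node prefix

	// The lease expires and the stale endpoint entry is deleted.
	propagated, _ := c.remove("10.0.0.5")
	fmt.Println("propagated:", propagated, "datapath entries:", len(c.datapath))
}
```

In this sketch, removing the bare-IP entry leaves the datapath untouched because the /32 entry still covers the same address, which mirrors the correctness argument in the description.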