Skip to content

Conversation

joamaki
Copy link
Contributor

@joamaki joamaki commented Jan 23, 2024

The runtime device detection feature requires refreshing the NodePort services if the frontend addresses change (e.g. the IP addresses assigned to the node). This was implemented under daemon/cmd by watching device changes (DeviceManager.Listen), but now the node addresses are available as Table[NodeAddress] and can be directly watched from pkg/service.

This PR implements a reconciliation loop in pkg/service that watches the NodeAddress table and refreshes the services when addresses change. It also mitigates the race condition where ParseService might use stale addresses by periodically checking all services for missing NodePort frontends.

The changes this PR makes are also tested by the ginko test Checks device reconfiguration in test/k8s/services.go which creates a new vxlan device connected to a non-cilium node and tests that a NodePort service can be accessed via the new IP assigned to it.

Side note: the race condition with ParseService should really be addressed by moving the handling of the actual NodePort frontend IPs way down the stack, either to the datapath layer ("lbmap") or potentially even to BPF code. That's a larger refactoring though that also has implications to the /service REST API, so not feasible to tackle it currently.

Side note 2: this new code does not check for --runtime-device-detection. We're going to make that a no-op and eventually remove the flag.

This PR is also part of the general effort to get rid of daemon/cmd/device-reloader.go and instead have each component deal with reconciling it's state when devices or addresses change.

NodePort service frontends are now automatically updated when node's IP addresses change. This may have an impact to NodePort services manually added via the cilium-dbg tool if the used frontend IP is not assigned on the node.

Useful in combination with Collect/CollectSet/Map:

    var iter statedb.Iterator[MyObject]
    iter = statedb.Filter(iter, func(o MyObject) bool { return o.interesting })
    iter = statedb.Map(iter, func(o MyObject) Attr { return o.Attr })
    objs := statedb.Collect(iter)

Signed-off-by: Jussi Maki <jussi@isovalent.com>
When node addresses change we need to synchronize the NodePort frontends.
This commit removes the earlier version in device-reloader.go and adds
a reconciler to the service package. It also addresses the race with
ParseService() by periodically checking that there are no unexpected service
frontends.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
As NodePort services by definition are serviced only from the node's IP addresses
we cannot use them in the nat46x64 test with an IP address not assigned to the node,
but must instead use LoadBalancer service.

This fixes the test failure that was caused by the newly added reconciliation of 
the NodePort frontends.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 23, 2024
@joamaki joamaki added the release-note/misc This PR makes changes that have no direct user impact. label Jan 23, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 23, 2024
@joamaki joamaki added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label Jan 23, 2024
@joamaki joamaki requested a review from borkmann January 23, 2024 10:42
@joamaki
Copy link
Contributor Author

joamaki commented Jan 23, 2024

There's two big changes this introduces that we should review here:

  1. The NodePort frontend IPs are now automatically updated to reflect the new node IPs assigned for NodePort (by default the "primary" IPv4 & IPv6 of each externally facing device, but this can be controlled with --nodeport-addresses).
  2. NodePort services added via cilium-dbg may have their frontend IPs removed if those are not locally assigned or are not picked for NodePort use (may impact some existing users).

The automatic reaction to IP changes is something that we're trying to push for v1.16, essentially making --enable-runtime-device-detection a no-op. This does create some potential regression risk, but is solving very very long-standing technical debt. We could potentially still have --enable-runtime-device-detection gate-keeping this behavior, but it does make things very hacky, so I would rather try and going for the right solution for v1.16.

@joamaki joamaki marked this pull request as ready for review January 23, 2024 10:46
@joamaki joamaki requested review from a team as code owners January 23, 2024 10:46
@christarazi christarazi added area/daemon Impacts operation of the Cilium daemon. upgrade-impact This PR has potential upgrade or downgrade impact. area/loadbalancing Impacts load-balancing and Kubernetes service implementations labels Jan 23, 2024
Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, very clean implementation. Happy to see health check reporting being used more as well.

I've added the upgrade-impact label to signal if we need to take any action aside from warning users in the upgrade guide. It's not clear to me if there's a need for that.

@joamaki joamaki added this pull request to the merge queue Jan 24, 2024
Merged via the queue into main with commit 355f2ff Jan 24, 2024
@joamaki joamaki deleted the pr/joamaki/nodeport-frontend-reconciler branch January 24, 2024 14:38
oblazek added a commit to oblazek/cilium that referenced this pull request Feb 23, 2024
With the recent changes in NodePort reconciliation
(see cilium#30374) it is
needed to switch service type from --k8s-node-port to
--k8s-load-balancer as the VIP is not assigned to the node.

Signed-off-by: Ondrej Blazek <ondrej.blazek@firma.seznam.cz>
github-merge-queue bot pushed a commit that referenced this pull request Feb 23, 2024
With the recent changes in NodePort reconciliation
(see #30374) it is
needed to switch service type from --k8s-node-port to
--k8s-load-balancer as the VIP is not assigned to the node.

Signed-off-by: Ondrej Blazek <ondrej.blazek@firma.seznam.cz>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/daemon Impacts operation of the Cilium daemon. area/loadbalancing Impacts load-balancing and Kubernetes service implementations release-note/minor This PR changes functionality that users may find relevant to operating Cilium. release-note/misc This PR makes changes that have no direct user impact. upgrade-impact This PR has potential upgrade or downgrade impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants