Skip to content

daemon: Simplify cilium_host IP restoration #28781

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

gandro
Copy link
Member

@gandro gandro commented Oct 25, 2023

Before this PR, the cilium_host (aka router) IP restoration was done in two steps:

The first step collected previous IPs from the filesystem and Kubernetes and attempted to guess which one might still be valid in the new IPAM configuration. This validation was done using either the Pod CIDR, the VPC CIDR, or the native routing CIDR.

In a second step, the IP was then actually re-allocated via the IPAM subsystem. That second step could however still fail - just because an IP is part of the VPC CIDR or native routing CIDR does not mean it actually still assigned to the node.

This PR attempts to simplify the logic: Instead of trying to guess if the restore IP(s) could be allocated using IPAM, we just try to allocate them. If the allocation fails, we still have a fall back logic, and the old IP address(es) are still removed from the cilium_host interface. This way, the old CIDR check becomes obsolete, as we now use the IPAM subsystem as the source of truth if an IP can be restored or not.

This also fixes a bug where Multi-Pool IPAM always reallocated the router IP, because it did not implement the CIDR check in the first step.

Review per commit.

Fixes: #28743

@gandro gandro added area/daemon Impacts operation of the Cilium daemon. release-note/misc This PR makes changes that have no direct user impact. sig/ipam area/ipam IP address management, including cloud IPAM labels Oct 25, 2023
@gandro gandro force-pushed the pr/gandro/simplify-cilium-host-ip-restoration branch from 143491c to 6ffb771 Compare October 25, 2023 12:09
@gandro
Copy link
Member Author

gandro commented Oct 25, 2023

I've tested this code locally with multi-pool and cluster-pool IPAM modes.

Two open questions:

  • This fixes a bug in multi-pool IPAM. Should we backport this to v1.14, or is it too risky? I can also workaround the bug in v1.14 by using a fake CIDR for multi-pool IPAM
  • Does restoration from K8s even work? We fetch the k8s Node object (not CiliumNode) and it no longer seems to contain an annotation with the router IP? And even with clusterpool, we call UpdateCiliumNodeResource before WaitForNodeInformation, which simply clears out the old router IP (and ignores the router IP from the Node resource if it was present, but it anyway doesn't seem to be). At least with Kind, I have not been able to get it to work both before and after this PR

@gandro gandro requested a review from christarazi October 25, 2023 12:21
@gandro gandro marked this pull request as ready for review October 25, 2023 12:22
@gandro gandro requested review from a team as code owners October 25, 2023 12:22
@gandro
Copy link
Member Author

gandro commented Oct 25, 2023

/test

@gandro gandro force-pushed the pr/gandro/simplify-cilium-host-ip-restoration branch from 6ffb771 to e0aac0b Compare October 25, 2023 12:52
@gandro
Copy link
Member Author

gandro commented Oct 25, 2023

/test

@gandro
Copy link
Member Author

gandro commented Oct 25, 2023

Required CI passed, but there is a typo in one of the commit messages. Repushing.

@gandro gandro force-pushed the pr/gandro/simplify-cilium-host-ip-restoration branch from e0aac0b to 932a2fe Compare October 25, 2023 14:38
@gandro
Copy link
Member Author

gandro commented Oct 25, 2023

/test

@gandro
Copy link
Member Author

gandro commented Oct 25, 2023

GKE and external workload failures look legitimate. Will look into it.

@gandro
Copy link
Member Author

gandro commented Oct 26, 2023

GKE and external workload failures look legitimate. Will look into it.

On closer look, the failures actually seem unrelated, and the more likely explanation is a minor GKE outage yesterday:

  • conformance-gke 3 of the 4 jobs timed out. This is similar to a run on main from around the same timeframe. In particular, they all timed out during the health test due to it being unable to exec into the cilium pods. The only job which didn't time out hit similar issues communicating with the kubeapiserver, i.e. we see multiple occurrences of communication failures between the CI test driver and the GKE apiserver.
  • ci-external-workloads also seemed to have issues of the form error getting cilium-agent logs for kube-system/cilium-6f7qt: pods "cilium-6f7qt" not found. Notably, that workflow also runs on GKE, so it will suffer from GKE infrastructure outages too.

I've checked the sysdump of the failed jobs and did not see anything related to this PR. In fact, none of these GKE-based workflows did restart the Cilium pods before they failed, so no router restoration was performed. It should also be noted that the same pipelines were previously green on a very similar version of the branch.


  • ci-upgrade-ipsec is a non-required workflow that is currently known to be flakey, but I still looked into it since it is one of the few workflows where we actually restart the Cilium pods. However, both of the failed jobs (1/3 was green) failed during downgrade, i.e. the failed pods were running Cilium 1.14.3 which does not have these changes. Since the code here does not change how we store the router IP on disk, it should not affect downgrade either. Indeed, we can see that both downgraded cilium instances were able to restore the router IP without issue.

Considering the above, I'm fairy confident the failures are unrelated. I'll restart CI.

@gandro
Copy link
Member Author

gandro commented Oct 26, 2023

/ci-gke

@gandro
Copy link
Member Author

gandro commented Oct 26, 2023

/ci-external-workloads

@gandro
Copy link
Member Author

gandro commented Oct 26, 2023

/ci-ipsec-upgrade

@gandro
Copy link
Member Author

gandro commented Oct 30, 2023

/ci-ipsec-upgrade

The last IPsec failiure looks a bit suspicious, as it is running the code from main. In particular that it restored the old router IP, but then still finds old state in cilium_host to cleanup. I might have to a few more debug log statements to get to the cause of this.

@gandro gandro force-pushed the pr/gandro/simplify-cilium-host-ip-restoration branch from 932a2fe to 26308e2 Compare October 30, 2023 14:48
@gandro
Copy link
Member Author

gandro commented Oct 30, 2023

/ci-ipsec-upgrade

@gandro gandro force-pushed the pr/gandro/simplify-cilium-host-ip-restoration branch from 26308e2 to 6a046cc Compare October 31, 2023 12:22
@gandro
Copy link
Member Author

gandro commented Oct 31, 2023

/ci-ipsec-upgrade

@gandro gandro force-pushed the pr/gandro/simplify-cilium-host-ip-restoration branch from 6a046cc to ce889cf Compare October 31, 2023 12:55
@gandro
Copy link
Member Author

gandro commented Oct 31, 2023

/ci-ipsec-upgrade

@gandro
Copy link
Member Author

gandro commented Nov 2, 2023

Rebased due to conflicts again. Waiting on reviews.

ci-ipsec-upgrade failures seem to be unrelated, it also passed recently. I've investigated multiple sysdumps and did not see anything that points towards router IP restoration not working. The workflow is rather unstable on main too.

@gandro gandro force-pushed the pr/gandro/simplify-cilium-host-ip-restoration branch from ce889cf to 8842226 Compare November 2, 2023 13:14
@gandro
Copy link
Member Author

gandro commented Nov 2, 2023

/test

@tommyp1ckles
Copy link
Contributor

Does restoration from K8s even work?

Probably something worth bringing up at the next community meeting 🤔

@tommyp1ckles
Copy link
Contributor

Changes look good on my end, probably worth discussing the k8s restore question (maybe we can just officially get rid of this?).

@gandro
Copy link
Member Author

gandro commented Nov 6, 2023

Many CI failures due to cloudflare outage. Rebasing and restarting.

@gandro gandro force-pushed the pr/gandro/simplify-cilium-host-ip-restoration branch from 8842226 to d441afa Compare November 6, 2023 12:37
@gandro
Copy link
Member Author

gandro commented Nov 6, 2023

/test

Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for this, much better looking code now!

A couple of nits below.

Does restoration from K8s even work? We fetch the k8s Node object (not CiliumNode) and it no longer seems to contain an annotation with the router IP? And even with clusterpool, we call UpdateCiliumNodeResource before WaitForNodeInformation, which simply clears out the old router IP (and ignores the router IP from the Node resource if it was present, but it anyway doesn't seem to be). At least with Kind, I have not been able to get it to work both before and after this PR

I'm fairly certain that this was not the case when the K8s restoration code was added. This restoration logic relies on the annotation to be set. So if somewhere else in Cilium stopped setting the annotation, then that is what needs to be investigated. If we no longer set the annotation at all, then we should probably remove the K8s restoration code because we then have no mechanism to restore state from K8s.

@gandro gandro force-pushed the pr/gandro/simplify-cilium-host-ip-restoration branch from d441afa to cecc24d Compare November 7, 2023 09:12
@gandro
Copy link
Member Author

gandro commented Nov 7, 2023

Thanks for the reviews! I believe I've address the feedback!

I'm fairly certain that this was not the case when the K8s restoration code was added. This restoration logic relies on the annotation to be set. So if somewhere else in Cilium stopped setting the annotation, then that is what needs to be investigated. If we no longer set the annotation at all, then we should probably remove the K8s restoration code because we then have no mechanism to restore state from K8s.

I'll create an issue (edit: #29019). I think the annotation is indeed no longer set for at least one release, and this has been reported before, though in that case for the pod CIDR: #24420

IMO, we should attempt to restore the router IP from the local CiliumNode object, rather than adding annotations to the K8s Node object. But that's a separate discussion.

The `restoreCiliumHostIPs` function does two things: Determine which IP
to restore, and then remove any IPs from the cilium_host device which
are not the restored one.

This commit splits out the second part, so it can be retried
independently. There is no reason to retry the first part, as it is
deterministic.

This commit prepares the code for further changes following in
subsequent commits.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
This commit contains no functional changes and prepares the code for
changes coming up in subsequent commits.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Before this commit, we were setting the restored IP during
`node.AutoComplete`, getting it `Daemon.restoreCiliumHostIPs`, only to
set it again in `node.RestoreHostIPs` if it passed validation.

This commit simplifies the logic to extract the IPs from the file
system, and only setting them after validation.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Before this commit, the `cilium_host` (aka router) IP restoration was
done in two steps:

The first step collected previous IPs from the filesystem and Kubernetes
and attempted to guess which one might still be valid in the new IPAM
configuration. This validation was done using either the Pod CIDR, the
VPC CIDR, or the native routing CIDR.

In a second step, the IP was then actually re-allocated via the IPAM
subsystem. That second step could however still fail - just because an
IP is part of the VPC CIDR or native routing CIDR does not mean it
actually still assigned to the node.

This commit attempts to simplify the logic: Instead of trying to guess
if the restore IP(s) could be allocated using IPAM, we just try to
allocate them. If the allocation fails, we still have a fall back logic,
and the old IP address(es) are still removed from the `cilium_host`
interface. This way, the old CIDR check becomes obsolete, as we now use
the IPAM subsystem as the source of truth if an IP can be restored or
not.

This also fixes a bug where Multi-Pool IPAM always reallocated the
router IP, because it did not implement the CIDR check in the first
step.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Before the previous commit, `removeOldRouterState` would only be called
with a restored IP address, which could be nil. Now however, it is
always called with the new router IP, which could have been restored, or
could have been a new allocation. In either case, we want to remove any
non-matching IPs from `cilium_host`.

However, the case with new allocations now also happens the first time
Cilium is started, in which case there is no `cilium_host` device yet.
This commit therefore does not treat a missing `cilium_host` as an
error. This fixes the following warning which was introduced by the
previous commit:

```
level=warning msg="Failed to remove old router IPs from cilium_host." attempt=1 error="Link not found\nLink not found" subsys=daemon
```

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
The previous commit made these functions obsolete.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
This slightly modifies the `reallocateDatapathIPs` function to take a
mock allocator as its first argument, and adds a basic unit test which
tests for the precedence logic we want.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
This helps troubleshooting potential bugs in CI.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
@gandro gandro force-pushed the pr/gandro/simplify-cilium-host-ip-restoration branch from cecc24d to d9f60df Compare November 7, 2023 09:33
@gandro
Copy link
Member Author

gandro commented Nov 7, 2023

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 7, 2023
@gandro gandro merged commit 518d7cc into cilium:main Nov 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/daemon Impacts operation of the Cilium daemon. area/ipam IP address management, including cloud IPAM ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multi-pool reallocs router IP on cilium agent restart
3 participants