Regression: initialization of a single-stack IPv6 cluster with --cloud-provider=external #34861

@hexchen

Description

Is there an existing issue for this?

  • I have searched the existing issues

Version

equal or higher than v1.16.0 and lower than v1.17.0

What happened?

When a node first joins the cluster with --cloud-provider=external set on the kubelet, the node object has no address set. Normally, the relevant CCM/CPI would then set an IP address on the node, after which the cilium-agent would start on the node and the cluster would bootstrap as normal.

In our environment, this creates a chicken-and-egg problem. vsphere-cpi depends on cilium to work, since the controlplane runs outside of the cluster and vsphere-cpi relies on the kubernetes service in the default namespace. cilium, in turn, will not start until the node has the address that vsphere-cpi is supposed to set.

Prior to 2fc52eb (introduced in #28953), this was fine, since cilium did not require a node to have an IPv6 address set in order to start. With that commit, a missing IPv6 address became a hard error.

I personally do not fully understand the motivation for making this a hard error, nor why IPv6 is being special-cased (there is no equivalent check requiring an IPv4 address to be present when IPv4 is enabled). The change seems intended to make behaviour more consistent, but it caused a regression in our environment.

How can we reproduce the issue?

There are no easy reproduction steps, but we have identified the root cause and described it in the main issue description above.

A bit of background on our environment: we are running a Kubernetes cluster on vSphere, created with cluster-api, using Kamaji to provide a vanilla controlplane.

Cilium Version

v1.16.1

We have reverted the configuration in our cluster after finding the cause.

Kernel Version

Linux [snip] 6.6.39 #1-NixOS SMP PREEMPT_DYNAMIC Thu Jul 11 10:49:22 UTC 2024 x86_64 GNU/Linux

Kubernetes Version

Server Version: v1.30.2

Regression

Regression introduced in 2fc52eb

Sysdump

No response

Relevant log output

No response

Anything else?

No response

Cilium Users Document

  • Are you a user of Cilium? Please add yourself to the Users doc

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Assignees

No one assigned

Labels

  • area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
  • feature/ipv6: Relates to IPv6 protocol support
  • feature/ipv6-only: Relates to single-stack IPv6 support.
  • kind/bug: This is a bug in the Cilium logic.
  • kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
  • kind/regression: This functionality worked fine before, but was broken in a newer release of Cilium.
  • stale: The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale.
