@justinsb (Member) commented Nov 11, 2024

IPv6 brings some new complexities, particularly around IPAM.

We create a test and then fix a few things:

  • We need to assign the podCIDR for IPv6, so we add support for this to kops-controller. The source of this information is the Host CRD.
  • Because we are assigning the podCIDR from the Host CRD, we need Host records for the control-plane nodes. However, there are bootstrapping problems around creating a Host object during enrollment of the control-plane nodes. So instead, we can now generate a Host object as YAML and apply it separately. A high-security workflow would probably create the Host records separately anyway, because they are how we validate nodes.
  • Previously we always set the kubelet --cloud-provider=external flag, but this assumes we are running a CCM. If we are not running a CCM (as on metal), we should not set the flag: when it is set, kubelet adds the node.cloudprovider.kubernetes.io/uninitialized taint for the CCM to clear, and without a CCM nobody clears it.
  • We need to make sure there is an IPv6 default route so that kubelet can discover its node IP correctly. We could put the node IP into the Host CRD, but it does seem like most nodes will have a default route.
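
The cloud-provider bullet above can be sketched roughly as follows. This is only an illustration of the decision, not the code in this PR: `kubeletCloudProviderFlags` and its `hasCCM` parameter are hypothetical names introduced here.

```go
package main

import "fmt"

// kubeletCloudProviderFlags returns the cloud-provider flag to pass to kubelet.
// hasCCM is a hypothetical input: whether a cloud-controller-manager will run
// for this cluster (for example, false on bare metal without a CCM).
func kubeletCloudProviderFlags(hasCCM bool) []string {
	if !hasCCM {
		// No CCM: leave the flag unset so kubelet does not add an
		// "uninitialized" taint that nothing would ever remove.
		return nil
	}
	return []string{"--cloud-provider=external"}
}

func main() {
	fmt.Println(kubeletCloudProviderFlags(true))  // [--cloud-provider=external]
	fmt.Println(kubeletCloudProviderFlags(false)) // []
}
```

The point of keeping this as a single decision is that the flag and the CCM deployment stay in sync: either both are present or neither is.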

@k8s-ci-robot added the do-not-merge/work-in-progress, cncf-cla: yes, and size/L labels Nov 11, 2024
@justinsb
Member Author

I am uploading this now so that I can rebase as I/we fix each problem.

Current problem is from nodeup:

vm0 nodeup[703]: W1111 17:07:00.322041     703 main.go:133] got error running nodeup (will retry in 30s): error building loader: building *model.PrefixBuilder: kOps IPAM controller not supported on cloud "metal"

So we need to decide how the podCIDR is assigned!

@k8s-ci-robot added the area/api, area/kops-controller, area/nodeup, and size/XL labels and removed the size/L label Nov 12, 2024
@justinsb force-pushed the bare-metal-ipv6 branch 3 times, most recently from 1204c0d to 23634fb November 12, 2024 18:35
@justinsb force-pushed the bare-metal-ipv6 branch 2 times, most recently from 2e6784b to 4ff49a7 November 17, 2024 12:20
@justinsb
Member Author

/retest

@k8s-ci-robot added the needs-rebase label Jan 20, 2025
@k8s-ci-robot removed the needs-rebase label Feb 9, 2025
@justinsb force-pushed the bare-metal-ipv6 branch 4 times, most recently from 52749c9 to 39d7698 February 10, 2025 23:05
@justinsb force-pushed the bare-metal-ipv6 branch 2 times, most recently from 5759774 to 29997cc February 19, 2025 12:02
@justinsb changed the title from "WIP: tests: add test for bare-metal with ipv6" to "(Experimental) bare-metal with IPv6" Feb 19, 2025
@k8s-ci-robot removed the do-not-merge/work-in-progress label Feb 19, 2025
@hakman self-requested a review July 16, 2025 13:21
@justinsb force-pushed the bare-metal-ipv6 branch 4 times, most recently from 2792a42 to a84f931 July 26, 2025 16:58
@justinsb
Member Author

cc @hakman

I think this is now uncontroversial (I hope). We assign podCIDRs to nodes if they are configured on the Host object. If users don't want to do that, they just don't set podCIDRs on the Host object.
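
A minimal sketch of that rule, assuming a simplified stand-in for the Host spec: the `HostSpec` type below carries only the `PodCIDRs` field, and `podCIDRsForNode` is a hypothetical helper, not the kops-controller implementation. It returns nothing when the Host sets no podCIDRs, and otherwise validates each entry before it would be assigned to the node.

```go
package main

import (
	"fmt"
	"net/netip"
)

// HostSpec is a simplified stand-in that mirrors only the PodCIDRs field
// discussed in this PR.
type HostSpec struct {
	PodCIDRs []string
}

// podCIDRsForNode returns the pod CIDRs to assign to a node, or nil when the
// Host does not configure any (in which case we assign nothing).
func podCIDRsForNode(host HostSpec) ([]string, error) {
	if len(host.PodCIDRs) == 0 {
		return nil, nil
	}
	out := make([]string, 0, len(host.PodCIDRs))
	for _, s := range host.PodCIDRs {
		prefix, err := netip.ParsePrefix(s)
		if err != nil {
			return nil, fmt.Errorf("invalid podCIDR %q: %w", s, err)
		}
		// Masked() zeroes any host bits so the value is a canonical network prefix.
		out = append(out, prefix.Masked().String())
	}
	return out, nil
}

func main() {
	cidrs, err := podCIDRsForNode(HostSpec{PodCIDRs: []string{"2001:db8:0:1::/64"}})
	fmt.Println(cidrs, err)
}
```

The example prefix is taken from the IPv6 documentation range and is purely illustrative.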

@hakman
Member

hakman commented Jul 26, 2025

> cc @hakman
>
> I think this is now uncontroversial (I hope). We assign podCIDRs to nodes if they are configured on the Host object. If users don't want to do that, they just don't set podCIDRs on the Host object.

Cool, I will take a look soon. 🚀

justinsb added 5 commits July 26, 2025 20:01
IPv6 brings some new complexities, particularly around IPAM.
While we do require CCM for IPv6, we should configure the appropriate CCM.
This is needed for bootstrapping the control plane,
because Host is a CRD, so it can't be registered until the control plane is running.

It's also quite nice because we might want to review the contents of the
host CRD, e.g. to verify the key out-of-band.
@justinsb
Member Author

/retest

@hakman
Member

hakman commented Jul 27, 2025

/test all

@kubernetes deleted a comment from k8s-ci-robot Jul 27, 2025
Member

@hakman hakman left a comment

Looks great, just a small nit.

@@ -36,6 +36,9 @@ type Host struct {
 type HostSpec struct {
     PublicKey     string `json:"publicKey,omitempty"`
     InstanceGroup string `json:"instanceGroup,omitempty"`
+
+    // PodCIDRs configures the IP ranges to be used for pods on this node/host.
+    PodCIDRs []string `json:"podCIDRs,omitempty"`
Member

Should this be added to the v1alpha3 API too? What about the main API?

type HostSpec struct {
    PublicKey     string `json:"publicKey,omitempty"`
    InstanceGroup string `json:"instanceGroup,omitempty"`
}

Member Author

Great call. But I propose doing it as a separate PR, because I think we are also missing a round-trip test that would have caught this!
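
To illustrate what such a round-trip test would catch, here is a rough, self-contained sketch. The types and conversion functions (`versionedSpec`, `hubSpec`, `toHub`, `fromHub`, `roundTrips`) are hypothetical stand-ins, not the kOps API types or generated conversions; the assumption is simply that one API version has gained `PodCIDRs` while the type it converts through has not.

```go
package main

import (
	"fmt"
	"reflect"
)

// versionedSpec stands in for the API version that gained the new field.
type versionedSpec struct {
	PublicKey     string
	InstanceGroup string
	PodCIDRs      []string
}

// hubSpec stands in for a type that is still missing the field.
type hubSpec struct {
	PublicKey     string
	InstanceGroup string
}

// toHub converts to the hub type; PodCIDRs has nowhere to go and is dropped.
func toHub(in versionedSpec) hubSpec {
	return hubSpec{PublicKey: in.PublicKey, InstanceGroup: in.InstanceGroup}
}

// fromHub converts back; PodCIDRs cannot be restored.
func fromHub(in hubSpec) versionedSpec {
	return versionedSpec{PublicKey: in.PublicKey, InstanceGroup: in.InstanceGroup}
}

// roundTrips reports whether a value survives versioned -> hub -> versioned.
func roundTrips(in versionedSpec) bool {
	return reflect.DeepEqual(in, fromHub(toHub(in)))
}

func main() {
	withoutCIDRs := versionedSpec{PublicKey: "key", InstanceGroup: "nodes"}
	withCIDRs := versionedSpec{PublicKey: "key", InstanceGroup: "nodes", PodCIDRs: []string{"2001:db8:0:1::/64"}}

	fmt.Println(roundTrips(withoutCIDRs)) // true
	fmt.Println(roundTrips(withCIDRs))    // false: the field is lost in conversion
}
```

A test built this way only fails for inputs that populate the new field, which is why round-trip tests usually fill every field of the input object.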

@hakman
Member

hakman commented Jul 27, 2025

/hold in case you want to update the other APIs (could also be a separate PR).
/lgtm
/approve

@k8s-ci-robot added the do-not-merge/hold label Jul 27, 2025
@k8s-ci-robot added the lgtm label Jul 27, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hakman

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label Jul 27, 2025
@hakman
Member

hakman commented Jul 27, 2025

/test pull-kops-e2e-k8s-aws-amazonvpc

@hakman
Member

hakman commented Jul 27, 2025

/test pull-kops-e2e-k8s-gce-cilium

@hakman
Member

hakman commented Jul 27, 2025

/test pull-kops-e2e-k8s-aws-calico

@hakman
Member

hakman commented Jul 27, 2025

/test pull-kops-e2e-k8s-aws-amazonvpc

@justinsb
Member Author

/hold cancel

I propose adding a round-trip test alongside fixing the missing field in v1alpha3.

@k8s-ci-robot removed the do-not-merge/hold label Jul 28, 2025
@k8s-ci-robot merged commit dad8167 into kubernetes:master Jul 28, 2025
27 checks passed
Labels: approved, area/api, area/documentation, area/kops-controller, area/nodeup, cncf-cla: yes, lgtm, size/XL