
[BUG] RKE1 node driver clusters don't work with external AWS cloud provider #43175

@kinarashah

Description


Issue:
One of the prerequisites of the external (out-of-tree) AWS cloud provider is that nodes follow its naming conventions.

When IP-based naming is used, nodes must be named after the instance IP followed by the regional domain name (ip-xxx-xxx-xxx-xxx.ec2.<region>.internal). See https://cloud-provider-aws.sigs.k8s.io/prerequisites/

For custom clusters this can be managed by passing --node-name, which sets hostname-override for kubelet and kube-proxy (see the sketch below). For node driver clusters, however, Rancher assigns the hostname based on the node name generated from the node pool name prefix. As a result, the cloud controller manager is unable to find the node by name: Error getting instance metadata for node addresses: error fetching node by provider ID:
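For custom clusters, the workaround above can look roughly like the following registration command. This is a sketch only: the Rancher server URL, token, CA checksum, and agent image tag are placeholders, and the role flags will differ per node.

```bash
# Resolve the instance's private DNS hostname from the EC2 metadata service.
NODE_NAME="$(curl -sf http://169.254.169.254/latest/meta-data/hostname)"

# Placeholder Rancher agent registration command; --node-name makes Rancher
# register the node under this name, which becomes the hostname-override
# passed to kubelet and kube-proxy.
docker run -d --privileged --restart=unless-stopped --net=host \
  -v /etc/kubernetes:/etc/kubernetes -v /var/run:/var/run \
  rancher/rancher-agent:<TAG> \
  --server https://<RANCHER_SERVER> --token <TOKEN> --ca-checksum <CHECKSUM> \
  --worker \
  --node-name "${NODE_NAME}"
```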

Fix:

  • Introduce a new cloud provider name, external-aws.
  • When the cloud provider is set to external-aws and useInstanceMetadataHostname is enabled, rke-tools fetches the hostname by querying the EC2 metadata service at http://169.254.169.254/latest/meta-data/hostname, and falls back to hostname -f if the metadata service returns an empty response. This name is then set as hostname-override for kubelet and kube-proxy (see the sketch after this list).
  • useInstanceMetadataHostname is disabled by default; users must enable it to get the behavior described above.
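
A minimal shell sketch of the hostname resolution described in the second bullet. This illustrates the documented behavior rather than the actual rke-tools code; the timeout value and variable name are arbitrary.

```bash
#!/bin/sh
# Prefer the hostname reported by the EC2 instance metadata service;
# fall back to `hostname -f` if the metadata service returns nothing.
NODE_NAME="$(curl -sf --connect-timeout 2 \
  http://169.254.169.254/latest/meta-data/hostname || true)"

if [ -z "${NODE_NAME}" ]; then
  NODE_NAME="$(hostname -f)"
fi

# The resolved name is what gets set as hostname-override for kubelet and kube-proxy.
echo "hostname-override=${NODE_NAME}"
```

With useInstanceMetadataHostname left at its default (disabled), this lookup does not happen and the existing naming behavior is unchanged.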

Metadata

Labels

  • area/aws
  • area/provisioning-rke1 (Provisioning issues with RKE1)
  • area/rke
  • kind/bug (Issues that are defects reported by users or that we know have reached a real release)
  • release-note (Note this issue in the milestone's release notes)
  • status/release-blocker
  • status/release-note-added
  • team/hostbusters (The team that is responsible for provisioning/managing downstream clusters + K8s version support)
