
[BUG] Worker nodes reconcile when a control plane node is deleted #39021

@sowmyav27

Description


Rancher Server Setup

  • Rancher version: 2.6-head (commit ID 4206f57)
  • Installation option (Docker install/Helm Chart): Helm
    • If Helm Chart, Kubernetes Cluster and version (RKE1, RKE2, k3s, EKS, etc): EKS v1.23.7-eks-4721010

Information about the Cluster

  • Kubernetes version: v1.24.4+k3s1 (downstream AWS node driver cluster with 1 etcd, 2 control plane, and 3 worker nodes)

User Information

  • What is the role of the user logged in? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom): admin

Describe the bug
When a control plane node is deleted in a k3s node driver cluster, the worker nodes also reconcile (drain and uncordon), even though only the control plane machine changed.

To Reproduce

  • Deploy a k3s node driver cluster
  • Edit the cluster and change the Upgrade Strategy to Worker Concurrency: 2, with Drain Nodes enabled for both worker and control plane nodes (see the config sketch after this list)
  • Save the changes
  • The cluster goes into a full reconcile; that behavior is tracked in a separate issue
  • Wait for the cluster to return to Active
  • Delete a control plane node: select the node's ellipsis menu and click Delete
  • The node is deleted and a replacement control plane node is provisioned
  • However, the worker nodes also reconcile
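
For reference, the upgrade strategy from step 2 maps roughly onto the following fields of the provisioning-v2 Cluster object. The cluster name is inferred from the machine names in the logs below, and the field names reflect my understanding of the rke.cattle.io upgrade-strategy API, so treat this as a sketch, not an exact manifest:

```yaml
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: sowmya-k3s-iam          # inferred from the machine names in the logs
  namespace: fleet-default
spec:
  kubernetesVersion: v1.24.4+k3s1
  rkeConfig:
    upgradeStrategy:
      controlPlaneConcurrency: "1"   # assumed default; not changed in the repro
      controlPlaneDrainOptions:
        enabled: true                # "Drain Nodes" for control plane nodes
      workerConcurrency: "2"         # Worker Concurrency: 2
      workerDrainOptions:
        enabled: true                # "Drain Nodes" for worker nodes
```

Deleting the control plane node from the UI should be equivalent to deleting the corresponding CAPI Machine object in Rancher's local cluster, e.g. `kubectl delete machines.cluster.x-k8s.io -n fleet-default <machine-name>`.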

Result

  • Rancher provisioning logs:


[INFO ] draining bootstrap node(s) sowmya-k3s-iam-pool1-595d8854cf-2gq4n: draining node
7:09:06 pm | [INFO ] configuring bootstrap node(s) sowmya-k3s-iam-pool1-595d8854cf-2gq4n: waiting for plan to be applied
7:09:28 pm | [INFO ] uncordoning bootstrap node(s) sowmya-k3s-iam-pool1-595d8854cf-2gq4n: error undraining machine sowmya-k3s-iam-pool1-595d8854cf-2gq4n: an error on the server ("unable to create impersonator account: error getting service account token: error getting secret: apiserver not ready") has prevented the request from succeeding (get nodes sowmya-k3s-iam-pool1-d43a7eea-28n59)
7:09:34 pm | [INFO ] configuring control plane node(s) sowmya-k3s-iam-pool2-6b68c77d5c-99m2s,sowmya-k3s-iam-pool2-6b68c77d5c-fhgcl
7:09:40 pm | [INFO ] draining control plane node(s) sowmya-k3s-iam-pool2-6b68c77d5c-99m2s: draining node
7:09:44 pm | [INFO ] configuring control plane node(s) sowmya-k3s-iam-pool2-6b68c77d5c-99m2s,sowmya-k3s-iam-pool2-6b68c77d5c-fhgcl
7:09:54 pm | [INFO ] uncordoning control plane node(s) sowmya-k3s-iam-pool2-6b68c77d5c-99m2s: waiting for uncordon to finish
7:09:56 pm | [INFO ] draining control plane node(s) sowmya-k3s-iam-pool2-6b68c77d5c-fhgcl: draining node
7:10:10 pm | [INFO ] configuring control plane node(s) sowmya-k3s-iam-pool2-6b68c77d5c-99m2s,sowmya-k3s-iam-pool2-6b68c77d5c-fhgcl
7:10:22 pm | [INFO ] uncordoning control plane node(s) sowmya-k3s-iam-pool2-6b68c77d5c-fhgcl: waiting for uncordon to finish
7:10:24 pm | [INFO ] sowmya-k3s-iam-pool2-6b68c77d5c-99m2s,sowmya-k3s-iam-pool2-6b68c77d5c-fhgcl
7:10:26 pm | [INFO ] draining worker node(s) sowmya-k3s-iam-pool3-c6fd664d7-9lz47,sowmya-k3s-iam-pool3-c6fd664d7-blkr4
7:10:28 pm | [INFO ] draining worker node(s) sowmya-k3s-iam-pool3-c6fd664d7-blkr4: draining node
7:10:34 pm | [INFO ] uncordoning worker node(s) sowmya-k3s-iam-pool3-c6fd664d7-9lz47: Node condition Ready is False., waiting for uncordon to finish
7:10:36 pm | [INFO ] draining worker node(s) sowmya-k3s-iam-pool3-c6fd664d7-blkr4,sowmya-k3s-iam-pool3-c6fd664d7-lchhr
7:11:06 pm | [INFO ] draining worker node(s) sowmya-k3s-iam-pool3-c6fd664d7-blkr4: draining node
7:11:10 pm | [INFO ] configuring worker node(s) sowmya-k3s-iam-pool3-c6fd664d7-blkr4: drain completed
7:11:12 pm | [INFO ] configuring worker node(s) sowmya-k3s-iam-pool3-c6fd664d7-blkr4,sowmya-k3s-iam-pool3-c6fd664d7-lchhr
7:11:14 pm | [INFO ] configuring worker node(s) sowmya-k3s-iam-pool3-c6fd664d7-blkr4: waiting for plan to be applied
7:11:16 pm | [INFO ] configuring worker node(s) sowmya-k3s-iam-pool3-c6fd664d7-blkr4: Node condition Ready is False., waiting for plan to be applied
7:11:18 pm | [INFO ] non-ready worker machine(s) sowmya-k3s-iam-pool3-c6fd664d7-blkr4: Node condition Ready is False.
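
For anyone reproducing this, a convenient way to watch the unexpected worker activity outside the provisioning log is to watch the CAPI machines and the per-machine plan secrets in Rancher's local cluster while the control plane node is deleted. The `rke.cattle.io/machine-plan` secret type is my recollection of how provisioning-v2 stores plans, so verify it on your setup:

```sh
# Watch machine churn while the control plane node is deleted; only the
# deleted machine and its replacement should change.
kubectl get machines.cluster.x-k8s.io -n fleet-default -w

# Watch the per-machine plan secrets; a write here means Rancher is
# re-applying a plan to that node, i.e. the node is reconciling.
kubectl get secrets -n fleet-default \
  --field-selector type=rke.cattle.io/machine-plan -w
```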


Expected Result

  • Worker nodes should not reconcile, upgrade, or drain
  • In an RKE2 cluster this does not happen: worker nodes are not reconciled, drained, or upgraded (see the verification sketch after this list)
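
A rough way to verify the expected behavior (and to compare k3s against RKE2) is to snapshot the plan secrets before and after deleting the control plane node; if the worker entries are unchanged, no worker reconcile happened. This leans on the same `rke.cattle.io/machine-plan` secret-type assumption as above, and resourceVersion bumps on any write to the secret, so it is a coarse signal:

```sh
# Snapshot plan-secret versions before the deletion.
kubectl get secrets -n fleet-default \
  --field-selector type=rke.cattle.io/machine-plan \
  -o custom-columns=NAME:.metadata.name,RV:.metadata.resourceVersion > before.txt

# Delete the control plane machine (placeholder name).
kubectl delete machines.cluster.x-k8s.io -n fleet-default <control-plane-machine>

# Snapshot again once the cluster is Active, then diff; only the deleted
# machine and its replacement should differ.
kubectl get secrets -n fleet-default \
  --field-selector type=rke.cattle.io/machine-plan \
  -o custom-columns=NAME:.metadata.name,RV:.metadata.resourceVersion > after.txt
diff before.txt after.txt
```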

SURE-6248
SURE-5826 (now closed)

Labels

QA/S, area/capr/rke2, area/k3s, area/provisioning-v2, area/rke2, internal, kind/bug, priority/2, release-note, status/need-design-review, status/release-note-added, team/hostbusters
