
[BUG] Provisioning an RKE2 cluster with the data/dir feature enabled causes snapshot restores to fail #46066

@susesgartner

Rancher Server Setup

  • Rancher version: 2.9-head
  • Installation option: Docker

Information about the Cluster

  • Kubernetes version: v1.29.6+rke2r1
  • Cluster Type: Downstream RKE2 Node Driver
  • Infrastructure Provider: AWS

User Information

  • User: Admin

Describe the bug
When performing an etcd snapshot restore after upgrading the RKE2 cluster's Kubernetes version, the cluster hangs and all pods on the control plane remain in a Pending state.

To Reproduce

  1. Create a downstream cluster with split roles and the following data directory configuration:
    - System-agent dir: /agent
    - Provisioning dir: /provisioning
    - K8s Distro: /k8s
  2. Take a snapshot
  3. Upgrade the Kubernetes version to 1.30
  4. Restore the snapshot

Result
The cluster hangs while attempting the restore.

Expected Result
The etcd restore completes successfully.

Metadata

Labels

  • QA/S
  • kind/bug: Issues that are defects reported by users or that we know have reached a real release
  • priority/0
  • release-note: Note this issue in the milestone's release notes
  • status/release-blocker
  • status/release-note-added
  • team/hostbusters: The team that is responsible for provisioning/managing downstream clusters + K8s version support
