Restoring Rancher from a backup taken on a migrated Rancher server fails with the error `unable to create new content in namespace cluster-fleet-default-anupamapostbkp because it is being terminated` #34518

@anupama2501

Description

Rancher Server Setup

  • Rancher version: v2.6.0-rc8
  • Installation option (Docker install/Helm Chart): Helm chart
    • If Helm Chart, Kubernetes Info:
      Cluster Type (RKE1, RKE2, k3s, EKS, etc.): RKE1
      Node Setup: 3 nodes, all roles
      Version: v1.21.4
  • Proxy/Cert Details: Self-signed

Information about the Cluster

  • Kubernetes version:
  • Cluster Type (Local/Downstream): Downstream
    • If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider): Node driver RKE1 and RKE2 clusters

Describe the bug
Restoring the Rancher server from a backup that was created on a Rancher server migrated to a new RKE cluster fails with the namespace-termination error above.

To Reproduce

  • On the Rancher server, create an RKE1 and an RKE2 node driver cluster [3 worker, 1 etcd, 1 cp node in each cluster].
  • Install the rancher-backup chart (version 2.0.0+up2.0.0-rc11) in the local cluster (example manifests for the git repo and backup/restore steps follow this list).
  • From Continuous Delivery, create a git repo - fleet1 [url: https://github.com/rancher/fleet-examples , path: multi-cluster/helm].
  • Take a backup of the Rancher server - bkp1.
  • Create another git repo - fleet2 [url: https://github.com/rancher/fleet-examples , path: single-cluster/helm].
  • Take another backup - bkp2 [reused after migration].
  • Create a new RKE1 cluster and install the rancher-backup chart, version 2.0.0+up2.0.0-rc11.
  • Restore the backup on the new RKE cluster and migrate Rancher to it. Ref: https://rancher.com/docs/rancher/v2.x/en/backups/v2.5/migrating-rancher/
  • All clusters come up fine. Verify there are no errors in the fleet logs on the downstream clusters and that the clusters in Continuous Delivery are all up and active.
  • Create a new RKE2 cluster on this new Rancher server.
  • Take a backup - bkp3 - once the cluster is up.
  • Perform a restore from backup bkp2 [from the steps above].
  • All clusters come up fine. Verify there are no errors in the fleet logs on the downstream clusters and that the clusters in Continuous Delivery are all up and active.
  • Perform another restore, this time with bkp3.
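
For reference, a minimal sketch of the objects behind the git repo and backup/restore steps above, applied with kubectl. This assumes the default `fleet-default` workspace and the default `rancher-resource-set` installed by the rancher-backup chart; the resource names (fleet1, bkp1, restore-bkp2) mirror the steps, the storage location is whatever was configured at chart install time, and the backup filename is a placeholder for the one reported in the Backup's status.

```sh
# Continuous Delivery git repo (step "fleet1"); assumes the default fleet-default workspace
kubectl apply -f - <<'EOF'
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: fleet1
  namespace: fleet-default
spec:
  repo: https://github.com/rancher/fleet-examples
  paths:
    - multi-cluster/helm
EOF

# One-time backup of the Rancher server (bkp1; bkp2/bkp3 follow the same shape)
kubectl apply -f - <<'EOF'
apiVersion: resources.cattle.io/v1
kind: Backup
metadata:
  name: bkp1
spec:
  resourceSetName: rancher-resource-set
EOF

# Restore from an earlier backup; the filename below is a placeholder for the
# file name the Backup reports in its status / the configured object store.
kubectl apply -f - <<'EOF'
apiVersion: resources.cattle.io/v1
kind: Restore
metadata:
  name: restore-bkp2
spec:
  backupFilename: bkp2-<timestamp>.tar.gz   # placeholder
  prune: false   # prune is set to false for the migration restore per the linked docs
EOF
```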

Result

  • The restore fails with the errors below, seen continuously in the logs of the backup operator (rancher-backup-xxx); a diagnostic sketch follows the screenshot.

[Screenshot: restore errors from the rancher-backup logs]
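
The error points at the fleet cluster namespace being stuck in Terminating. A rough way to confirm what is blocking its deletion (namespace name taken from the error in the title; adjust for your setup):

```sh
# Phase plus the conditions that explain why deletion has not finished
kubectl get namespace cluster-fleet-default-anupamapostbkp \
  -o jsonpath='{.status.phase}{"\n"}{.status.conditions}{"\n"}'

# Any finalizers still set on the namespace
kubectl get namespace cluster-fleet-default-anupamapostbkp \
  -o jsonpath='{.metadata.finalizers}{"\n"}{.spec.finalizers}{"\n"}'

# Resources left in the namespace that are keeping it in Terminating
kubectl api-resources --verbs=list --namespaced -o name \
  | xargs -n1 kubectl get -n cluster-fleet-default-anupamapostbkp --ignore-not-found
```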

  • The Rancher server retries multiple times; after some time it comes up, but the cluster created after migration on the new RKE cluster is stuck in the `Wait Check-In` state.
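
For the `Wait Check-In` state, a rough way to see which condition the cluster object is stuck on and whether the downstream agent is reconnecting after the restore; the cluster ID placeholder and the cattle-cluster-agent label are the usual defaults, adjust as needed:

```sh
# On the local (Rancher) cluster: list the management cluster objects and their state
kubectl get clusters.management.cattle.io

# Inspect the conditions of the stuck cluster (replace <cluster-id> with the cluster's ID)
kubectl describe clusters.management.cattle.io <cluster-id>

# On the affected downstream cluster: check whether the agent can reach the restored Rancher
kubectl -n cattle-system logs -l app=cattle-cluster-agent --tail=50
```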

[Screenshot: downstream cluster stuck in the Wait Check-In state]

Expected Result
The restore should succeed with no errors, and the downstream clusters should come up Active.

Additional Info:

The issue is seen for both RKE1 and RKE2 downstream clusters:

[Screenshot: the same error for the downstream clusters]

Related: #33954

SURE-5358

Labels

QA/L, feature/charts-backup-restore, kind/bug-qa, kind/feature, priority/1, release-note, status/release-blocker, team/area3
