[Bug]: Upgrade from 5.0.0 to edge causes zone-sync to fail #7864

@vepatel

Description

Version

edge

What Kubernetes platforms are you running on?

GKE Google Cloud

Steps to reproduce

  1. Deploy NIC v5.0.0 with zone-sync: true in the ConfigMap.
  2. Check that the headless service is created and that there are no resolver-related errors in the NIC logs.
  3. Upgrade NIC to the latest available edge build, keeping zone-sync: true in the ConfigMap.
  4. Check the NIC logs and observe that the DNS resolver fails to resolve the headless service:

2025/06/05 10:24:10 [error] 24#24: test-release-nginx-ingress-controller-replicaset-hl.default.svc.cluster.local could not be resolved (3: Host not found)
2025/06/05 10:24:16 [error] 24#24: test-release-nginx-ingress-controller-replicaset-hl.default.svc.cluster.local could not be resolved (3: Host not found)

A possible cause is a mismatch between the selector labels of the headless service and those of the new ReplicaSet, because the service does not get recreated during the upgrade. Removing the service manually fixes the issue: the controller then creates a new one with updated labels (specifically pod-template-hash).

replicaset:

Selector:       app.kubernetes.io/instance=test-release,app.kubernetes.io/name=nginx-ingress,pod-template-hash=65db4857d8

svc:

Selector:                 app.kubernetes.io/instance=test-release,app.kubernetes.io/name=nginx-ingress,pod-template-hash=7df8445684
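
To make the mismatch easier to spot, here is a minimal sketch that compares the two selector strings pasted above. It is only a diagnostic aid, not part of NIC: the helper names `parse_selector` and `selector_mismatches` are hypothetical, and it assumes the selectors are copied in the comma-separated `key=value` form shown in this report.

```python
from __future__ import annotations


def parse_selector(text: str) -> dict[str, str]:
    """Parse a 'k1=v1,k2=v2' selector string into a dict (hypothetical helper)."""
    parts = (item.partition("=") for item in text.strip().split(",") if item)
    return {key.strip(): value.strip() for key, _, value in parts}


def selector_mismatches(
    service: dict[str, str], workload: dict[str, str]
) -> dict[str, tuple[str, str | None]]:
    """Return service selector keys whose value differs from the workload's."""
    return {
        key: (value, workload.get(key))
        for key, value in service.items()
        if workload.get(key) != value
    }


# Selector strings copied from this report.
replicaset = parse_selector(
    "app.kubernetes.io/instance=test-release,"
    "app.kubernetes.io/name=nginx-ingress,"
    "pod-template-hash=65db4857d8"
)
svc = parse_selector(
    "app.kubernetes.io/instance=test-release,"
    "app.kubernetes.io/name=nginx-ingress,"
    "pod-template-hash=7df8445684"
)

for key, (svc_value, rs_value) in selector_mismatches(svc, replicaset).items():
    print(f"{key}: service selects {svc_value!r}, replicaset has {rs_value!r}")
```

With the values above, only pod-template-hash differs. Since a service selects only pods that carry every label in its selector, the stale hash would leave the headless service without matching pods from the new ReplicaSet, which is consistent with the "Host not found" errors.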

Labels: backlog, bug

Status: Done 🚀
