Cilium Operator does not retrieve lock if kube-apiserver is down #13185

@aanm

I'm not sure whether this only happens on the development machine.

Steps to reproduce the issue:

  1. Start Cilium and cilium-operator in the dev VM.
  2. sudo service kube-apiserver stop
  3. Wait ~10 seconds.
  4. sudo service kube-apiserver restart
  5. journalctl -fu cilium-operator

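After kube-apiserver comes back, the operator never resumes its operations. It fails to renew the lease, reports "Leader election lost", and does not try to acquire the resource lock again: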

Sep 15 07:50:37 k8s1 cilium-operator[5814]: level=debug msg="Controller func execution time: 218.122µs" name=restart-unmanaged-kube-dns subsys=controller uuid=d87cab03-f6a4-11ea-8208-080027b966a1
Sep 15 07:50:43 k8s1 cilium-operator[5814]: error retrieving resource lock default/cilium-operator-resource-lock: Get "https://192.168.33.11:6443/apis/coordination.k8s.io/v1/namespaces/default/leases/cilium-operator-resource-lock": context deadline exceeded
Sep 15 07:50:43 k8s1 cilium-operator[5814]: level=error msg="error retrieving resource lock default/cilium-operator-resource-lock: Get \"https://192.168.33.11:6443/apis/coordination.k8s.io/v1/namespaces/default/leases/cilium-operator-resource-lock\": context deadline exceeded" subsys=klog
Sep 15 07:50:43 k8s1 cilium-operator[5814]: level=warning msg="error retrieving resource lock default/cilium-operator-resource-lock: Get \"https://192.168.33.11:6443/apis/coordination.k8s.io/v1/namespaces/default/leases/cilium-operator-resource-lock\": context deadline exceeded" subsys=klog
Sep 15 07:50:43 k8s1 cilium-operator[5814]: level=info msg="error retrieving resource lock default/cilium-operator-resource-lock: Get \"https://192.168.33.11:6443/apis/coordination.k8s.io/v1/namespaces/default/leases/cilium-operator-resource-lock\": context deadline exceeded" subsys=klog
Sep 15 07:50:43 k8s1 cilium-operator[5814]: level=info msg="failed to renew lease default/cilium-operator-resource-lock: timed out waiting for the condition" subsys=klog
Sep 15 07:50:43 k8s1 cilium-operator[5814]: Failed to release lock: resource name may not be empty
Sep 15 07:50:43 k8s1 cilium-operator[5814]: level=error msg="Failed to release lock: resource name may not be empty" subsys=klog
Sep 15 07:50:43 k8s1 cilium-operator[5814]: level=warning msg="Failed to release lock: resource name may not be empty" subsys=klog
Sep 15 07:50:43 k8s1 cilium-operator[5814]: level=info msg="Failed to release lock: resource name may not be empty" subsys=klog
Sep 15 07:50:43 k8s1 cilium-operator[5814]: level=info msg="Leader election lost" operator-id=k8s1-sJIQbbPYPS subsys=cilium-operator
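
For discussion, here is a minimal sketch of the behaviour I would expect after "Leader election lost": the process goes back to campaigning for the lock instead of sitting idle. This is not the operator's actual code. `runElection`, `keepCampaigning`, `simulateOutage` and `errLeaseLost` are hypothetical names, and the election itself is simulated with the standard library only, so it says nothing about how the real lease client behaves.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errLeaseLost is a hypothetical sentinel standing in for
// "failed to renew lease ... Leader election lost".
var errLeaseLost = errors.New("leader election lost")

// runElection is a hypothetical stand-in for one blocking election attempt.
// It returns an error when leadership is lost and nil on a clean shutdown.
type runElection func(ctx context.Context) error

// keepCampaigning re-enters the election every time leadership is lost,
// waiting backoff between attempts, until run returns nil or ctx is cancelled.
// It reports how many attempts were made.
func keepCampaigning(ctx context.Context, run runElection, backoff time.Duration) (int, error) {
	attempts := 0
	for {
		attempts++
		if err := run(ctx); err == nil {
			return attempts, nil
		}
		select {
		case <-ctx.Done():
			return attempts, ctx.Err()
		case <-time.After(backoff):
			// Back off briefly, then campaign again.
		}
	}
}

// simulateOutage pretends the API server is unreachable for the first
// `failures` attempts and reachable afterwards.
func simulateOutage(failures int) (int, error) {
	calls := 0
	run := func(ctx context.Context) error {
		calls++
		if calls <= failures {
			return errLeaseLost
		}
		return nil
	}
	return keepCampaigning(context.Background(), run, time.Millisecond)
}

func main() {
	attempts, err := simulateOutage(2)
	fmt.Println(attempts, err) // prints "3 <nil>"
}
```

With two simulated failures the loop succeeds on the third attempt. In the log above, nothing comparable happens after "Leader election lost".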

Labels: kind/bug (This is a bug in the Cilium logic.), needs/triage (This issue requires triaging to establish severity and next steps.)