What happened:
Recently, we have seen CoreDNS pods panic without any probe failures. The pod logs show a panic and an error, yet neither the liveness nor the readiness probe fails. As a result, the pod stops serving DNS but is never restarted, leading to DNS failures within the cluster.
Pod liveness probe:
livenessProbe:
  failureThreshold: 5
  httpGet:
    path: /health
    port: 8080
    scheme: HTTP
  initialDelaySeconds: 60
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 5
Pod readiness probe:
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /ready
    port: 8181
    scheme: HTTP
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
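For reference, the probes above are plain HTTP GETs against port 8080 (/health) and port 8181 (/ready) of the pod. The following is a minimal sketch, not part of the original report, that polls those same endpoints so you can watch whether a pod that has already logged a panic still answers with 200 OK. The pod IP is a placeholder; substitute the address of the suspect CoreDNS pod.

// probewatch.go - sketch: poll the same endpoints the kubelet probes hit.
package main

import (
	"fmt"
	"net/http"
	"time"
)

// probe issues a single GET and returns the HTTP status or the error.
func probe(url string) string {
	client := &http.Client{Timeout: 5 * time.Second} // mirrors timeoutSeconds: 5
	resp, err := client.Get(url)
	if err != nil {
		return fmt.Sprintf("FAIL (%v)", err)
	}
	defer resp.Body.Close()
	return resp.Status
}

func main() {
	podIP := "10.0.0.10" // placeholder: IP of the CoreDNS pod under investigation
	for {
		fmt.Printf("%s health=%s ready=%s\n",
			time.Now().Format(time.RFC3339),
			probe(fmt.Sprintf("http://%s:8080/health", podIP)),
			probe(fmt.Sprintf("http://%s:8181/ready", podIP)))
		time.Sleep(10 * time.Second) // mirrors periodSeconds: 10
	}
}

If this keeps reporting 200 OK for a pod whose logs show a panic, it would confirm that the health endpoint keeps serving even though the pod is no longer handling DNS queries, which matches the behavior described above.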
What you expected to happen:
The CoreDNS liveness probe detects that the CoreDNS pod has failed, so the kubelet restarts it.
How to reproduce it (as minimally and precisely as possible):
We don't have an easy way to reproduce this. The problem may be triggered by flaky network connectivity to the master.
Anything else we need to know?:
No.
Environment:
- the version of CoreDNS: 1.6.7 and 1.6.9
- Corefile: See below
- logs, if applicable: Not available
- OS (e.g. cat /etc/os-release): Ubuntu 18.04.4 LTS
- Others: Failures seen on IBM Cloud Kubernetes Service clusters running Kubernetes versions 1.16 and 1.17.
Corefile:
.:53 {
    errors
    health {
        lameduck 10s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}
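To check the DNS-failure symptom independently of the probes, a query can be sent straight to CoreDNS on port 53, as configured in the server block above. The sketch below is illustrative only and assumes a cluster DNS Service address of 10.96.0.10 and the standard kubernetes.default Service name; adjust both for your cluster.

// dnscheck.go - sketch: resolve an in-cluster name directly against CoreDNS.
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

func main() {
	coreDNSAddr := "10.96.0.10:53" // placeholder: cluster DNS Service (or pod) address

	r := &net.Resolver{
		PreferGo: true,
		// Force all lookups to go to the CoreDNS address instead of /etc/resolv.conf.
		Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			return d.DialContext(ctx, network, coreDNSAddr)
		},
	}

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	addrs, err := r.LookupHost(ctx, "kubernetes.default.svc.cluster.local")
	if err != nil {
		fmt.Println("DNS lookup failed:", err)
		return
	}
	fmt.Println("resolved:", addrs)
}

A failing lookup here while /health still returns 200 OK would reproduce the mismatch reported in this issue: the pod is not serving DNS, but the liveness probe never fails and the pod is never restarted.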