Describe the bug
When the NGINX master process exits unexpectedly (e.g. the process is killed using kill -9 <master-process-pid>), system files generated by NGINX are not cleaned up.
This bug outlines the impact of unix socket files in /var/lib/nginx persisting after the NGINX master process exits unexpectedly.
Log output from NGINX Ingress Controller when the master process exits unexpectedly:
E1102 09:38:53.243649 1 main.go:501] nginx command exited with an error: signal: killed
I1102 09:38:53.243740 1 main.go:511] Shutting down the controller
I1102 09:38:53.244035 1 main.go:521] Exiting with a status: 1
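For context on why the leftover files matter: a unix-domain bind() fails with "Address already in use" whenever the socket path still exists on disk, even if no process is listening on it. A minimal, self-contained Go sketch of that behaviour (illustrative only, not controller or NGINX code; the socket path is made up):

package main

import (
	"fmt"
	"net"
	"os"
)

func main() {
	const path = "/tmp/stale-demo.sock" // illustrative path, not one of the controller's sockets

	// First listener creates the socket file on disk.
	l, err := net.Listen("unix", path)
	if err != nil {
		panic(err)
	}

	// Simulate a process killed with SIGKILL: the socket file is left
	// behind because nothing gets a chance to unlink it.
	l.(*net.UnixListener).SetUnlinkOnClose(false)
	l.Close()

	// A fresh attempt to bind the same path now fails with
	// "address already in use", matching the [emerg] lines in the logs.
	if _, err := net.Listen("unix", path); err != nil {
		fmt.Println("rebind failed:", err)
	}

	os.Remove(path) // clean up the demo file
}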
To Reproduce
Steps to reproduce the behavior:
- Deploy all the necessary prerequisites outlined in the Installation with Manifests docs.
- Deploy the below Deployment manifest, which is configured with a volume of type emptyDir: {} and a volumeMount for /var/lib/nginx:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-ingress
  namespace: nginx-ingress
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-ingress
  template:
    metadata:
      labels:
        app: nginx-ingress
        app.kubernetes.io/name: nginx-ingress
    spec:
      serviceAccountName: nginx-ingress
      automountServiceAccountToken: true
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      volumes:
      - name: nginx-lib
        emptyDir: {}
      containers:
      - image: nginx/nginx-ingress:3.3.1
        imagePullPolicy: IfNotPresent
        name: nginx-ingress
        ports:
        - name: http
          containerPort: 80
        - name: https
          containerPort: 443
        - name: readiness-port
          containerPort: 8081
        - name: prometheus
          containerPort: 9113
        readinessProbe:
          httpGet:
            path: /nginx-ready
            port: readiness-port
          periodSeconds: 1
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
        securityContext:
          allowPrivilegeEscalation: false
          runAsUser: 101 #nginx
          runAsNonRoot: true
          capabilities:
            drop:
            - ALL
            add:
            - NET_BIND_SERVICE
        volumeMounts:
        - mountPath: /var/lib/nginx
          name: nginx-lib
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        args:
        - -nginx-configmaps=$(POD_NAMESPACE)/nginx-config
- Attach a debug container to the running NGINX Ingress Controller pod using kubectl debug -it -n <ic-namespace> <ic-pod> --image=busybox:1.28 --target=nginx-ingress
- Within the debug container, run ps -ef to get the process id of the NGINX master process.
- Stop the NGINX master process using kill -9 <master-process-pid>
- View the logs of the NGINX Ingress Controller pod and see NGINX fail to bind to unix sockets.
Expected behavior
NGINX Ingress Controller is able to recover and operate normally after exiting unexpectedly.
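One way to get that recovery (a rough sketch of the idea only, not the controller's actual code) would be to remove any stale socket files under /var/lib/nginx before starting or restarting the NGINX master process; the directory comes from the logs in this report, and removeStaleSockets is a hypothetical helper:

package main

import (
	"log"
	"os"
	"path/filepath"
)

// removeStaleSockets deletes leftover *.sock files in dir so a restarted
// NGINX master can bind() them again. Hypothetical helper, for illustration
// only; not part of the controller.
func removeStaleSockets(dir string) error {
	socks, err := filepath.Glob(filepath.Join(dir, "*.sock"))
	if err != nil {
		return err
	}
	for _, s := range socks {
		// Nothing is listening on these paths once the old master is gone,
		// so unlinking them is safe and lets the new master bind() cleanly.
		if err := os.Remove(s); err != nil && !os.IsNotExist(err) {
			return err
		}
		log.Printf("removed stale socket %s", s)
	}
	return nil
}

func main() {
	if err := removeStaleSockets("/var/lib/nginx"); err != nil {
		log.Fatalf("could not clean up stale sockets: %v", err)
	}
}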
Your environment
- Version of the Ingress Controller - 3.3.1
- Version of Kubernetes - 1.27.4
- Kubernetes platform - k3d
- Using NGINX or NGINX Plus - NGINX 1.25.2
Additional context
Full deployment manifest used:
Log output
NGINX Ingress Controller Version=3.3.1 Commit=0f828bb5f4159d7fb52bcff0159d1ddd99f16f87 Date=2023-10-13T16:23:42Z DirtyState=false Arch=linux/arm64 Go=go1.21.3
I1102 09:38:54.316209 1 flags.go:297] Starting with flags: ["-nginx-configmaps=nginx-ingress/nginx-config"]
I1102 09:38:54.320330 1 main.go:236] Kubernetes version: 1.27.4
I1102 09:38:54.328891 1 main.go:382] Using nginx version: nginx/1.25.2
I1102 09:38:54.337340 1 main.go:782] Pod label updated: nginx-ingress-64f9fcdb96-dpgsk
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-config-version.sock failed (98: Address already in use)
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-502-server.sock failed (98: Address already in use)
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-418-server.sock failed (98: Address already in use)
2023/11/02 09:38:54 [notice] 16#16: try again to bind() after 500ms
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-config-version.sock failed (98: Address already in use)
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-502-server.sock failed (98: Address already in use)
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-418-server.sock failed (98: Address already in use)
2023/11/02 09:38:54 [notice] 16#16: try again to bind() after 500ms
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-config-version.sock failed (98: Address already in use)
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-502-server.sock failed (98: Address already in use)
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-418-server.sock failed (98: Address already in use)
2023/11/02 09:38:54 [notice] 16#16: try again to bind() after 500ms
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-config-version.sock failed (98: Address already in use)
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-502-server.sock failed (98: Address already in use)
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-418-server.sock failed (98: Address already in use)
2023/11/02 09:38:54 [notice] 16#16: try again to bind() after 500ms
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-config-version.sock failed (98: Address already in use)
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-502-server.sock failed (98: Address already in use)
2023/11/02 09:38:54 [emerg] 16#16: bind() to unix:/var/lib/nginx/nginx-418-server.sock failed (98: Address already in use)
2023/11/02 09:38:54 [notice] 16#16: try again to bind() after 500ms
2023/11/02 09:38:54 [emerg] 16#16: still could not bind()
F1102 09:39:54.341336 1 manager.go:288] Could not get newest config version: could not get expected version: 0 after 1m0s
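For reference, the [notice] "try again to bind() after 500ms" lines show NGINX retrying the bind a handful of times before giving up, after which the controller aborts its one-minute wait for the config version socket. A rough Go illustration of that retry-then-give-up pattern (only the 500ms interval and the bind target are taken from the log; everything else is made up and this is not the real NGINX or controller code):

package main

import (
	"fmt"
	"net"
	"time"
)

// bindWithRetry mimics the pattern in the log: attempt to bind a unix
// socket, wait between failures, and give up after a fixed number of
// attempts. Hypothetical helper, for illustration only.
func bindWithRetry(path string, attempts int, delay time.Duration) (net.Listener, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		l, err := net.Listen("unix", path)
		if err == nil {
			return l, nil
		}
		lastErr = err
		fmt.Printf("bind() to unix:%s failed (%v), try again after %v\n", path, err, delay)
		time.Sleep(delay)
	}
	return nil, fmt.Errorf("still could not bind(): %w", lastErr)
}

func main() {
	if _, err := bindWithRetry("/var/lib/nginx/nginx-config-version.sock", 5, 500*time.Millisecond); err != nil {
		fmt.Println(err)
	}
}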