Labels: ci/flake, stale
Description
Test Name
K8sChaosTest Restart with long lived connections TCP connection is not dropped when cilium restarts
Failure Output
FAIL: Failed while cilium was restarting
Stacktrace
/home/jenkins/workspace/Cilium-PR-K8s-1.22-kernel-4.19/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:527
Failed while cilium was restarting
Expected command: kubectl exec -n default netperf-client -- netperf -l 60 -t TCP_STREAM -H 10.0.0.114
To succeed, but it failed:
Exitcode: 1
Err: exit status 1
Stdout:
Stderr:
error: unable to upgrade connection: pod does not exist
/home/jenkins/workspace/Cilium-PR-K8s-1.22-kernel-4.19/src/github.com/cilium/cilium/test/k8s/chaos.go:233
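For reference while triaging: the assertion at test/k8s/chaos.go:233 fails when the long-lived netperf TCP stream (the kubectl exec command above) does not survive the Cilium restart. The Go sketch below is not the Cilium test code; it is a minimal reproduction of that pattern, assuming the Cilium daemonset is named ds/cilium and that a rollout-status wait is an acceptable stand-in for the test's readiness check. Pod names, namespaces, the pod label, and the server IP 10.0.0.114 are taken from the log above.

```go
// Illustrative reproduction sketch only; not the code in test/k8s/chaos.go.
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// Start the long-lived netperf stream in the background, using the same
	// command that failed in the log above.
	netperf := exec.Command("kubectl", "exec", "-n", "default", "netperf-client",
		"--", "netperf", "-l", "60", "-t", "TCP_STREAM", "-H", "10.0.0.114")
	done := make(chan error, 1)
	go func() { done <- netperf.Run() }()

	// Restart Cilium while the stream is running, mirroring the STEP log
	// ("Deleting all cilium pods", then waiting for the pods to be ready).
	// The daemonset name ds/cilium and the 120s timeout are assumptions.
	steps := [][]string{
		{"kubectl", "-n", "kube-system", "delete", "pod", "-l", "k8s-app=cilium"},
		{"kubectl", "-n", "kube-system", "rollout", "status", "ds/cilium", "--timeout=120s"},
	}
	for _, args := range steps {
		if out, err := exec.Command(args[0], args[1:]...).CombinedOutput(); err != nil {
			fmt.Printf("restart step %v failed: %v\n%s\n", args, err, out)
			return
		}
	}

	// The test's expectation: the netperf stream survives the restart.
	if err := <-done; err != nil {
		fmt.Printf("FAIL: netperf dropped while cilium was restarting: %v\n", err)
		return
	}
	fmt.Println("PASS: TCP connection survived the cilium restart")
}
```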
Standard Output
Number of "context deadline exceeded" in logs: 0
Number of "level=error" in logs: 0
Number of "level=warning" in logs: 0
Number of "Cilium API handler panicked" in logs: 0
Number of "Goroutine took lock for more than" in logs: 0
No errors/warnings found in logs
Number of "context deadline exceeded" in logs: 0
Number of "level=error" in logs: 0
Number of "level=warning" in logs: 0
Number of "Cilium API handler panicked" in logs: 0
Number of "Goroutine took lock for more than" in logs: 0
No errors/warnings found in logs
Number of "context deadline exceeded" in logs: 0
Number of "level=error" in logs: 0
Number of "level=warning" in logs: 2
Number of "Cilium API handler panicked" in logs: 0
Number of "Goroutine took lock for more than" in logs: 0
Top 1 errors/warnings:
Session affinity for host reachable services needs kernel 5.7.0 or newer to work properly when accessed from inside cluster: the same service endpoint will be selected from all network namespaces on the host.
Cilium pods: [cilium-cpw82 cilium-hx64c]
Netpols loaded:
CiliumNetworkPolicies loaded:
Endpoint Policy Enforcement:
Pod Ingress Egress
Cilium agent 'cilium-cpw82': Status: Ok Health: Nodes "" ContinerRuntime: Kubernetes: Ok KVstore: Ok Controllers: Total 27 Failed 0
Cilium agent 'cilium-hx64c': Status: Ok Health: Nodes "" ContinerRuntime: Kubernetes: Ok KVstore: Ok Controllers: Total 29 Failed 0
Standard Error
16:38:57 STEP: Running BeforeAll block for EntireTestsuite K8sChaosTest Restart with long lived connections
16:38:57 STEP: WaitforPods(namespace="default", filter="-l zgroup=testapp")
16:39:01 STEP: WaitforPods(namespace="default", filter="-l zgroup=testapp") => <nil>
16:39:01 STEP: Deleting all cilium pods
16:39:02 STEP: Waiting cilium pods to terminate
16:39:02 STEP: Waiting for cilium pods to be ready
16:39:02 STEP: WaitforPods(namespace="kube-system", filter="-l k8s-app=cilium")
16:39:22 STEP: WaitforPods(namespace="kube-system", filter="-l k8s-app=cilium") => <nil>
16:39:34 STEP: Stopping netperf client test
FAIL: Failed while cilium was restarting
Expected command: kubectl exec -n default netperf-client -- netperf -l 60 -t TCP_STREAM -H 10.0.0.114
To succeed, but it failed:
Exitcode: 1
Err: exit status 1
Stdout:
Stderr:
error: unable to upgrade connection: pod does not exist
=== Test Finished at 2022-03-31T16:39:34Z====
16:39:34 STEP: Running JustAfterEach block for EntireTestsuite K8sChaosTest
===================== TEST FAILED =====================
16:39:35 STEP: Running AfterFailed block for EntireTestsuite K8sChaosTest
cmd: kubectl get pods -o wide --all-namespaces
Exitcode: 0
Stdout:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cilium-monitoring grafana-5747bcc8f9-9zr6n 0/1 Running 0 31m 10.0.0.133 k8s1 <none> <none>
cilium-monitoring prometheus-655fb888d7-4sc2j 1/1 Running 0 31m 10.0.0.33 k8s1 <none> <none>
default netperf-client 1/1 Running 0 38s 10.0.0.178 k8s2 <none> <none>
default netperf-server 1/1 Running 0 38s 10.0.0.114 k8s2 <none> <none>
kube-system cilium-cpw82 1/1 Running 0 33s 192.168.56.11 k8s1 <none> <none>
kube-system cilium-hx64c 1/1 Running 0 33s 192.168.56.12 k8s2 <none> <none>
kube-system cilium-operator-59b49c98df-f7bb9 1/1 Running 0 104s 192.168.56.12 k8s2 <none> <none>
kube-system cilium-operator-59b49c98df-jthc7 1/1 Running 0 104s 192.168.56.11 k8s1 <none> <none>
kube-system coredns-69b675786c-vzpm9 1/1 Running 0 3m36s 10.0.1.64 k8s1 <none> <none>
kube-system etcd-k8s1 1/1 Running 0 34m 192.168.56.11 k8s1 <none> <none>
kube-system kube-apiserver-k8s1 1/1 Running 0 34m 192.168.56.11 k8s1 <none> <none>
kube-system kube-controller-manager-k8s1 1/1 Running 0 34m 192.168.56.11 k8s1 <none> <none>
kube-system kube-proxy-2pdsp 1/1 Running 0 32m 192.168.56.12 k8s2 <none> <none>
kube-system kube-proxy-v68hc 1/1 Running 0 34m 192.168.56.11 k8s1 <none> <none>
kube-system kube-scheduler-k8s1 1/1 Running 0 34m 192.168.56.11 k8s1 <none> <none>
kube-system log-gatherer-fdnqx 1/1 Running 0 31m 192.168.56.11 k8s1 <none> <none>
kube-system log-gatherer-qsrjs 1/1 Running 0 31m 192.168.56.12 k8s2 <none> <none>
kube-system registry-adder-cq86j 1/1 Running 0 32m 192.168.56.12 k8s2 <none> <none>
kube-system registry-adder-tppbs 1/1 Running 0 32m 192.168.56.11 k8s1 <none> <none>
Stderr:
Fetching command output from pods [cilium-cpw82 cilium-hx64c]
cmd: kubectl exec -n kube-system cilium-cpw82 -c cilium-agent -- cilium service list
Exitcode: 0
Stdout:
ID Frontend Service Type Backend
1 10.96.0.1:443 ClusterIP 1 => 192.168.56.11:6443
2 10.97.199.142:3000 ClusterIP
3 10.96.0.10:53 ClusterIP 1 => 10.0.1.64:53
4 10.96.0.10:9153 ClusterIP 1 => 10.0.1.64:9153
5 10.104.143.48:9090 ClusterIP 1 => 10.0.0.33:9090
99 10.110.199.18:12865 ClusterIP 1 => 10.0.0.114:12865
Stderr:
cmd: kubectl exec -n kube-system cilium-cpw82 -c cilium-agent -- cilium endpoint list
Exitcode: 0
Stdout:
ENDPOINT POLICY (ingress) POLICY (egress) IDENTITY LABELS (source:key[=value]) IPv6 IPv4 STATUS
ENFORCEMENT ENFORCEMENT
201 Disabled Disabled 1 k8s:cilium.io/ci-node=k8s1 ready
k8s:node-role.kubernetes.io/control-plane
k8s:node-role.kubernetes.io/master
k8s:node.kubernetes.io/exclude-from-external-load-balancers
reserved:host
1254 Disabled Disabled 26044 k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=kube-system fd02::1b3 10.0.1.64 ready
k8s:io.cilium.k8s.policy.cluster=default
k8s:io.cilium.k8s.policy.serviceaccount=coredns
k8s:io.kubernetes.pod.namespace=kube-system
k8s:k8s-app=kube-dns
3025 Disabled Disabled 4 reserved:health fd02::18d 10.0.1.141 ready
Stderr:
cmd: kubectl exec -n kube-system cilium-hx64c -c cilium-agent -- cilium service list
Exitcode: 0
Stdout:
ID Frontend Service Type Backend
1 10.97.199.142:3000 ClusterIP
2 10.96.0.10:53 ClusterIP 1 => 10.0.1.64:53
3 10.96.0.10:9153 ClusterIP 1 => 10.0.1.64:9153
4 10.104.143.48:9090 ClusterIP 1 => 10.0.0.33:9090
5 10.96.0.1:443 ClusterIP 1 => 192.168.56.11:6443
113 10.110.199.18:12865 ClusterIP 1 => 10.0.0.114:12865
Stderr:
cmd: kubectl exec -n kube-system cilium-hx64c -c cilium-agent -- cilium endpoint list
Exitcode: 0
Stdout:
ENDPOINT POLICY (ingress) POLICY (egress) IDENTITY LABELS (source:key[=value]) IPv6 IPv4 STATUS
ENFORCEMENT ENFORCEMENT
77 Disabled Disabled 24018 k8s:id=netperf-client fd02::5a 10.0.0.178 ready
k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default
k8s:io.cilium.k8s.policy.cluster=default
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:zgroup=testapp
929 Disabled Disabled 3660 k8s:id=netperf-server fd02::95 10.0.0.114 ready
k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default
k8s:io.cilium.k8s.policy.cluster=default
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:zgroup=testapp
985 Disabled Disabled 4 reserved:health fd02::79 10.0.0.174 ready
2938 Disabled Disabled 1 k8s:cilium.io/ci-node=k8s2 ready
reserved:host
Stderr:
===================== Exiting AfterFailed =====================
16:39:46 STEP: Running AfterEach for block EntireTestsuite K8sChaosTest Restart with long lived connections
16:39:46 STEP: Running AfterEach for block EntireTestsuite
[[ATTACHMENT|a7642cde_K8sChaosTest_Restart_with_long_lived_connections_TCP_connection_is_not_dropped_when_cilium_restarts.zip]]
ZIP Links:
https://jenkins.cilium.io/job/Cilium-PR-K8s-1.22-kernel-4.19//909/artifact/a7642cde_K8sChaosTest_Restart_with_long_lived_connections_TCP_connection_is_not_dropped_when_cilium_restarts.zip
https://jenkins.cilium.io/job/Cilium-PR-K8s-1.22-kernel-4.19//909/artifact/test_results_Cilium-PR-K8s-1.22-kernel-4.19_909_BDD-Test-PR.zip
Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.22-kernel-4.19/909/
If this is a duplicate of an existing flake, comment 'Duplicate of #<issue-number>' and close this issue.