Open
Labels
area/CI: Continuous Integration testing issue or flake
ci/flake: This is a known failure that occurs in the tree. Please investigate me!
Description
CI failure
https://github.com/cilium/cilium/actions/runs/8093409030/job/22116011943
[.] Action [client-ingress-knp/client-to-client/ping-ipv6-5: cilium-test/client-69748f45d8-v7nx8 (fd00:10:244:3::e756) -> cilium-test/client3-868f7b8f6b-ld2bk (fd00:10:244:2::228d:0)]
❌ command "ping -c 1 -6 -W 2 -w 30 fd00:10:244:2::228d" succeeded while it should have failed: PING fd00:10:244:2::228d(fd00:10:244:2::228d) 56 data bytes
--- fd00:10:244:2::228d ping statistics ---
30 packets transmitted, 0 received, 100% packet loss, time 29712ms
Sysdump logs suggest that the k8s control plane was unresponsive:
Error reading Cilium logs: error getting cilium-agent logs for kube-system/cilium-n2c9d: Get "https://0.0.0.0:6443/api/v1/namespaces/kube-system/pods/cilium-n2c9d/log?container=cilium-agent&sinceTime=2024-02-29T09%3A04%3A29Z&timestamps=true": dial tcp 0.0.0.0:6443: connect: connection refused
🔍 Collecting sysdump with cilium-cli version: v0.15.23, args: [connectivity test --include-unsafe-tests --collect-sysdump-on-failure --sysdump-hubble-flows-count=1000000 --sysdump-hubble-flows-timeout=5m --sysdump-output-filename cilium-sysdump-13-<ts> --junit-file cilium-junits/Setup & Test (13).xml --junit-property github_job_step=Run tests (13) --request-timeout 30s]
ℹ️ Failed to detect Cilium installation
ℹ️ Failed to detect Cilium operator
🔍 Collecting Kubernetes nodes
❌ Failed to create sysdump collector: failed to collect Kubernetes nodes: Get "https://0.0.0.0:6443/api/v1/nodes": dial tcp 0.0.0.0:6443: connect: connection refused
🔍 Collecting sysdump with cilium-cli version: v0.15.23, args: [connectivity test --include-unsafe-tests --collect-sysdump-on-failure --sysdump-hubble-flows-count=1000000 --sysdump-hubble-flows-timeout=5m --sysdump-output-filename cilium-sysdump-13-<ts> --junit-file cilium-junits/Setup & Test (13).xml --junit-property github_job_step=Run tests (13) --request-timeout 30s]
[[[[ SNIP, many more resources which could not be collected ]]]]
This is somewhat related to #22162, but I've observed it on ci-e2e, not on AKS or some other cloud provider. Additionally, the ping output is there in full, but the exit code was propagated incorrectly: ping returns non-zero for 100% packet loss, but we got err == nil.
The fix in cilium/cilium-cli#1857 is broken in that the output of ping for IPv6 doesn't match the output of ping for IPv4. Filed cilium/cilium-cli#2352.
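For illustration, here is a minimal sketch of an output-based check that does not rely on the exit code. It is not the code from cilium/cilium-cli#1857 or #2352; the function name pingReceived and the regular expression are assumptions made for this example. It only assumes the statistics summary line has the shape seen in the log above ("30 packets transmitted, 0 received, ..."), which does not contain the address and so should read the same for IPv4 and IPv6.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// summaryRe matches the ping statistics summary line, e.g.
// "30 packets transmitted, 0 received, 100% packet loss, time 29712ms".
// The optional "packets " covers variants that print "N packets received".
// Hypothetical pattern for this sketch, not taken from cilium-cli.
var summaryRe = regexp.MustCompile(`(\d+) packets transmitted, (\d+) (?:packets )?received`)

// pingReceived returns how many replies ping reported in its summary line.
// ok is false when no summary line could be found in the output.
func pingReceived(output string) (received int, ok bool) {
	m := summaryRe.FindStringSubmatch(output)
	if m == nil {
		return 0, false
	}
	n, err := strconv.Atoi(m[2])
	if err != nil {
		return 0, false
	}
	return n, true
}

func main() {
	// Output copied from the failing IPv6 action in this issue.
	out := "PING fd00:10:244:2::228d(fd00:10:244:2::228d) 56 data bytes\n" +
		"--- fd00:10:244:2::228d ping statistics ---\n" +
		"30 packets transmitted, 0 received, 100% packet loss, time 29712ms\n"

	received, ok := pingReceived(out)
	if ok && received == 0 {
		// Treat the command as failed even if err == nil was reported.
		fmt.Println("ping failed: no replies received")
	}
}
```

With a check like this, the test could treat "0 received" as a failed ping even when the exec layer reports err == nil.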