Closed
Labels
- area/agent: Cilium agent related.
- area/proxy: Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers.
- area/servicemesh: GH issues or PRs regarding servicemesh
- kind/bug: This is a bug in the Cilium logic.
- needs/triage: This issue requires triaging to establish severity and next steps.
Description
Is there an existing issue for this?
- I have searched the existing issues
What happened?
Cilium envoy crashed with:
[2024-11-29 11:16:48.630][51][error][envoy_bug] [cilium/network_policy.cc:612] envoy bug failure: !Thread::MainThread::isMainOrTestThread()
[2024-11-29 11:16:48.630][51][error][envoy_bug] [external/envoy/source/common/common/assert.h:38] stacktrace for envoy bug
[2024-11-29 11:16:48.631][51][error][envoy_bug] [external/envoy/source/common/common/assert.h:45] #0 UNKNOWN [0x55c3bd9d986c]
[2024-11-29 11:16:48.631][51][error][envoy_bug] [external/envoy/source/common/common/assert.h:45] #1 UNKNOWN [0x55c3bd9d4639]
[2024-11-29 11:16:48.631][51][error][envoy_bug] [external/envoy/source/common/common/assert.h:45] #2 UNKNOWN [0x55c3bd9d454f]
[2024-11-29 11:16:48.632][51][error][envoy_bug] [external/envoy/source/common/common/assert.h:45] #3 UNKNOWN [0x55c3bec48eac]
[2024-11-29 11:16:48.632][51][error][envoy_bug] [external/envoy/source/common/common/assert.h:45] #4 UNKNOWN [0x55c3bf4d7e59]
[2024-11-29 11:16:48.632][51][error][envoy_bug] [external/envoy/source/common/common/assert.h:45] #5 UNKNOWN [0x55c3bf7a2589]
[2024-11-29 11:16:48.632][51][error][envoy_bug] [external/envoy/source/common/common/assert.h:45] #6 UNKNOWN [0x55c3bf7a1341]
[2024-11-29 11:16:48.632][51][error][envoy_bug] [external/envoy/source/common/common/assert.h:45] #7 UNKNOWN [0x55c3bec71909]
[2024-11-29 11:16:48.632][51][error][envoy_bug] [external/envoy/source/common/common/assert.h:45] #8 UNKNOWN [0x55c3bf82b40e]
[2024-11-29 11:16:48.633][51][error][envoy_bug] [external/envoy/source/common/common/assert.h:45] #9 UNKNOWN [0x7f4e53674ac3]
[2024-11-29 11:16:48.634][51][critical][backtrace] [external/envoy/source/server/backtrace.h:127] Caught Aborted, suspect faulting address 0xd
[2024-11-29 11:16:48.635][51][critical][backtrace] [external/envoy/source/server/backtrace.h:111] Backtrace (use tools/stack_decode.py to get line numbers):
[2024-11-29 11:16:48.635][51][critical][backtrace] [external/envoy/source/server/backtrace.h:112] Envoy version: f09ed995abccd4d360c769d256a781f1874c2f3b/1.31.3/Distribution/RELEASE/BoringSSL
[2024-11-29 11:16:48.635][51][critical][backtrace] [external/envoy/source/server/backtrace.h:114] Address mapping: 55c3bd920000-55c3bfde7000 /usr/bin/cilium-envoy
[2024-11-29 11:16:48.635][51][critical][backtrace] [external/envoy/source/server/backtrace.h:121] #0: [0x7f4e53622520]
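Since every frame above is UNKNOWN, a first step toward symbolizing the trace is to turn the absolute addresses into offsets relative to the start of the cilium-envoy mapping reported in the log. The sketch below does only that arithmetic. It assumes the reported mapping starts at file offset 0 of a position-independent binary, which the log does not confirm, so the offsets are a starting point for a symbolizer rather than verified results. The helper names (`parse_mapping`, `frame_offsets`) are made up for illustration.

```python
import re

# Values copied verbatim from the log above.
MAPPING = "55c3bd920000-55c3bfde7000"  # /usr/bin/cilium-envoy
FRAMES = """\
#0 UNKNOWN [0x55c3bd9d986c]
#1 UNKNOWN [0x55c3bd9d4639]
#2 UNKNOWN [0x55c3bd9d454f]
#3 UNKNOWN [0x55c3bec48eac]
#4 UNKNOWN [0x55c3bf4d7e59]
#5 UNKNOWN [0x55c3bf7a2589]
#6 UNKNOWN [0x55c3bf7a1341]
#7 UNKNOWN [0x55c3bec71909]
#8 UNKNOWN [0x55c3bf82b40e]
#9 UNKNOWN [0x7f4e53674ac3]
"""

FRAME_RE = re.compile(r"#(\d+) UNKNOWN \[0x([0-9a-f]+)\]")


def parse_mapping(text):
    """Split 'start-end' (hex, no 0x prefix) into two integers."""
    start, end = (int(part, 16) for part in text.split("-"))
    return start, end


def frame_offsets(frames, mapping):
    """Return (frame index, absolute address, offset or None) per frame.

    The offset is None when the address lies outside the mapping,
    i.e. the frame belongs to some other object (for example a shared library).
    """
    start, end = parse_mapping(mapping)
    result = []
    for index, addr_hex in FRAME_RE.findall(frames):
        addr = int(addr_hex, 16)
        offset = addr - start if start <= addr < end else None
        result.append((int(index), addr, offset))
    return result


if __name__ == "__main__":
    for index, addr, offset in frame_offsets(FRAMES, MAPPING):
        shown = hex(offset) if offset is not None else "outside cilium-envoy mapping"
        print(f"#{index} {addr:#x} -> {shown}")
```

Under that assumption, frame #0 lands at offset 0xb986c in the binary, while frame #9 falls outside the cilium-envoy mapping altogether. The offsets could then be fed to a symbolizer together with a build that has debug symbols for the same commit; the log itself points at tools/stack_decode.py for this.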
How can we reproduce the issue?
I don't have clear reproduction steps at the moment. I configured TLS interception (the exact policies are in the sysdump), updated the certificate a few times, and issued a few curl requests. At some point one of the cilium envoy pods crashed with the above log. One possibly relevant detail: kind-worker3 (the node hosting the crashing envoy proxy) previously hosted one client, which had already terminated by that point.
Cilium Version
Recent tip of main: v1.17.0-dev-0aeeefb4431
Cilium envoy: tip of main (f09ed995abccd4d360c769d256a781f1874c2f3b)
Sysdump
cilium-sysdump-20241129-121901.zip
Code of Conduct
- I agree to follow this project's Code of Conduct