Closed
Ensure that the CLI doesn't flood the terminal with logs on failure during test setup/teardown #38182
Task
Labels
area/CI-improvement: Topic or proposal to improve the Continuous Integration workflow
area/cli: Impacts the command line interface of any command in the repository.
help-wanted: Please volunteer for this by adding yourself as an assignee!
Description
The following test encountered an error while doing some test setup/teardown operations:
https://github.com/cilium/cilium/actions/runs/13838016271/job/38718280531#step:37:545
The output looks like:
[.] Action [client-egress-l7-set-header/pod-to-pod-with-endpoints/curl-ipv6-2-auth-header-required: cilium-test-1/client3-795488bf5-v66d8 (fd00:10:244:2::af7e) -> curl-ipv6-2-auth-header-required (fd00:10:244:2::4467:8080)]
ℹ️ 📜 Deleting secret 'header-match' from namespace 'cilium-test-1'..
ℹ️ 📜 Deleting CiliumNetworkPolicy 'client-egress-l7-http-matchheader-secret' in namespace 'cilium-test-1' on cluster kind-kind..
ℹ️ Cilium agent kube-system/cilium-rlnmc logs since 2025-03-13 15:39:38.677825485 +0000 UTC m=+174.581938268:
Followed by over 30K lines of cilium-agent logs, followed by:
2025-03-13T16:03:27.7070182Z ❌ Error finalizing 'to-fqdns': timed out waiting for policy updates to be processed on Cilium agents: command failed (pod=kube-system/cilium-rlnmc, container=cilium-agent): command terminated with exit code 1: "Error: cannot list endpoints: Cilium API client timeout exceeded\n\n"
2025-03-13T16:04:00.5905603Z ⚠️ The following tasks failed, the sysdump may be incomplete:
2025-03-13T16:04:00.5907921Z ⚠️ [15] Collecting Cilium Egress Gateway policies: failed to collect Cilium Egress Gateway policies: the server could not find the requested resource (get ciliumegressgatewaypolicies.cilium.io)
2025-03-13T16:04:00.5910763Z ⚠️ [17] Collecting Cilium local redirect policies: failed to collect Cilium local redirect policies: the server could not find the requested resource (get ciliumlocalredirectpolicies.cilium.io)
2025-03-13T16:04:00.5913693Z ⚠️ [19] Collecting Cilium endpoint slices: failed to collect Cilium endpoint slices: the server could not find the requested resource (get ciliumendpointslices.cilium.io)
2025-03-13T16:04:00.5915967Z ⚠️ [34] Collecting the Hubble Relay configuration: failed to collect the Hubble Relay configuration: configmaps "hubble-relay-config" not found
2025-03-13T16:04:00.5917735Z ⚠️ [39] Collecting the Hubble cert-manager certificates: failed to collect certificates (v1): the server could not find the requested resource
2025-03-13T16:04:00.5919759Z ⚠️ [68] Collecting Tetragon PodInfo custom resources: failed to collect podinfo (v1alpha1): the server could not find the requested resource
2025-03-13T16:04:00.5921699Z ⚠️ [69] Collecting Tetragon tracing policies: failed to collect tracingpolicies (v1alpha1): the server could not find the requested resource
2025-03-13T16:04:00.5923643Z ⚠️ [70] Collecting Tetragon namespaced tracing policies: failed to collect tracingpoliciesnamespaced (v1alpha1): the server could not find the requested resource
2025-03-13T16:04:00.5925799Z ⚠️ [72] Collecting Helm metadata from the Tetragon release: failed to get the helm metadata from the release: unable to retrieve helm meta from release tetragon: release: not found
2025-03-13T16:04:00.5928064Z ⚠️ [74] Collecting Helm values from the Tetragon release: failed to get the helm values from the release: unable to retrieve helm value from release tetragon: release: not found
2025-03-13T16:04:00.5929548Z ⚠️ Please note that depending on your Cilium version and installation options, this may be expected
2025-03-13T16:04:00.5930181Z 🗳 Compiling sysdump
2025-03-13T16:04:03.0646708Z ✅ The sysdump has been saved to cilium-sysdump-2-20250313-160117.zip
2025-03-13T16:04:03.1739931Z 🟥 [cilium-test-2] test to-fqdns failed: setting up test: applying network policies: policies were not applied on all Cilium nodes in time: command failed (pod=kube-system/cilium-8d8zz, container=cilium-agent): command terminated with exit code 1: "Error: cannot list endpoints: Cilium API client timeout exceeded\n\n"
2025-03-13T16:04:03.1745393Z 🔍 Collecting sysdump with cilium-cli version: 8b282274, args: [connectivity test --log-code-owners --exclude-code-owners=@cilium/github-sec --test-concurrency=5 --test !seq-.* --include-unsafe-tests --collect-sysdump-on-failure --flush-ct --sysdump-hubble-flows-count=1000000 --sysdump-hubble-flows-timeout=5m --sysdump-output-filename cilium-sysdump-2-<ts> --junit-file cilium-junits/Setup & Test (2).xml --junit-property github_job_step=Run tests upgrade 2 (2)]
...
In this case, it would be better to limit the logs to errors and warnings, and/or to roughly 50 lines of context, to avoid flooding the output. This would make the failure easier to read in the test workflow run. Commit 7d20666 uses a similar strategy for the cilium status output.