Ensure that the CLI doesn't flood the terminal with logs on failure during test setup/teardown #38182

@joestringer

The following test encountered an error while doing some test setup/teardown operations:

https://github.com/cilium/cilium/actions/runs/13838016271/job/38718280531#step:37:545

The output looks like:

  [.] Action [client-egress-l7-set-header/pod-to-pod-with-endpoints/curl-ipv6-2-auth-header-required: cilium-test-1/client3-795488bf5-v66d8 (fd00:10:244:2::af7e) -> curl-ipv6-2-auth-header-required (fd00:10:244:2::4467:8080)]
  ℹ️  📜 Deleting secret 'header-match' from namespace 'cilium-test-1'..
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-l7-http-matchheader-secret' in namespace 'cilium-test-1' on cluster kind-kind..
  ℹ️  Cilium agent kube-system/cilium-rlnmc logs since 2025-03-13 15:39:38.677825485 +0000 UTC m=+174.581938268:

This is followed by over 30K lines of cilium-agent logs, and then:

2025-03-13T16:03:27.7070182Z   ❌ Error finalizing 'to-fqdns': timed out waiting for policy updates to be processed on Cilium agents: command failed (pod=kube-system/cilium-rlnmc, container=cilium-agent): command terminated with exit code 1: "Error: cannot list endpoints: Cilium API client timeout exceeded\n\n"
2025-03-13T16:04:00.5905603Z ⚠️ The following tasks failed, the sysdump may be incomplete:
2025-03-13T16:04:00.5907921Z ⚠️ [15] Collecting Cilium Egress Gateway policies: failed to collect Cilium Egress Gateway policies: the server could not find the requested resource (get ciliumegressgatewaypolicies.cilium.io)
2025-03-13T16:04:00.5910763Z ⚠️ [17] Collecting Cilium local redirect policies: failed to collect Cilium local redirect policies: the server could not find the requested resource (get ciliumlocalredirectpolicies.cilium.io)
2025-03-13T16:04:00.5913693Z ⚠️ [19] Collecting Cilium endpoint slices: failed to collect Cilium endpoint slices: the server could not find the requested resource (get ciliumendpointslices.cilium.io)
2025-03-13T16:04:00.5915967Z ⚠️ [34] Collecting the Hubble Relay configuration: failed to collect the Hubble Relay configuration: configmaps "hubble-relay-config" not found
2025-03-13T16:04:00.5917735Z ⚠️ [39] Collecting the Hubble cert-manager certificates: failed to collect certificates (v1): the server could not find the requested resource
2025-03-13T16:04:00.5919759Z ⚠️ [68] Collecting Tetragon PodInfo custom resources: failed to collect podinfo (v1alpha1): the server could not find the requested resource
2025-03-13T16:04:00.5921699Z ⚠️ [69] Collecting Tetragon tracing policies: failed to collect tracingpolicies (v1alpha1): the server could not find the requested resource
2025-03-13T16:04:00.5923643Z ⚠️ [70] Collecting Tetragon namespaced tracing policies: failed to collect tracingpoliciesnamespaced (v1alpha1): the server could not find the requested resource
2025-03-13T16:04:00.5925799Z ⚠️ [72] Collecting Helm metadata from the Tetragon release: failed to get the helm metadata from the release: unable to retrieve helm meta from release tetragon: release: not found
2025-03-13T16:04:00.5928064Z ⚠️ [74] Collecting Helm values from the Tetragon release: failed to get the helm values from the release: unable to retrieve helm value from release tetragon: release: not found
2025-03-13T16:04:00.5929548Z ⚠️ Please note that depending on your Cilium version and installation options, this may be expected
2025-03-13T16:04:00.5930181Z 🗳 Compiling sysdump
2025-03-13T16:04:03.0646708Z ✅ The sysdump has been saved to cilium-sysdump-2-20250313-160117.zip
2025-03-13T16:04:03.1739931Z   🟥 [cilium-test-2] test to-fqdns failed: setting up test: applying network policies: policies were not applied on all Cilium nodes in time: command failed (pod=kube-system/cilium-8d8zz, container=cilium-agent): command terminated with exit code 1: "Error: cannot list endpoints: Cilium API client timeout exceeded\n\n"
2025-03-13T16:04:03.1745393Z 🔍 Collecting sysdump with cilium-cli version: 8b282274, args: [connectivity test --log-code-owners --exclude-code-owners=@cilium/github-sec --test-concurrency=5 --test !seq-.* --include-unsafe-tests --collect-sysdump-on-failure --flush-ct --sysdump-hubble-flows-count=1000000 --sysdump-hubble-flows-timeout=5m --sysdump-output-filename cilium-sysdump-2-<ts> --junit-file cilium-junits/Setup & Test (2).xml --junit-property github_job_step=Run tests upgrade 2 (2)]
...

In this case, it would be better to filter the agent logs down to errors and warnings, and/or limit them to roughly 50 lines of context, so that they don't flood the output. That would make the actual failure much easier to find in the test workflow run. The following commit uses a similar strategy for the cilium status output: 7d20666
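As a rough illustration of the filtering idea, here is a minimal Go sketch. It is not the cilium-cli implementation: the function names `filterAgentLogs` and `isWarnOrError` are hypothetical, and matching on a `level=` field is an assumption about the agent's log format that would need checking against real output. It keeps only warning/error lines and caps the result at the most recent `maxLines` entries.

```go
package main

import "strings"

// isWarnOrError reports whether a log line looks like a warning or an error.
// ASSUMPTION: the agent emits a "level=<severity>" field on each line; the
// match is case-insensitive so both "level=warning" and "level=WARN" count.
func isWarnOrError(line string) bool {
	lower := strings.ToLower(line)
	for _, marker := range []string{"level=warn", "level=error", "level=fatal"} {
		if strings.Contains(lower, marker) {
			return true
		}
	}
	return false
}

// filterAgentLogs (hypothetical helper) keeps only warning/error lines from
// logs and returns at most the last maxLines of them. A maxLines value of
// zero or less disables the cap.
func filterAgentLogs(logs string, maxLines int) []string {
	var kept []string
	for _, line := range strings.Split(logs, "\n") {
		if isWarnOrError(line) {
			kept = append(kept, line)
		}
	}
	if maxLines > 0 && len(kept) > maxLines {
		// Keep the tail: the lines closest to the failure are usually
		// the most relevant ones.
		kept = kept[len(kept)-maxLines:]
	}
	return kept
}
```

With something like `filterAgentLogs(agentLogs, 50)`, the test output would contain at most 50 warning/error lines per agent, while the full logs would presumably still be available in the sysdump for anyone who needs them.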

Labels

area/CI-improvement: Topic or proposal to improve the Continuous Integration workflow
area/cli: Impacts the command line interface of any command in the repository.
help-wanted: Please volunteer for this by adding yourself as an assignee!
