
Sometimes when printing container logs ct will hang #471


Description

@nhudson

Is this a request for help?: No


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Version of Helm and Kubernetes:

helm version
version.BuildInfo{Version:"v3.9.1", GitCommit:"a7c043acb5ff905c261cfdc923a35776ba5e66e4", GitTreeState:"clean", GoVersion:"go1.17.5"}
kubectl version -o yaml
clientVersion:
  buildDate: "2022-05-03T13:46:05Z"
  compiler: gc
  gitCommit: 4ce5a8954017644c5420bae81d72b09b735c21f0
  gitTreeState: clean
  gitVersion: v1.24.0
  goVersion: go1.18.1
  major: "1"
  minor: "24"
  platform: darwin/amd64
kustomizeVersion: v4.5.4
serverVersion:
  buildDate: "2022-05-19T15:42:59Z"
  compiler: gc
  gitCommit: 4ce5a8954017644c5420bae81d72b09b735c21f0
  gitTreeState: clean
  gitVersion: v1.24.0
  goVersion: go1.18.1
  major: "1"
  minor: "24"
  platform: linux/arm64
ct version
Version:	3.7.0
Git commit:	1d3feac8e5ca55ccf11521ff297ccee7f09aa749
Date:		2022-07-27
License:	Apache 2.0

What happened:

Sometimes when ct is printing container logs, it hangs indefinitely and has to be killed.

What you expected to happen:

ct should not hang when printing container logs to stdout.

How to reproduce it (as minimally and precisely as possible):

This is pretty easy for me to reproduce, though I don't know about others. The best examples I can show are outputs from some of our GitHub Actions runs.

We maintain a Helm chart with many dependencies, and most of the time the log output of our main chart (TimescaleDB) will hang ct. You can see a good example of this scenario in this job run (make e2e). The job ended up failing not because the tests failed, but because the run exhausted its timeout at ~320 minutes. I can reproduce this locally and actually let it sit for a total of two days to see what would happen; ct never recovered and stayed stuck in this state until I forced an exit of the process (Ctrl-C).

Here is an example of ct's log output completing successfully with the same chart data: job (make e2e)

I have more examples, but the outcome doesn't really follow a pattern. A majority of the time ct will hang; sometimes it works fine when the job is re-queued, sometimes not.
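
For context, the e2e job drives ct roughly as follows. The flags and paths below are illustrative of how we invoke it (the real values live in our Makefile and CI config), not a verbatim copy:

# Roughly how our make e2e target invokes ct (paths and extra args are illustrative)
ct lint-and-install \
  --config ct.yaml \
  --charts charts/timescaledb-single \
  --helm-extra-args "--timeout 600s"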

Anything else we need to know:

There is another report of this happening in this issue: #332 (comment)

I did apply the kubectl-timeout option that was added in #360, but it does not help with this issue for me.
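
For reference, this is roughly how I applied it (the 300s value is just what I tried, and the flag form assumes the option name maps directly to a CLI flag):

# Illustrative invocation with the kubectl timeout raised
ct install \
  --config ct.yaml \
  --kubectl-timeout 300s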
