stdout metric collector failed #1576

@chenwenjun-github

Description

/kind bug

What steps did you take and what happened:
[image: screenshot of the error output]

I use a TFJob as the trial's job. The TFJob has one ps, one chief, and one worker, and the metrics collector is stdout. The metrics-logger-and-collector container sometimes fails and prints the error shown above; it does not happen on every run.
When it does happen, the worker's metrics can't be collected.

The error message comes from this code:
[image: screenshot of the source code]

Can you give me some advice on how to avoid this problem?

What did you expect to happen:
This error should not appear.

Environment:

  • Katib version (kfctl version): v0.10.1
  • Kubernetes version (use kubectl version): 1.13
