feature request: ability to automatically pause recurrent failing jobs #303

@markstokan

The Toolsmiths team ran into an issue today where Docker Hub began returning "500 Internal Server Error" responses to requests for a Docker image used in a task. We run some of our pipelines so often that we were creating containers faster than Concourse could tear them down, and eventually we hit our 250-container limit on two workers.

We understand that Concourse keeps "failed" containers around for post-mortem purposes, but this ended up biting us. We talked about this with Evan and determined that the most helpful feature in this scenario would be automatically pausing a job after a certain number of consecutive failures.
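
To make the request concrete, here is a rough sketch of the behavior we have in mind. It is not Concourse code: the class name `ConsecutiveFailurePauser`, the `threshold` parameter, and the job name are all hypothetical, and how Concourse would actually track builds and pause a job is left open.

```python
from collections import defaultdict


class ConsecutiveFailurePauser:
    """Hypothetical helper: flags a job for pausing after N failures in a row."""

    def __init__(self, threshold):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold          # assumed to be configurable per job
        self._failures = defaultdict(int)   # job name -> current failure streak
        self.paused = set()                 # jobs this helper has flagged

    def record(self, job, succeeded):
        """Record one build result; return True if the job should now be paused."""
        if succeeded:
            # Any success resets the streak, so only consecutive failures count.
            self._failures[job] = 0
            return False
        self._failures[job] += 1
        if self._failures[job] >= self.threshold:
            self.paused.add(job)
            return True
        return False


# Example: pause after 3 consecutive failures of a hypothetical job.
pauser = ConsecutiveFailurePauser(threshold=3)
results = [pauser.record("build-image", ok) for ok in (False, False, True, False, False, False)]
```

In this example the success in the middle resets the count, so the job is only flagged on the last of the three trailing failures. Once a job is paused, no new builds would start, so no new failed containers would pile up on the workers until someone investigates and unpauses it.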

Thanks.

Mark & Topher
