
Worker container crashing all the time: baggageclaim exited with error: Exit trace for group: api exited with error: listen tcp 127.0.0.1:7788: bind: address already in use #1969

@francoisruty


Hi there!

Concourse version: 3.7.0
Deployment method: docker-compose

Every day, all our builds freeze and stay stuck in the "pending" state forever because of the check-resources step.
If I try to check resources via the CLI, I get an internal server error (500).
If I log into the worker container and run "concourse worker", I get this:

tcp 127.0.0.1:7777: bind: address already in use"}}
Exit trace for group:
baggageclaim exited with error: Exit trace for group:
api exited with error: listen tcp 127.0.0.1:7788: bind: address already in use

Sometimes I get an additional error message saying that :53 is already in use too.
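
To see which of these ports are already taken inside the worker container, a minimal sketch like the one below could help. It assumes a Python 3 interpreter is available in the container, which may not be the case. The port list is taken from the error output above, and `port_in_use` is a hypothetical helper name, not part of Concourse.

```python
import errno
import socket

# Ports taken from the error output above; :53 is only reported sometimes.
PORTS = [7777, 7788, 53]


def port_in_use(port, host="127.0.0.1"):
    """Return True if binding host:port fails with 'address already in use'."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            # Any other failure (for example, permission denied) is re-raised.
            raise
        return False


if __name__ == "__main__":
    for port in PORTS:
        try:
            state = "in use" if port_in_use(port) else "free"
        except OSError as exc:
            # Binding to ports below 1024 usually requires root.
            state = "could not check ({})".format(exc.strerror)
        print("127.0.0.1:{}: {}".format(port, state))
```

If a port shows as in use, some process in the container is already listening on it, which would explain why a second "concourse worker" cannot bind to the same address.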

To work around the issue, here is my procedure:
1. Stop the containers (docker-compose stop).
2. Remove the containers (docker-compose rm).
3. Remove the worker container volumes (docker volume prune).
4. Start the containers again (docker-compose up -d).
5. Run a check-resource via the CLI. This does not work, because it stalls the worker.
6. Run fly prune-worker.
7. Restart the containers (docker-compose restart).
8. Run check-resource via the CLI again. This time it works, and the builds can start.
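
The procedure above can be sketched as a small script, so it does not have to be typed by hand each time. This is only a sketch under assumptions: the fly target, the pipeline/resource path, and the worker name are hypothetical placeholders, and the `-f` flags are added so the Docker commands do not prompt for confirmation. Note that `docker volume prune` removes every unused volume on the host, not only the worker's.

```python
import subprocess


def workaround_commands(target, resource, worker):
    """Build the command sequence for the manual workaround, in order.

    target, resource ("pipeline/resource") and worker are placeholders
    that must be replaced with values from the actual deployment.
    """
    check = ["fly", "-t", target, "check-resource", "-r", resource]
    return [
        ["docker-compose", "stop"],
        ["docker-compose", "rm", "-f"],
        ["docker", "volume", "prune", "-f"],
        ["docker-compose", "up", "-d"],
        check,  # expected to fail: it stalls the worker
        ["fly", "-t", target, "prune-worker", "-w", worker],
        ["docker-compose", "restart"],
        check,  # expected to succeed this time
    ]


def run_workaround(target, resource, worker):
    """Run each command in turn, continuing even if one of them fails."""
    for cmd in workaround_commands(target, resource, worker):
        print("+", " ".join(cmd))
        # check=False: the first check-resource is expected to fail.
        subprocess.run(cmd, check=False)
```

Calling `run_workaround("my-target", "my-pipeline/my-resource", "my-worker")` from the directory containing the docker-compose file would run the eight steps in sequence. It automates the workaround only and does nothing about the underlying port conflict.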

...aaaand the issue reappears a few hours later.
It has been a nightmare so far, and all our production is messed up because of this. I can't even debug any further, as I have no clue what the concourse binary is doing inside the containers or why it is having trouble binding to ports.

Any advice is welcome
