
Missing Depends_ON functionality during start process within docker-swarm #31333

@ozlatkov

This follows up on earlier discussions about Docker swarm mode in 1.13, which took place here:

docker/compose#4305 (comment)

and here:

#30404 (comment)

As recommended there, I am moving this to the docker/docker issue tracker.

As mentioned in those threads, I do understand the idea behind the "fault-tolerance" mechanism and that swarm takes care of restarting the container automatically.

However, as explained in the first thread, this is not quite the same concept when it comes to "initialization", where the "fault-tolerance" mechanism could (and actually does) cause more problems than it solves.
When everything is completely stopped and swarm starts containers that depend on each other, each one gets restarted because another one is still not running, and so on, so startup takes much longer instead of being faster.
The bigger problem is that the time needed becomes completely unpredictable, which makes it impossible to plan for a maintenance window, or even to have a rough idea of how long it takes to restart or restore the infrastructure after a failure. So no scheduling is possible at all.

Particularly when there are multiple containers (7 in my case) with more dependency logic than just waiting for, say, a single DB container (probably the trivial and most common case), the system can even enter a kind of "deadlock" state: one container gets restarted, and meanwhile another one is restarted because the first is not available; by then the first has booted and sees that the other one is not available, and so on.
On my end I waited for ~20 minutes while the swarm manager kept restarting containers, and the correct order was still not reached.

In fact it really depends on the app inside the container(s): it makes no sense to restart a container whose app simply takes longer to boot, as that only makes the start process take even longer. If startup instead followed the needed order, the overall time would decrease.

Having a mechanism for clear "dependency" configuration, like the one we used to have, would avoid this kind of issue altogether and would actually make the boot process faster than relying on automatic restarts and waiting until the proper order happens to be reached.
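To illustrate what ordered startup means here, below is a minimal sketch, not how swarm or compose actually implements anything. The service names and the `DEPENDS_ON` mapping are hypothetical; the sketch only shows that a declared dependency graph yields a deterministic start order, using Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical services; each one maps to the services it depends on.
DEPENDS_ON = {
    "db": [],
    "cache": [],
    "auth": ["db"],
    "api": ["db", "cache", "auth"],
    "web": ["api"],
}


def start_batches(depends_on):
    """Return groups of services in start order.

    Services in the same group have all their dependencies satisfied
    by earlier groups, so they could be started in parallel.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for step, batch in enumerate(start_batches(DEPENDS_ON), 1):
        print(f"step {step}: {', '.join(batch)}")
```

With a declared graph like this, a circular dependency is reported up front instead of showing up as an endless restart loop.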

Without being able to predict the behavior in general, schedule restarts during maintenance, or at least approximately plan how much time docker-swarm needs to start, real production operation would be close to impossible for any microservice environment more complex than the trivial case mentioned above, i.e. one with more dependency logic in it.
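As a rough sketch of why ordered startup is plannable: if each service starts as soon as its dependencies are up, the total startup time is the longest dependency chain, which can be computed ahead of time. The service names and boot times below are made-up assumptions, not measurements.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies and boot times in seconds (illustrative only).
DEPENDS_ON = {
    "db": [],
    "cache": [],
    "auth": ["db"],
    "api": ["db", "cache", "auth"],
    "web": ["api"],
}
BOOT_SECONDS = {"db": 30, "cache": 5, "auth": 20, "api": 15, "web": 10}


def ready_times(depends_on, boot_seconds):
    """Earliest time (seconds) at which each service is up.

    Assumes a service starts the moment all of its dependencies are up
    and then takes boot_seconds[name] to become available.
    """
    ready = {}
    # static_order() yields every service after all of its dependencies.
    for name in TopologicalSorter(depends_on).static_order():
        start = max((ready[dep] for dep in depends_on.get(name, [])), default=0)
        ready[name] = start + boot_seconds[name]
    return ready


if __name__ == "__main__":
    times = ready_times(DEPENDS_ON, BOOT_SECONDS)
    print(times)
    print("whole stack up after", max(times.values()), "seconds")
```

For these example numbers the longest chain is db, auth, api, web, so the whole stack is up after 75 seconds, a figure that can be put into a maintenance plan.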


    Labels

    area/stack, area/swarm, kind/enhancement
