API: Allow for container IDs to be forced through the remote API. #9854

aluzzardi · 2014-12-30T15:53:35Z

Forced IDs must be in the exact same format as the ones generated
automatically and be unique (as in, not already in use).

Signed-off-by: Andrea Luzzardi <aluzzardi@gmail.com>

aluzzardi · 2014-12-30T15:54:20Z

/cc @vieux @ibuildthecloud

vieux · 2014-12-30T20:18:18Z

LGTM

shykes · 2014-12-30T20:27:47Z

That's a pretty big change with potentially lots of implications. Without a compelling reason to make the change I am compelled to say no. What's the rationale?

jessfraz · 2014-12-30T20:41:21Z

If we implement this, it won't be long before someone makes a PR for --id to docker create...
But setting that aside (because we can't predict the future):
How can we trust users to generate unique IDs themselves without messing things up? If we implement this, people will use it.

shykes · 2014-12-30T20:49:11Z

I agree with Jess, however I suspect Andrea and Victor have a reasonable
requirement in mind, let's talk about that requirement then we can take
another look at the best solution.

ewindisch · 2014-12-30T21:17:08Z

It seems the worst that will happen is that users will get "Container ID is already in use" errors.

ValidateID is somewhat incomplete for container IDs, however. The GenerateRandomID function also excludes all-numeric IDs.

If these IDs are supposed to be considered global, we might want to consider that the all-numeric exclusion removes a significant chunk of otherwise valid IDs before we even consider the birthday problem.
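
The two checks mentioned above can be sketched as follows. This is a minimal illustration, not Docker's actual code: it assumes a full ID is 64 lowercase hex characters and that the all-numeric exclusion applies to a 12-character short form. Those lengths and every function name here are assumptions made for the example.

```python
import re
import secrets

FULL_ID_LEN = 64   # assumed length of a full container ID (hex characters)
SHORT_ID_LEN = 12  # assumed length of the truncated form shown to users

_HEX_ID = re.compile(r"[a-f0-9]{%d}" % FULL_ID_LEN)


def is_valid_format(container_id: str) -> bool:
    """Format check only: exactly FULL_ID_LEN lowercase hex characters."""
    return _HEX_ID.fullmatch(container_id) is not None


def has_numeric_short_form(container_id: str) -> bool:
    """True if the truncated ID consists only of decimal digits."""
    return container_id[:SHORT_ID_LEN].isdigit()


def is_acceptable_forced_id(container_id: str) -> bool:
    """Apply both the format check and the all-numeric exclusion."""
    return is_valid_format(container_id) and not has_numeric_short_form(container_id)


def generate_random_id() -> str:
    """Draw random IDs until one passes the all-numeric exclusion."""
    while True:
        candidate = secrets.token_hex(FULL_ID_LEN // 2)
        if not has_numeric_short_form(candidate):
            return candidate


def numeric_short_form_fraction() -> float:
    """Share of uniformly random hex IDs rejected by the exclusion."""
    return (10 / 16) ** SHORT_ID_LEN
```

Under these assumptions, a forced ID would need to pass `is_acceptable_forced_id`, not only the format check, to match what the generator can produce. `numeric_short_form_fraction()` evaluates to about 0.36% of uniformly random IDs.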

shykes · 2014-12-30T21:19:56Z

Eric, no need to dive too far into possible problems (and remediation)
until we establish a solid reason to even consider it.

vieux · 2014-12-30T21:38:02Z

Our use case is Swarm: we want a container to have the same ID whether you reach it through Swarm or go directly to a machine. Also, when we reschedule a container, the ID would stay the same, without us having to maintain a mapping.

@ibuildthecloud also seems very interested in this feature for his project(s).

shykes · 2014-12-30T21:41:43Z

We already discussed this and I already gave you my opinion: a "virtual container" in the swarm is not the same thing as the actual container mapped to it at any given time. They are fundamentally different objects, and they should have different IDs. Otherwise we expose ourselves to all sorts of edge cases and naming conflicts.

There will always be a mapping somewhere. It's not a good idea to sweep it under the rug. And I really want to keep a solid foundation of globally unique objects with an immutable ID. So I'm not supportive of this change, sorry.

ibuildthecloud · 2014-12-31T00:29:34Z

@shykes This PR doesn't have to be the solution, but I'll explain the issue we need solved. Here's the basic problem (@shykes, I know you've heard this before, but for everyone else...). Imagine you have a higher level system managing your containers (swarm being the perfect example). So I create a virtual container A in swarm and when I start the container on a host it is container Y. Then the host dies and I start virtual container A on a different host as container Z. I need to know that Y and Z are containers for the virtual A. So a mapping needs to be kept somewhere. There are two issues with keeping this mapping outside of Docker, the first technical, the second a usability concern.

The first issue is idempotency. Imagine I start virtual container A on the host and it creates container Y, but right after the container starts my code/agent/whatever dies before it can record Y (or you have a network partition, etc). Then I want to clean up or redo the action. If I look at the host and I have container Y running there, I have no way to know whether that container is really A or a container manually started by someone else. In order to make the start container operation idempotent, the easiest way (and the sanest that I know) is to set virtual container ID "A" on the container during create (in some fashion; Rancher currently abuses the container name, which is not a good solution). That way if something fails you can later list the containers and see that the newly created Y was in fact for virtual container A.

The second issue is that if virtual container A is running on a host and a user logs in and does a ps, how do they know that container Y is virtual container A? It would be nice if the user could easily correlate the two.

So there are a few possible solutions here.

1. Allow the caller to set the container ID. This is by far the simplest approach, but I understand the "don't f*#$ with my IDs" argument.
2. Add another specific field like UUID. This is how libvirt and kvm do it. KVM has an argument called -uuid that it doesn't really care about but is there just so that an external system can assign the ID for tracking.
3. Add arbitrary metadata to containers. I could then set something like docker run -m VIRTUAL_CONTAINER_ID=A.

One important requirement is that if I set the virtual container ID, I need to be able to efficiently look up the container by that ID. I can't do a docker ps and then inspect each one. This is also why setting the container ID is the simplest (but maybe most dangerous) solution.
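
The idempotency and lookup requirements described above can be sketched as follows. This is only an illustration under stated assumptions: `Host`, `ensure_container`, and the tracking-ID index are hypothetical in-memory stand-ins, not part of Docker or its API.

```python
from dataclasses import dataclass, field
import secrets


@dataclass
class Host:
    """Hypothetical in-memory stand-in for a single container host."""
    containers: dict = field(default_factory=dict)      # container ID -> tracking ID or None
    by_tracking_id: dict = field(default_factory=dict)  # tracking ID -> container ID

    def create(self, tracking_id=None):
        """Create a container, optionally tagged with a caller-supplied tracking ID."""
        if tracking_id is not None and tracking_id in self.by_tracking_id:
            raise ValueError("tracking ID already in use: %s" % tracking_id)
        container_id = secrets.token_hex(32)
        self.containers[container_id] = tracking_id
        if tracking_id is not None:
            self.by_tracking_id[tracking_id] = container_id
        return container_id

    def find(self, tracking_id):
        """Indexed lookup, so there is no need to list and inspect every container."""
        return self.by_tracking_id.get(tracking_id)


def ensure_container(host, tracking_id):
    """Idempotent create: reuse the container already tagged with tracking_id."""
    existing = host.find(tracking_id)
    if existing is not None:
        return existing
    return host.create(tracking_id)
```

If the caller crashes after `create` but before recording the result, calling `ensure_container(host, "A")` again returns the container that was already created for A instead of starting a second one. Containers started manually carry no tracking ID and are never mistaken for it.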

shykes · 2014-12-31T00:34:27Z

I think the sanest approach is option 3: arbitrary annotations, namespaced by calling application. It is much more future-proof than allowing the external caller to set the actual ID.

tianon · 2014-12-31T00:38:26Z

In option 1, what would happen if I clustered two machines together with
the same IDs on several unrelated containers? :)

ibuildthecloud · 2014-12-31T00:45:06Z

@tianon @shykes Personally I like option 3. Options 1 and 2 might just be easier to implement and wouldn't require long UI discussions :)

ibuildthecloud · 2014-12-31T00:56:45Z

If we go with 3, the important thing I'm trying to point out is that there's a real requirement for this. It's hard to sanely track Docker containers from an external system like Swarm or Rancher (unless you assume the management system is the sole owner of the box, but that is no fun...).

shykes · 2014-12-31T01:11:57Z

@ibuildthecloud I completely agree, option 3 is something we need for many reasons. We simply need a straightforward way to annotate every object in Docker, with simple namespacing so that different callers can annotate the same object without conflicts.
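
One way to picture namespaced annotations is the sketch below. It is an assumption-laden illustration: `AnnotationStore` and the namespace strings are hypothetical, and nothing here reflects how Docker ended up storing metadata.

```python
class AnnotationStore:
    """Hypothetical per-object annotation map keyed by (namespace, key)."""

    def __init__(self):
        self._data = {}

    def set(self, namespace, key, value):
        """Write a value under the caller's own namespace."""
        if not namespace or not key:
            raise ValueError("namespace and key must be non-empty")
        self._data[(namespace, key)] = value

    def get(self, namespace, key, default=None):
        """Read a value; other namespaces using the same key are not visible."""
        return self._data.get((namespace, key), default)

    def for_namespace(self, namespace):
        """Return only the annotations written under one namespace."""
        return {k: v for (ns, k), v in self._data.items() if ns == namespace}


# Two callers annotate the same object with the same key and do not collide.
annotations = AnnotationStore()
annotations.set("io.docker.swarm", "id", "42")
annotations.set("io.rancher", "id", "A")
```

Because the namespace is part of the key, each caller can pick key names freely without coordinating with the others.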

phemmer · 2014-12-31T06:12:38Z

Throwing in my vote for option 3 as well. 1 & 2 just feel wrong. I can think of several situations where they could cause issues.
Option 3 is actually the first thing that popped into my mind reading ibuildthecloud's use case, and I think it has numerous other benefits as well. One use case is that we're considering writing our own scheduler for swarm, and this arbitrary metadata would help the scheduler determine where to launch the container (for things like "this container needs a host with a GPU").

bfirsh · 2014-12-31T10:03:23Z

👍 Arbitrary metadata. We desperately need that anyway.

Here's a proposal, though I don't like the word "annotate". I might put together my own proposal that aggregates the dozens of issues that already exist about this.

thaJeztah · 2014-12-31T13:32:02Z

Fwiw; +1 on option 3. Meta-data keeps coming up in various issues.

@bfirsh I collected some issues the other day in #9841 (comment); that might save you some time collecting them.

aluzzardi · 2015-01-02T11:29:24Z

@shykes

This is the same concept as "Virtual IDs" except that they are physically mapped back to the container.

After using Virtual IDs ourselves without mapping, we found out that the user experience is really bad. Users will always end up accessing single nodes for various reasons, and the mapping just makes it a major hassle to do so.

This change does not alter the principle of globally unique objects with an immutable ID.

ibuildthecloud · 2015-01-03T05:07:17Z

@aluzzardi Most people seem to be on board with the idea of option 3. Do you think that is doable? So Swarm would have container id 42 and you would essentially docker run --label io.docker.swarm=id:42 ... which would create container xyz. So the user would then do a docker ps and see that container xyz has the swarm id of 42.

ibuildthecloud · 2015-01-03T06:19:09Z

Sorry, I couldn't help myself but create yet another meta data PR. #9882 If you think this is not helpful, I'll close it. We just need to move forward....

aluzzardi · 2015-01-05T15:20:08Z

@ibuildthecloud It would technically work, but usability wise it would be inconvenient.

For instance, running a docker ps on a node would yield "garbage" IDs (as in, not directly usable).

tianon · 2015-01-05T21:40:22Z

i.e., something like docker ps --filter swarm=some-swarm-id? Or docker ps --filter swarm-id=some-specific-swarm-container-id?

thaJeztah · 2015-01-05T21:52:02Z

@tianon I think Andrea means that the ID that is shown in docker ps is not the id that swarm uses to identify the container, which makes it confusing.

So something like docker ps --show-labels=swarm-id to make docker ps output a custom column containing the value of the swarm-id label for each container.
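
A rough sketch of what such a listing could look like is shown below. `render_listing` and the container dictionaries are hypothetical and only model the idea of an extra column per label. The --show-labels flag itself is a suggestion from this thread, not an existing option.

```python
def render_listing(containers, label_columns=()):
    """Return listing lines with one extra column per requested label."""
    header = ["CONTAINER ID"] + [name.upper() for name in label_columns]
    rows = [header]
    for container in containers:
        row = [container["id"][:12]]  # short form of the ID
        for name in label_columns:
            # Containers without the label show a placeholder.
            row.append(container.get("labels", {}).get(name, "-"))
        rows.append(row)
    # Pad each column to its widest cell so the columns line up.
    widths = [max(len(row[i]) for row in rows) for i in range(len(header))]
    return [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    ]


containers = [
    {"id": "ab" * 32, "labels": {"swarm-id": "42"}},
    {"id": "cd" * 32},  # started manually, no label
]
for line in render_listing(containers, ["swarm-id"]):
    print(line)
```

A user on a node would then see the cluster-level ID next to the local container ID, and containers started by hand would show a placeholder in that column.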

aluzzardi · 2015-01-06T15:08:16Z

@thaJeztah Indeed! I'm simply worried about the user experience and not ending up with a "monster" implementation to support this.

Although I agree that listing labels in docker ps might be useful for other use cases and may be a workable solution.

Whether it's Swarm or other clustering solutions, users will always end up SSH'ing to nodes directly (very common for small deployments, but it will also happen on larger deployments for debugging purposes). I want to make sure we fully support that use case and don't end up with a central black box.

jessfraz · 2015-01-06T18:31:38Z

So, in the interest of cleaning house and pushing everything to the edge, we are going to close this in favor of #9882, because the discussion has led to that being a better route.

API: Allow for container IDs to be forced through the remote API.

3928b4b

jamtur01 added the /project/doc label Dec 31, 2014

ibuildthecloud mentioned this pull request Jan 3, 2015

Proposal: One Meta Data to Rule Them All => Labels #9882

Merged

jessfraz closed this Jan 6, 2015

aluzzardi deleted the api-specify-id branch May 7, 2015 06:42
