API: Allow for container IDs to be forced through the remote API. #9854
Forced IDs must be in the exact same format as the ones generated automatically and be unique (as in, not already in use). Signed-off-by: Andrea Luzzardi <aluzzardi@gmail.com>
LGTM |
That's a pretty big change with potentially lots of implications. Without a compelling reason to make the change I am compelled to say no. What's the rationale? |
We implement this and it is not long before someone makes a PR for |
I agree with Jess, however I suspect Andrea and Victor have a reasonable On Tuesday, December 30, 2014, Jessie Frazelle notifications@github.com
|
It seems the worst that will happen is that users will get "Container ID is already in use" errors. ValidateID is somewhat incomplete for container IDs, however. The GenerateRandomID function also excludes all-numeric IDs. If these IDs are supposed to be considered global, we might want to consider that the all-numeric exclusion removes a significant chunk of otherwise valid IDs before we even consider the birthday problem. |
Eric, no need to dive too far into possible problems (and remediation) On Tue, Dec 30, 2014 at 1:17 PM, Eric Windisch notifications@github.com
|
Our use case would be swarm: having the same IDs through swarm and when you go directly to a machine. Also, when we reschedule a container, the ID stays the same, without us having to maintain a mapping. @ibuildthecloud also seems very interested in this feature for his project(s) |
We already discussed this and I already gave you my opinion: a "virtual container" in the swarm is not the same thing as the actual container mapped to it at any given time. They are fundamentally different objects, and they should have different IDs. Otherwise we expose ourselves to all sorts of edge cases and naming conflicts. There will always be a mapping somewhere. It's not a good idea to sweep it under the rug. And I really want to keep a solid foundation of globally unique objects with an immutable ID. So I'm not supportive of this change, sorry. |
@shykes This PR doesn't have to be the solution, but I'll explain the issue we need solved. Here's the basic problem (@shykes, I know you've heard this before, but for everyone else...). Imagine you have a higher-level system managing your containers (swarm being the perfect example). So I create a virtual container A in swarm, and when I start the container on a host it is container Y. Then the host dies and I start virtual container A on a different host as container Z. I need to know that Y and Z are containers for the virtual A, so a mapping needs to be kept somewhere. There are two issues with keeping this mapping outside of Docker: the first is technical, the second is a usability concern. The first issue is idempotency. Imagine I start virtual container A on the host and it creates container Y, but right after the container starts my code/agent/whatever dies before it can record Y (or you have a network partition, etc). Then I want to clean up or redo the action. If I look at the host and I have container Y running there, I have no way to know whether that container is really A or is a container manually started by someone else. In order to make the start container operation idempotent, the easiest way (and the sanest that I know) is to set virtual container ID "A" on the container during create (in some fashion; Rancher currently abuses the container name, which is not a good solution). That way, if something fails, you can later list the containers and see that the newly created X was in fact for virtual container A. The second issue is that if virtual container A is running on a host and a user logs in and does a ps, how do they know that container X is virtual container A? It would be nice if the user could easily correlate the two. So there are a couple solutions here.
One important requirement is that if I set the virtual container ID, I need to be able to efficiently look up the container by that ID. I can't do a docker ps and then inspect each one. This is also why setting the container ID is the simplest (but maybe most dangerous) solution. |
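The idempotency concern and the lookup requirement above can be illustrated with a small in-memory sketch. Everything here is hypothetical (the `host` and `container` types, the `ensure` method, the ID format) and it does not talk to Docker; it only shows why recording the virtual ID at create time, behind an index, makes a retried create safe.

```go
package main

import "fmt"

// container is a stand-in for a real container record on a host.
type container struct {
	ID        string
	VirtualID string // set by the managing system at create time
}

// host keeps an index from virtual ID to container, so a lookup does not
// require listing and inspecting every container.
type host struct {
	nextID      int
	byVirtualID map[string]*container
}

func newHost() *host {
	return &host{byVirtualID: make(map[string]*container)}
}

// ensure returns the container for virtualID, creating it only if no
// container with that virtual ID exists yet on this host.
func (h *host) ensure(virtualID string) (c *container, created bool) {
	if existing, ok := h.byVirtualID[virtualID]; ok {
		return existing, false
	}
	h.nextID++
	c = &container{ID: fmt.Sprintf("c%04d", h.nextID), VirtualID: virtualID}
	h.byVirtualID[virtualID] = c
	return c, true
}

func main() {
	h := newHost()
	first, created := h.ensure("A")
	fmt.Println(first.ID, created)

	// The agent died before recording the result, so it retries the create.
	again, created := h.ensure("A")
	fmt.Println(again.ID, created)
}
```

Because the virtual ID lives with the container, the retry finds the existing container instead of starting a second one, and a container started manually by someone else simply has no entry in the index.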
I think the sanest approach is option 3: arbitrary annotations, namespaced by calling application. It is much more future-proof than allowing the external caller to set the actual ID. |
In option 1, what would happen if I clustered two machines together with |
If we go with 3, the important thing I'm trying to point out is that there's a real requirement for this. It's hard to sanely track Docker containers from an external system like Swarm or Rancher. (unless you assume the management system is the sole owner of the box, but that is no fun...) |
@ibuildthecloud I completely agree, option 3 is something we need for many reasons. We simply need a straightforward way to annotate every object in Docker, with simple namespacing so that different callers can annotate the same object without conflicts. |
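As a rough illustration of namespaced annotations, the sketch below keys each annotation by the calling application first and by the annotation name second. The `annotations` type, its methods, and the key `virtual-id` are all hypothetical and are not taken from any Docker API; the point is only that two callers can use the same key on the same object without overwriting each other.

```go
package main

import "fmt"

// annotations maps namespace -> key -> value for a single object.
type annotations map[string]map[string]string

// set stores value under key within the caller's namespace.
func (a annotations) set(namespace, key, value string) {
	if a[namespace] == nil {
		a[namespace] = make(map[string]string)
	}
	a[namespace][key] = value
}

// get reads key from the given namespace only.
func (a annotations) get(namespace, key string) (string, bool) {
	v, ok := a[namespace][key]
	return v, ok
}

func main() {
	meta := annotations{}

	// Two management systems annotate the same container with the same key.
	meta.set("swarm", "virtual-id", "A")
	meta.set("rancher", "virtual-id", "1234")

	s, _ := meta.get("swarm", "virtual-id")
	r, _ := meta.get("rancher", "virtual-id")
	fmt.Println(s, r)
}
```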
Throwing in my vote for option 3 as well. 1 & 2 just feel wrong. I can think of several situations where they could cause issues. |
👍 Arbitrary metadata. We desperately need that anyway. Here's a proposal, though I don't like the word "annotate". I might put together my own proposal that aggregates the dozens of issues that already exist about this. |
Fwiw; +1 on option 3. Meta-data keeps coming up in various issues. @bfirsh; I collected some issues the other day here; #9841 (comment) might save you some time collecting |
This is the same concept as "Virtual IDs" except that they are physically mapped back to the container. After using Virtual IDs ourselves without mapping, we found out that the user experience is really bad. Users will always end up accessing single nodes for various reasons, and the mapping just makes it a major hassle to do so. This change does not alter the principle of globally unique objects with an immutable ID. |
@aluzzardi Most people seem to be on board with the idea of option 3. Do you think that is doable? So Swarm would have container id 42 and you would essentially |
Sorry, I couldn't help myself but create yet another meta data PR. #9882 If you think this is not helpful, I'll close it. We just need to move forward.... |
@ibuildthecloud It would technically work, but usability wise it would be inconvenient. For instance, running a |
ie, something like |
@tianon I think Andrea means that the So something like |
@thaJeztah Indeed! I'm simply worried about the user experience and not ending up with a "monster" implementation to support this. Although I agree that listing labels in Whether it's Swarm or other clustering solutions, users will always end up SSH'ing to nodes directly (very common for small deployments, but it will also happen on larger deployments for debugging purposes). I want to make sure we fully support that use case and don't end up with a central black box. |
So in favor of cleaning house and pushing everything to the edge, we are going to close this in favor of #9882, because the discussion has led to that being a better route. |