# Proposal: using registry images for build caching #24711
## Conversation
@KJTsanaktsidis I think you linked the wrong issue number in your post; I think it's #20316, not 23016. I corrected it, please correct it again if I'm wrong :-)
* People using disposable VMs in development who rebuilt docker images when the VM came up
## Image save/load based solution
Following the discussion about this issue in #23016, #21385 was merged which kept parent references when exporting/importing images with `docker save` & `docker load`. Instead of doing a registry pull prior to a build, this functionality allows saving a built image and loading it onto the next machine to build that image, roughly emulating the behaviour of the previous push/pull workaround. However, this solution has some problems (some of which were also issues with the push/pull workaround) -
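For concreteness, the save/load workflow described above looks roughly like this; image and file names are placeholders, and this requires a Docker daemon with the #21385 behaviour:

```shell
# Hypothetical CI flow (myapp and myapp.tar are placeholder names).
# On the first build machine:
docker build -t myapp:latest .
docker save -o myapp.tar myapp:latest   # with #21385, parent references are kept

# Transfer myapp.tar to the next machine, then:
docker load -i myapp.tar
docker build -t myapp:latest .          # unchanged steps now hit the restored cache
```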
s/23016/20316
Whoops - yeah that one. People at work joke that I'm allergic to copy/paste...
ping @tonistiigi @stevvooe PTAL
@KJTsanaktsidis https://github.com/tonistiigi/buildcache is an incomplete but completely functional tool to save build cache for docker v1.11+ that doesn't have most of the problems listed in your description of the load/save method. The cache file is small (around 1KB) and doesn't have much overhead.

Regarding your interface example, I'm not sure I understand how this would work. How would a plugin get this data so it can make a decision like this? We do not want to implement an alternative distribution scheme for images. Also, it's impossible for a plugin to find an image based on just the runconfig (unless it only looks at local images like atm); if it did that, we would expose the security problem again, as anyone can fill this cache with the same runconfig but a completely different image. The cache can only come from a trusted location for a specific repo.
I am generally confused by this proposal. The cache poisoning issue is a real problem. The worst part about it is that it is not at all obvious whether you are affected or protected. It can only be mitigated by limiting the horizon of data that one trusts; anything that circumvents that protection undermines it. The other aspect to this is the misapplied assumption about the idempotence of shell commands. That said, I recognize that people rely on this behavior and that there are environments that can apply some amount of trust to a distributed build cache.
Why don't we just fix this rather than piling on a bunch of extra code? Furthermore, why don't we leverage the image registry in the process? Let's look at how we build an image with `docker build`.
In this simple case, we cannot assume that a remote image is a trustworthy cache source. What better way than to tell the build process that you can trust a related image?

```
docker build --cache-from mysuperapp:v0 -t mysuperapp:v1 .
```

The above would allow `mysuperapp:v0` to be used as a cache source. Does that solve it? No! We still have a problem: now my build system has to know the tag lineage (`mysuperapp:v0` precedes `mysuperapp:v1`).
In the above, we pull all the tags from the repository and allow any of them to be used as a cache source.
There are many possibilities here to make this more flexible, such as running a registry service specially for the purpose of build caching. We can take this even further, but I hope the point is brought home. This is probably less convenient than the original behavior, but it is a good trade-off. It leverages the existing infrastructure and has the possibility of being extended as use cases change.
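One way the "registry dedicated to build caching" idea could look, sketched with the standard `registry:2` image and the flag proposed above; host, port, and image names are placeholders, and `--cache-from` is the proposed (not yet existing) flag:

```shell
# Run a registry dedicated to build caching (port and names are placeholders)
docker run -d -p 5000:5000 --name buildcache registry:2

# After a successful build, publish the image to the cache registry
docker tag mysuperapp:v0 localhost:5000/mysuperapp:cache
docker push localhost:5000/mysuperapp:cache

# On another machine, seed the next build from the cache registry
docker pull localhost:5000/mysuperapp:cache
docker build --cache-from localhost:5000/mysuperapp:cache -t mysuperapp:v1 .
```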
@stevvooe v2/schema2 manifests have a single image configuration, so you don't get a chain of configurations on pull/push.
Discussed this more with @stevvooe. ping @aaronlehmann @dmcgowan
I agree with the proposed addition of `--cache-from`.
Not sure diverging from the current expected behavior could be done here. The command should be able to pull the `--cache-from` image and treat it like any other pulled image.

In terms of the POC implementation, as already mentioned, this should pull the `--cache-from` image if it does not exist locally, to be consistent with other commands. This would also need to do a trusted reference lookup using Docker Content Trust. Being able to do that makes this method much more secure and reliable than trying to resolve the cache in a daemon-side plugin.
I like the `--cache-from` idea. I agree with @dmcgowan that pulling all tags as a result of a build would be surprising.
@stevvooe - regarding the idempotency of shell commands, that's already an issue with local build caches. People already have to deal with this by, for example, explicitly specifying the versions they want, or running every nth build with `--no-cache`.

@tonistiigi - I definitely agree that having to implement an alternate distribution scheme for images is a bit of a drag, but I also didn't want to turn the registry into something it's not. If we can come to a solution that leverages the registry as-is, then that would involve substantially less code overall (between Docker & any hypothetical plugins) and I'd be all for it.

@tonistiigi - regarding your plan for implementing `--cache-from`:
What I don't understand is how we would get the Dockerfile command from the registry to compare against. I poked around at the HTTP response you get from the registry.
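As a rough sketch of the comparison being discussed: the image configuration that registries store contains a `history` array whose `created_by` strings record each build step, and a client could compare those against the steps of the current build. The types and matching rule below are a simplified illustration, not Docker's actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// historyEntry mirrors one entry of the image config's "history" array;
// only the field needed for this sketch is included.
type historyEntry struct {
	CreatedBy string
}

// matchesSteps checks whether the recorded history of a --cache-from
// image is a prefix match for the Dockerfile steps run so far,
// comparing the stringified commands.
func matchesSteps(history []historyEntry, steps []string) bool {
	if len(steps) > len(history) {
		return false
	}
	for i, step := range steps {
		// created_by strings are typically shell-wrapped, so compare by suffix
		if !strings.HasSuffix(history[i].CreatedBy, step) {
			return false
		}
	}
	return true
}

func main() {
	history := []historyEntry{
		{CreatedBy: "/bin/sh -c #(nop) ADD file:abc in /"},
		{CreatedBy: "/bin/sh -c apt-get update"},
	}
	steps := []string{"#(nop) ADD file:abc in /", "apt-get update"}
	fmt.Println(matchesSteps(history, steps)) // true
}
```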
@KJTsanaktsidis I guess at least initially we would use both methods so we don't break people. We only use the new method if `--cache-from` is specified.
Right - I wasn't sure whether access to the image configuration was possible through the registry. I'll try and find some time to update the proposal this week based on this discussion. Thanks a bunch for all your input!
Hey, just an update: I'm still planning to work on this, but I didn't manage to squeeze it in before I went away. I'm on holidays now for a couple of weeks, but I will keep working on this proposal when I get back. Sorry to break the momentum!
@tonistiigi @stevvooe PTAL - I've rewritten this as a proposal for how to implement `--cache-from`.
@KJTsanaktsidis Thanks for the help here!
@icecrime I've been hacking at a prototype this week while I've been away, and have mostly got all the tentacles in the right place and had a good go at the core algorithm too. I'll try and finish it on the plane this weekend and push it up here for review :) |
I think I can commit to implementing it and getting it merged though - I'll discuss with work on Monday and confirm this. |
```go
}

// Pluck out the newest layer from the potential matches
// Pointer so it can be nil
```
Instead of newest we should probably use the order specified by the user in `--cache-from`.
ping @tianon I remember you had an issue with how we stringify the command and potentially lose information during it. If you still see it as a problem, then now seems a good time to address it, before we make the build cache rely on this stringification.
Yeah! The discussion over in #22436 has some of the details of what I find funky about it. The way "build args" are stringified is definitely a pretty massive issue. Additionally, I think it's kind of strange that we still use the stringified command at all.
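The information loss being raised here can be shown directly: naively joining an argv with spaces maps distinct commands to the same cache key. The snippet below is an illustration of the failure mode, not Docker's actual stringification code:

```go
package main

import (
	"fmt"
	"strings"
)

// naiveKey joins argv with spaces, losing quoting/boundary information:
// two different commands can collide to the same cache key.
func naiveKey(argv []string) string {
	return strings.Join(argv, " ")
}

func main() {
	a := []string{"echo", "a b"} // one argument containing a space
	b := []string{"echo", "a", "b"} // two separate arguments
	fmt.Println(naiveKey(a) == naiveKey(b)) // true: distinct commands, same key
}
```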
```go
// ImageCacheForBuild represents a stateful image cache that is wound forward
// at each step of the build process, until either there is a cache miss or
// the image is built.
type ImageCacheForBuild interface {
```
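To make the shape of this stateful cache concrete, here is a toy, self-contained sketch. Everything except the interface name is invented for illustration and does not reflect the PR's actual types or method signatures:

```go
package main

import "fmt"

// buildConfig stands in for the daemon's runconfig for one build step
// (a hypothetical simplification of container.Config).
type buildConfig struct{ Cmd string }

// ImageCacheForBuild is sketched here as: given the parent image and the
// step's config, return a cached image ID if one is known.
type ImageCacheForBuild interface {
	GetCache(parentID string, cfg buildConfig) (imageID string, ok bool)
}

// mapCache is a toy in-memory implementation keyed on (parent, command);
// a real cache would compare full runconfigs, not a joined string.
type mapCache map[string]string

func (m mapCache) GetCache(parentID string, cfg buildConfig) (string, bool) {
	id, ok := m[parentID+"|"+cfg.Cmd]
	return id, ok
}

func main() {
	c := mapCache{"img1|RUN apt-get update": "img2"}
	id, ok := c.GetCache("img1", buildConfig{Cmd: "RUN apt-get update"})
	fmt.Println(id, ok) // img2 true
}
```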
Might be best to retain the naming of this interface and then add an `ImageCacheBuilder` or something.
All in all, this is looking promising. #24711 (comment) has me concerned: images should not be created as a result of using `--cache-from`.
@stevvooe Unless we diverge from the "every Dockerfile command creates an image / every command executes and then commits" rule of our builder, we have to create intermediate images, because they just don't exist after pull. The problem atm is that the cache function itself creates this image, which isn't very clean. Moving this creation to the builder pkg, where it would happen on cache miss, would add quite a lot of complexity though.
In general I kind of like the idea of not creating images that will serve no actual purpose after the build. Unfortunately, the way the builder is written at the moment kind of depends on the whole "Image + Dockerfile step -> Image" workflow. I can see an argument that building a Dockerfile should not create an intermediate image at each step. However, that change seems pretty orthogonal to how this works, especially once I factor out the relevant code between this and the existing local cache.
Thanks everybody for the really great feedback. I'll come around at this again in the next couple of days with an improved version taking all of the above into account.
@tianon If we plan to make the build cache rely on this stringification, we should address that issue first.
It does seem like a problem, because these images would be seen by other builds that don't use `--cache-from`.
@aaronlehmann What do you expect to happen when you use `--cache-from` and it doesn't match anything?
That seems fine, but I'm more concerned about the case where it does match something. That would create "fake" images that get matched by other builds that aren't using `--cache-from`.
@aaronlehmann I'm not sure I get why this is different or why these images are "fake". In both cases you get intermediate images. The only difference is whether the layer inside an intermediate image is shared with some other image or not.
@tonistiigi Let's put it this way: does using `--cache-from` change the cache that later builds without it will see? If we can't avoid this, we'll need to have a separate endpoint that populates the build cache and modifies the entire engine, such that builds after that call will use the new cache entries.
Every build creates cache entries for all the Dockerfile commands, independently of whether `--cache-from` was used.
@KJTsanaktsidis Are you still working on this? It's ok if you want us to carry this PR, as we hope to get it into 1.13.
@tonistiigi Unfortunately not - I tried to hassle management at work the other day for a week or so to get this mergeable, but it didn't fly. Heaps of operations issues to deal with at the moment... So I think I'm going to have to hand this over to you. I'd like to thank you all heaps for talking this over with me; I learned heaps about how Docker is put together.
opened #26839 |
Thank you very much for your help @KJTsanaktsidis! You didn't pick the easiest problem, and you did wonders ;-) Tonis will carry the PR but preserve your original commits so you still get the credit 👍
This commit adds a proof-of-concept implementation of the `--cache-from` feature outlined in moby#24711. The parts that have been implemented so far include:

* Accepting `--cache-from` as an argument to `docker build`, and passing it through to the builder
* Changing the ImageCache interface to be based on a new context per build, so that the cache can track build-specific things like "have I pulled this `--cache-from` image yet?"
* Adding the ability to use layers from registry images specified as `--cache-from` to the Daemon (conforming to the ImageCache interface). If a layer is found that can be arrived at by the same sequence of commands that the current build has run, it can create a new container to form part of the build's parent chain using the existing layer, without needing to run the command.

Signed-off-by: KJ Tsanaktsidis <kjtsanaktsidis@gmail.com>
- What I did
In this PR, I propose a plugin mechanism for the daemon to request
cached images from a plugin. I believe a plugin is appropriate here
because there are several different ways such sharing might work
depending on the lifecycle of the machines being used.
- How I did it
N/A
- How to verify it
N/A
- Description for the changelog
Proposal: plugin mechanism for sharing image cache between machines
- A picture of a cute animal (not mandatory but encouraged)

As has been discussed extensively in #20316, Docker now no longer
restores parent image chains when pulling from a registry, and people
who relied on this to cache builds across multiple machines have had
their build times blow out.
I'm PR'ing this to get some feedback from the maintainers & community on
this idea. If people agree that this is the way forward, I'd like to
start working on this. It doesn't appear to me that the implementation
of the plugin surface would be especially difficult or disruptive - but
I want your thoughts on this :)
EDIT: corrected the issue link