
Proposal: using registry images for build caching #24711


Closed

Conversation

KJTsanaktsidis

@KJTsanaktsidis KJTsanaktsidis commented Jul 16, 2016

- What I did
In this PR, I propose a plugin mechanism for the daemon to request
cached images from a plugin. I believe a plugin is appropriate here
because there are several different ways such sharing might work
depending on the lifecycle of the machines being used.

- How I did it
N/A

- How to verify it
N/A

- Description for the changelog
Proposal: plugin mechanism for sharing image cache between machines

- A picture of a cute animal (not mandatory but encouraged)
sant01-10a-penguinonseal

As has been discussed extensively in #20316 , Docker now no longer
restores parent image chains when pulling from a registry, and people
who relied on this to cache builds across multiple machines have had
their build times blow out.

I'm PR'ing this to get some feedback from the maintainers & community on
this idea. If people agree that this is the way forward, I'd like to
start working on this. It doesn't appear to me that the implementation
of the plugin surface would be especially difficult or disruptive - but
I want your thoughts on this :)

EDIT: correct the issue link

@GordonTheTurtle GordonTheTurtle added status/1-design-review dco/no Automatically set by a bot when one of the commits lacks proper signature labels Jul 16, 2016
@thaJeztah thaJeztah added area/distribution kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny labels Jul 16, 2016
@coolljt0725
Contributor

@KJTsanaktsidis I think you linked the wrong issue number in your post; I think it's #20316, not 23016. I corrected it, please fix it if I'm wrong :-)

* People using disposable VM's in development who rebuilt docker images when the VM came up

## Image save/load based solution
Following the discussion about this issue in #23016, #21385 was merged which kept parent references when exporting/importing images with `docker save` & `docker load`. Instead of doing a registry pull prior to a build, this functionality allows saving a built image and loading it onto the next machine to build that image, roughly emulating the behaviour of the previous push/pull workaround. However, this solution has some problems (some of which were also issues with the push/pull workaround) -
Contributor

s/23016/20316

Author

Whoops - yeah that one. People at work joke that I'm allergic to copy/paste...

@KJTsanaktsidis KJTsanaktsidis force-pushed the image_cache_plugin_proposal branch from fa3f086 to 7430401 Compare July 19, 2016 10:05
@GordonTheTurtle GordonTheTurtle removed the dco/no Automatically set by a bot when one of the commits lacks proper signature label Jul 19, 2016
@KJTsanaktsidis KJTsanaktsidis force-pushed the image_cache_plugin_proposal branch 2 times, most recently from 8be38c9 to fcc955a Compare July 19, 2016 10:09
@thaJeztah
Member

ping @tonistiigi @stevvooe PTAL

@tonistiigi
Member

@KJTsanaktsidis https://github.com/tonistiigi/buildcache is an incomplete but completely functional tool to save build cache for docker v1.11+ that doesn't have most of the problems listed in your description of the load/save method. The cache file is small (around 1KB) and doesn't have much overhead.

Regarding your interface example, I'm not sure I understand how this would work. How would a plugin get this data so it can make a decision like this? We do not want to implement an alternative distribution scheme for images. Also, it's impossible for a plugin to find an image based on just the runconfig (unless it only looks at local images, as it does at the moment); if it did that, we would expose the security problem again, as anyone could fill this cache with the same runconfig but a completely different image. The cache can only come from a trusted location for a specific repo.

@stevvooe
Contributor

stevvooe commented Aug 4, 2016

I am generally confused by this proposal. The cache poisoning issue is a real problem. The worst part about it is that it is not at all obvious whether you are affected or protected. It can only be mitigated by limiting the horizon of data that one trusts.

Anything that circumvents that protection, even docker save/load, is going to open your infrastructure up to injection of malicious content. Neither this proposal nor any of the proposals in #20316 addresses this problem. While we all want fast builds (and I really do), introducing cache poisoning into the build step of the infrastructure is not an acceptable trade-off. Could you imagine the impact if someone could just inject a malicious layer into library/ubuntu or library/alpine?

The other aspect to this is the misapplied assumption about the idempotence of shell commands. apt-get update run twice is never guaranteed to have the same result. Ever. That is just not how it works. If you have a build cache that is never purged, you will never update your upstream software. That may or may not be the intent. Even worse, if this build cache gets filled in with remote data, you probably have no visibility into when that command was run.

That said, I recognize that people rely on this behavior and there are environments that can apply some amount of trust to a distributed build cache.

> Docker now no longer restores parent image chains when pulling from a registry, and people who relied on this to cache builds across multiple machines have had their build times blow out.

Why don't we just fix this rather than piling on a bunch of extra code? Furthermore, why don't we leverage the image registry in the process?

Let's look at how we build an image, with FROM alpine at the top:

docker pull mysuperapp:v0
docker build -t mysuperapp:v1 .

In this simple case, we cannot assume that a remote mysuperapp:v0 and the ongoing build are related, since that would possibly introduce the cache poisoning scenario that we need to avoid. However, one may have local registry infrastructure that they know they can trust. While we can infer parentage (despite other assertions, this is still possible), we may not be able to trust that parentage from a build caching perspective for all registries. But, this build environment is special.

What better way than to tell the build process that you can trust a related image?

docker build --cache-from mysuperapp:v0 -t mysuperapp:v1 .

The above would allow Dockerfile commands to be satisfied from the entries of mysuperapp:v0 in the build of mysuperapp:v1. Job done!

No! We still have a problem. Now, my build system has to know tag lineage (`mysuperapp:v0` < `mysuperapp:v1`). Let's modify the meaning of the tagless reference to mean something slightly different:

docker pull -a mysuperapp
docker build --cache-from mysuperapp -t mysuperapp:v1 .

In the above, we pull all the tags from mysuperapp, any layer of which can satisfy the build cache. In practice, this probably is a little wide for most scenarios, so we can allow multiple --cache-from directives on the command line:

docker build --cache-from mysuperapp:v0 --cache-from mysuperapp:v1 -t mysuperapp:v2 .

There are many possibilities here to make this more flexible, such as running a registry service specially for the purpose of build caching (mybuildcache.internal/mysuperapp). Did you know that you can just run a registry and rsync the filesystem around without locking? You can also rsync from multiple sources and merge the result safely (kind of). Such a registry can be purged periodically (or someone could submit a PR to purge old data).

We can take this even further, but I hope the point is brought home. This is probably less convenient than the original behavior, but it is a good trade-off. It leverages the existing infrastructure and has the possibility of being extended as use cases change.

@tonistiigi
Member

@stevvooe v2/schema2 manifests have a single image configuration, so you don't get a chain of configurations on pull/push.

@tonistiigi
Member

Discussed more with @stevvooe about the --cache-from proposal, and I think it should be possible to implement that even without having the parent chain locally, using only the array of history items in image configurations. That would mean a redo of our cache mechanism to not only compare image configurations but compare the Dockerfile commands directly as well. We would always generate a new image, but if we can match a command that changed the rootfs we can directly use the existing layer array instead. Definitely seems worth investigating further and maybe putting together a PoC.
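A rough sketch of that matching idea, with simplified stand-in types (the type names, fields, and function here are made up for illustration; the real daemon structures differ):

```go
package main

import "fmt"

// Hypothetical stand-ins for the real image-config types.
type HistoryEntry struct {
	CreatedBy  string // stringified Dockerfile command
	EmptyLayer bool   // true for metadata-only commands (ENV, CMD, ...)
}

type ImageConfig struct {
	History []HistoryEntry
	DiffIDs []string // layer digests, one per non-empty history entry
}

// matchCacheFrom walks the history of a --cache-from image and returns
// the layer diff IDs that can be reused for a build whose steps so far
// stringify to `commands`. It stops at the first mismatch (cache miss).
func matchCacheFrom(cfg ImageConfig, commands []string) []string {
	var reusable []string
	layerIdx := 0
	for i, cmd := range commands {
		if i >= len(cfg.History) || cfg.History[i].CreatedBy != cmd {
			break // cache miss: fall back to running the command
		}
		if !cfg.History[i].EmptyLayer {
			reusable = append(reusable, cfg.DiffIDs[layerIdx])
			layerIdx++
		}
	}
	return reusable
}

func main() {
	cfg := ImageConfig{
		History: []HistoryEntry{
			{CreatedBy: "/bin/sh -c #(nop) ADD file:abc in /"},
			{CreatedBy: "/bin/sh -c apt-get update"},
			{CreatedBy: `/bin/sh -c #(nop) CMD ["bash"]`, EmptyLayer: true},
		},
		DiffIDs: []string{"sha256:aaa", "sha256:bbb"},
	}
	steps := []string{
		"/bin/sh -c #(nop) ADD file:abc in /",
		"/bin/sh -c apt-get update",
	}
	fmt.Println(matchCacheFrom(cfg, steps)) // [sha256:aaa sha256:bbb]
}
```

The key point is that only the command strings and layer digests are compared; no parent image chain is needed locally.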

ping @aaronlehmann @dmcgowan

@dmcgowan
Member

dmcgowan commented Aug 8, 2016

I agree with the proposed addition of --cache-from; it is simple, and I think the correct solution here would require an explicit flag.

> Now, my build system has to know tag lineage (`mysuperapp:v0` < `mysuperapp:v1`). Let's modify the meaning of the tagless reference to mean something slightly different:

Not sure diverging from the current expected behavior could be done here. The command should be able to pull the cached-from image, and doing the equivalent of pull -a automatically would not be ideal for the unsuspecting user and has the potential to slow down builds much more. I would like to hear from users what the challenges may be in figuring out the build cache tag, since there is a variety of tagging strategies.

In terms of the POC implementation, as already mentioned, this should pull the cached image if it does not exist, to be consistent with other commands. It would also need to do a trusted reference lookup using Docker Content Trust. Being able to do that makes this method much more secure and reliable than trying to resolve the cache in a daemon-side plugin.

@aaronlehmann
Contributor

I like the --cache-from proposal. Definitely in favor of putting together a proof of concept.

I agree with @dmcgowan that pulling all tags as a result of a --cache-from argument may be unexpected behavior that slows down builds more than it helps. It may be better to only pull the latest tag in when no tag is given, and allow users to explicitly use pull -a to warm the build cache when appropriate.

@KJTsanaktsidis
Author

@stevvooe - regarding the idempotency of shell commands, that's already an issue with local build caches. People already have to deal with this by, for example, explicitly specifying versions they want, or running every nth build with --no-cache

@tonistiigi - I definitely agree that having to implement an alternate distribution scheme for images is a bit of a drag, but I also didn't want to turn the registry into something it's not. If we can come to a solution that leverages the registry as-is, then that would definitely involve substantially less code overall (between Docker & any hypothetical plugins) and I'd be all for it.

@tonistiigi - regarding your plan for implementing --cache-from: Can I see if I've understood this correctly?

  • At the moment, Docker finds cached images by finding images whose parent is the previous build step and which share the same runconfig as the next step
  • The registry does not store parent image chains any more, but it does store a list of sha256 hashes of all the filesystem layers built up to create the image which has been tagged in the registry
  • You propose that instead of finding images as part of cache resolution, we try and find layers. To do this, we would compare the parent layer hash to the previous build step, and the Dockerfile command that generated the layer with the currently building Dockerfile.

What I don't understand is how we would get the Dockerfile command from the registry to compare. I poked around at the HTTP response you get from GET /v2/image_name/manifests/latest, and the Dockerfile information is only available in a field called "v1Compatibility", which doesn't seem promising. How would we get that information?

@tonistiigi
Member

@KJTsanaktsidis I guess at least initially we would use both methods so we don't break people. We would only use the new method if --cache-from is set and it points to an image with no parents. The Dockerfile command is listed in the v1Compatibility field if you look at the schema1 compatibility manifest. Since v1.10/registry 2.3 there is a history array in the image configuration that includes the Dockerfile commands. You can pull the manifest with this config using the Accept: application/vnd.docker.distribution.manifest.v2+json header and then fetch the config using the blob endpoints.
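For illustration, here's a rough sketch of pulling the Dockerfile commands out of such an image configuration blob once it has been fetched (the helper name is made up and the JSON is heavily abbreviated; the `created_by`/`empty_layer` field names come from the v1.10+ image config format):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal slice of the v1.10+ image configuration: we only care about
// the history array, which records one entry per Dockerfile command.
type imageConfig struct {
	History []struct {
		CreatedBy  string `json:"created_by"`
		EmptyLayer bool   `json:"empty_layer,omitempty"`
	} `json:"history"`
}

// commandsFromConfig extracts the stringified Dockerfile commands that a
// cache implementation would compare against the current build's steps.
func commandsFromConfig(raw []byte) ([]string, error) {
	var cfg imageConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	cmds := make([]string, 0, len(cfg.History))
	for _, h := range cfg.History {
		cmds = append(cmds, h.CreatedBy)
	}
	return cmds, nil
}

func main() {
	// Abbreviated example config blob, as fetched from the registry's
	// blob endpoint after requesting the schema2 manifest.
	raw := []byte(`{"history":[
		{"created_by":"/bin/sh -c #(nop) ADD file:deadbeef in /"},
		{"created_by":"/bin/sh -c apt-get update"},
		{"created_by":"/bin/sh -c #(nop) CMD [\"bash\"]","empty_layer":true}
	]}`)
	cmds, err := commandsFromConfig(raw)
	if err != nil {
		panic(err)
	}
	for _, c := range cmds {
		fmt.Println(c)
	}
}
```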

@KJTsanaktsidis
Author

Right - I wasn't sure if access to the v1Compatibility field would be frowned upon/deprecated. If that's data we're allowed to access then that all sounds good to me.

I'll try and find some time to update the proposal this week based on this discussion. Thanks a bunch for all your input!

@tonistiigi tonistiigi mentioned this pull request Aug 17, 2016
@KJTsanaktsidis
Author

Hey, just an update- I am still planning to work on this but I didn't manage to squeeze it in before I went away. I'm on holidays now for a couple of weeks but I will keep working on this proposal when I get back. Sorry to break the momentum!

@GordonTheTurtle GordonTheTurtle added the dco/no Automatically set by a bot when one of the commits lacks proper signature label Aug 20, 2016
@KJTsanaktsidis KJTsanaktsidis force-pushed the image_cache_plugin_proposal branch from 5aa0eef to 40eef94 Compare August 20, 2016 09:30
@GordonTheTurtle GordonTheTurtle removed the dco/no Automatically set by a bot when one of the commits lacks proper signature label Aug 20, 2016
@KJTsanaktsidis
Author

@tonistiigi @stevvooe PTAL - I've rewritten the proposal to be a proposal on how to implement --cache-from. I'll try and get a prototype done this week but we'll see how we go.

@KJTsanaktsidis KJTsanaktsidis changed the title Proposal: plugin mechanism for sharing image cache between machines Proposal: using registry images for build caching Aug 20, 2016
@icecrime
Contributor

@KJTsanaktsidis Thanks for the help here! The --cache-from proposal seems really valuable, and feels like something we should really try to get in for 1.13.0. Can you please let us know if you find time working on it? If not, we might ask one of the maintainers to take it over, and we'll ping you for review/feedback.

@KJTsanaktsidis
Author

@icecrime I've been hacking at a prototype this week while I've been away, and have mostly got all the tentacles in the right place and had a good go at the core algorithm too. I'll try and finish it on the plane this weekend and push it up here for review :)

@KJTsanaktsidis
Author

I think I can commit to implementing it and getting it merged though - I'll discuss with work on Monday and confirm this.

}

// Pluck out the newest layer from the potential matches
// Pointer so it can be nil
Member

Instead of newest we should probably use the order specified by the user in cache-from.

@tonistiigi
Member

ping @tianon I remember you had an issue with how we stringify the command and potentially lose information during it. If you still see it as a problem then now seems a good time to address it before we make the build cache rely on this stringification.

@tianon
Member

tianon commented Aug 29, 2016

Yeah! The discussion over in #22436 has some of the details of what I find funky about it. The way "build args" are stringified is definitely a pretty massive issue, since the formatted array we used to output (prefixed with |N followed by N number of array items) is now flattened and thus no longer parses correctly (ie, ["|2", "a=b c=d", "e=f g=h"] now stringifies the same as ["|2", "a=b", "c=d e=f g=h"] -- ie, a build-arg whose name is a and value is b c=d, etc, as one concrete example of where the current processing breaks down).

Additionally, I think it's kind of strange that we still use the # (nop) syntax when we could just re-create the raw Dockerfile instructions as-is. For example, RUN echo hi ought to translate directly to a "comment" for that line as RUN echo hi IMO (and the same for commands which only modify metadata without creating a container, such as CMD ["a", "b"] translating to CMD ["a","b"] or something instead of sh -c # (nop) CMD [a b] or whatever it does currently).

// ImageCacheForBuild represents a stateful image cache that is wound forward
// at each step of the build process, until either there is a cache miss or
// the image is built.
type ImageCacheForBuild interface {
Contributor

Might be best to retain the naming of this interface and then add a ImageCacheBuilder or something.

@stevvooe
Contributor

All in all, this is looking promising.

#24711 (comment) has me concerned. Images should not be created as a result of using --cache-from. The cache items satisfying --cache-from should only be valid for the duration of the build. Using that parameter should not result in any new images being created.

@tonistiigi
Member

@stevvooe Unless we diverge from the "every Dockerfile command creates an image / every command executes and then commits" rule of our builder, we have to create intermediate images because they just don't exist after pull. The problem at the moment is that the cache function itself creates this image, which isn't very clean. Moving this creation to the builder pkg, where it would happen on cache miss, would add quite a lot of complexity though.

@KJTsanaktsidis
Author

KJTsanaktsidis commented Aug 30, 2016

In general I kind of like the idea of not creating images that will serve no actual purpose after the build. Unfortunately, the way the builder is written at the moment kind of depends on the whole "Image + Dockerfile step -> Image" workflow.

I can see an argument that building a Dockerfile should not create an intermediate image at each step. However that change seems pretty orthogonal to how this works, especially once I factor out the relevant code between this and daemon.Commit. Both the build and the cache steps will then be using it, and we could make it do something else later.

@KJTsanaktsidis
Author

Thanks everybody for the really great feedback. I'll come around at this again in the next couple of days with an improved version taking all of the above into account.

@KJTsanaktsidis
Author

@tianon If we plan to make the CreatedBy in the history just the raw Dockerfile command, I can drill that through and check both?

@aaronlehmann
Contributor

> Unless we diverge from the "every Dockerfile command creates an image / every command executes and then commits" rule of our builder, we have to create intermediate images because they just don't exist after pull.

It does seem like a problem because these images would be seen by other builds that don't use --cache-from, unless I'm missing something. Altering the global build cache seems like an unexpected side effect, with potential security implications.

@tonistiigi
Member

@aaronlehmann What do you expect to happen when you use --cache-from and it doesn't match anything? I'd expect the build to behave the same way as it would if I didn't specify --cache-from.

@aaronlehmann
Contributor

That seems fine, but I'm more concerned about the case where it does match something. That would create "fake" images that get matched by other builds that aren't using --cache-from.

@tonistiigi
Member

@aaronlehmann I'm not sure I get why this is different or why these images are "fake". In both cases you get intermediate images. The only difference is whether the layer inside an intermediate image is shared with some other image or not.

@stevvooe
Contributor

@tonistiigi Let's put it this way: does --cache-from change the state of the engine, such that images that weren't previously in the cache are now present? Will subsequent calls to build not require --cache-from to get the same effect?

If we can't avoid this, we'll need to have a separate endpoint that populates the build cache and modifies the entire engine, such that builds after that call will use the new cache entries.

@tonistiigi
Member

> does --cache-from change the state of the engine, such that images that weren't previously in the cache are now present?

Every build creates cache entries for all the Dockerfile commands, independently of whether --cache-from was specified or whether any commands got a cache hit. At the moment we don't seem to be talking about modifying our current build cache methods or removing the concept of the parent chain. Maybe this is also something we want to do in the future, but I see these as totally independent questions. The current build cache method is that any locally built image always matches the cache if --no-cache wasn't specified. --cache-from itself doesn't change any engine state; building a new image does.

@tonistiigi
Member

@KJTsanaktsidis Are you still working on this? It's ok if you want us to carry this PR as we hope to get it into v1.13.

@KJTsanaktsidis
Author

@tonistiigi Unfortunately not - I tried to hassle management at work the other day for a week or so to get this mergeable, but it didn't fly. Heaps of operations issues to deal with at the moment... So I think I'm going to have to hand this over to you. I would like to thank you guys heaps for talking this over with me, I learned heaps about how Docker is put together.

@tonistiigi
Member

opened #26839

@tonistiigi tonistiigi closed this Sep 22, 2016
@icecrime
Contributor

Thank you very much for your help @KJTsanaktsidis! You didn't pick the easiest problem, and you did wonders ;-) Tonis will carry but preserve your original commits so you still get the credit 👍

tonistiigi pushed a commit to tonistiigi/docker that referenced this pull request Sep 23, 2016
This commit adds a proof-of-concept implementation of the --cache-from
feature outlined in moby#24711. The parts that have been implemented so far
include:

* Accepting `--cache-from` as an argument to `docker build`, and passing
  it through to the builder
* Changing the ImageCache interface to be based on a new context per
  build, so that the cache can track build-specific things like "have I
  pulled this `--cache-from` image yet?"
* Added the ability to use layers from registry images specified as
  `--cache-from` to the Daemon (conforming to the ImageCache interface).
  If a layer is found that can be arrived at by the same sequence of
  commands that the current build has run, it can create a new container
  to form part of the build's parent chain using the existing layer and
  without needing to run the command.

Signed-off-by: KJ Tsanaktsidis <kjtsanaktsidis@gmail.com>