
Proposal: using registry images for build caching #24711


Closed

Conversation

KJTsanaktsidis

@KJTsanaktsidis KJTsanaktsidis commented Jul 16, 2016

- What I did
In this PR, I propose a plugin mechanism for the daemon to request
cached images from a plugin. I believe a plugin is appropriate here
because there are several different ways such sharing might work
depending on the lifecycle of the machines being used.

- How I did it
N/A

- How to verify it
N/A

- Description for the changelog
Proposal: plugin mechanism for sharing image cache between machines

- A picture of a cute animal (not mandatory but encouraged)
sant01-10a-penguinonseal

As has been discussed extensively in #20316 , Docker now no longer
restores parent image chains when pulling from a registry, and people
who relied on this to cache builds across multiple machines have had
their build times blow out.

I'm PR'ing this to get some feedback from the maintainers & community on
this idea. If people agree that this is the way forward, I'd like to
start working on this. It doesn't appear to me that the implementation
of the plugin surface would be especially difficult or disruptive - but
I want your thoughts on this :)

EDIT: correct the issue link

@GordonTheTurtle GordonTheTurtle added status/1-design-review dco/no Automatically set by a bot when one of the commits lacks proper signature labels Jul 16, 2016
@thaJeztah thaJeztah added area/distribution kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny labels Jul 16, 2016
@coolljt0725
Contributor

@KJTsanaktsidis I think you linked the wrong issue number in your post; I think it's #20316, not 23016. I corrected it, please fix it if I'm wrong :-)

* People using disposable VM's in development who rebuilt docker images when the VM came up

## Image save/load based solution
Following the discussion about this issue in #23016, #21385 was merged which kept parent references when exporting/importing images with `docker save` & `docker load`. Instead of doing a registry pull prior to a build, this functionality allows saving a built image and loading it onto the next machine to build that image, roughly emulating the behaviour of the previous push/pull workaround. However, this solution has some problems (some of which were also issues with the push/pull workaround) -
Contributor

s/23016/20316

Author

Whoops - yeah that one. People at work joke that I'm allergic to copy/paste...

@KJTsanaktsidis KJTsanaktsidis force-pushed the image_cache_plugin_proposal branch from fa3f086 to 7430401 Compare July 19, 2016 10:05
@GordonTheTurtle GordonTheTurtle removed the dco/no Automatically set by a bot when one of the commits lacks proper signature label Jul 19, 2016
@KJTsanaktsidis KJTsanaktsidis force-pushed the image_cache_plugin_proposal branch 2 times, most recently from 8be38c9 to fcc955a Compare July 19, 2016 10:09
@thaJeztah
Member

ping @tonistiigi @stevvooe PTAL

@tonistiigi
Member

@KJTsanaktsidis https://github.com/tonistiigi/buildcache is an incomplete but completely functional tool to save build cache for docker v1.11+ that doesn't have most of the problems listed in your description of the load/save method. The cache file is small (around 1KB) and doesn't have much overhead.

Regarding your interface example, I'm not sure I understand how this would work. How would a plugin get this data so it can make a decision like this? We do not want to implement an alternative distribution scheme for images. Also, it's impossible for a plugin to find an image based on just the runconfig (unless it only looks at local images, as it does at the moment); if it did that, we would expose the security problem again, as anyone could fill this cache with the same runconfig but a completely different image. The cache can only come from a trusted location for a specific repo.

@stevvooe
Contributor

stevvooe commented Aug 4, 2016

I am generally confused by this proposal. The cache poisoning issue is a real problem. The worst part about it is that it is not at all obvious whether you are affected or protected. It can only be mitigated by limiting the horizon of data that one trusts.

Anything that circumvents that protection, even docker save/load, is going to open your infrastructure up to injection of malicious content. Neither this proposal nor any of the proposals in #20316 addresses this problem. While we all want fast builds (and I really do), introducing cache poisoning into the build step of the infrastructure is not an acceptable trade-off. Could you imagine the impact if someone could just inject a malicious layer into library/ubuntu or library/alpine?

The other aspect to this is the misapplied assumption about the idempotence of shell commands. apt-get update run twice is never guaranteed to have the same result. Ever. That is just not how it works. If you have a build cache that is never purged, you will never update your upstream software. That may or may not be the intent. Even worse, if this build cache gets filled in with remote data, you probably have no visibility into when that command was run.

That said, I recognize that people rely on this behavior and there are environments that can apply some amount of trust to a distributed build cache.

> Docker now no longer restores parent image chains when pulling from a registry, and people who relied on this to cache builds across multiple machines have had their build times blow out.

Why don't we just fix this rather than piling on a bunch of extra code? Furthermore, why don't we leverage the image registry in the process?

Let's look at how we build an image, with FROM alpine at the top:

docker pull mysuperapp:v0
docker build -t mysuperapp:v1 .

In this simple case, we cannot assume that a remote mysuperapp:v0 and the ongoing build are related, since that would possibly introduce the cache poisoning scenario that we need to avoid. However, one may have local registry infrastructure that they know they can trust. While we can infer parentage (despite other assertions, this is still possible), we may not be able to trust that parentage from a build caching perspective for all registries. But, this build environment is special.

What better way than to tell the build process that you can trust a related image?

docker build --cache-from mysuperapp:v0 -t mysuperapp:v1 .

The above would allow Dockerfile commands to be satisfied from the entries of mysuperapp:v0 in the build of mysuperapp:v1. Job done!

No! We still have a problem. Now, my build system has to know tag lineage (`mysuperapp:v0` < `mysuperapp:v1`). Let's modify the meaning of the tagless reference to mean something slightly different:

docker pull -a mysuperapp
docker build --cache-from mysuperapp -t mysuperapp:v1 .

In the above, we pull all the tags from mysuperapp, any layer of which can satisfy the build cache. In practice, this probably is a little wide for most scenarios, so we can allow multiple --cache-from directives on the command line:

docker build --cache-from mysuperapp:v0 --cache-from mysuperapp:v1 -t mysuperapp:v2 .

There are many possibilities here to make this more flexible, such as running a registry service specially for the purpose of build caching (mybuildcache.internal/mysuperapp). Did you know that you can just run a registry and rsync the filesystem around without locking? You can also rsync from multiple sources and merge the result safely (kind of). Such a registry can be purged periodically (or someone could submit a PR to purge old data).

We can take this even further, but I hope the point is brought home. This is probably less convenient than the original behavior, but it is a good trade-off. It leverages the existing infrastructure and has the possibility of being extended as use cases change.

@tonistiigi
Member

@stevvooe v2/schema2 manifests have a single image configuration, so you don't get a chain of configurations on pull/push.

@tonistiigi
Member

Discussed more with @stevvooe about the --cache-from proposal, and I think it should be possible to implement that even without having the parent chain locally, using only the array of history items in image configurations. That would mean a redo of our cache mechanism to not only compare image configurations but compare the Dockerfile commands directly as well. We would always generate a new image, but if we can match a command that changed the rootfs we can directly use the existing layer array instead. Definitely seems worth investigating further and maybe putting together a PoC.
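A rough sketch of that matching idea, with simplified stand-in types (the type names, fields, and function here are made up for illustration; the real daemon structures differ):

```go
package main

import "fmt"

// Hypothetical stand-ins for the real image-config types.
type HistoryEntry struct {
	CreatedBy  string // stringified Dockerfile command
	EmptyLayer bool   // true for metadata-only commands (ENV, CMD, ...)
}

type ImageConfig struct {
	History []HistoryEntry
	DiffIDs []string // layer digests, one per non-empty history entry
}

// matchCacheFrom walks the history of a --cache-from image and returns
// the layer diff IDs that can be reused for a build whose steps so far
// stringify to `commands`. It stops at the first mismatch (cache miss).
func matchCacheFrom(cfg ImageConfig, commands []string) []string {
	var reusable []string
	layerIdx := 0
	for i, cmd := range commands {
		if i >= len(cfg.History) || cfg.History[i].CreatedBy != cmd {
			break // cache miss: fall back to running the command
		}
		if !cfg.History[i].EmptyLayer {
			reusable = append(reusable, cfg.DiffIDs[layerIdx])
			layerIdx++
		}
	}
	return reusable
}

func main() {
	cfg := ImageConfig{
		History: []HistoryEntry{
			{CreatedBy: "/bin/sh -c #(nop) ADD file:abc in /"},
			{CreatedBy: "/bin/sh -c apt-get update"},
			{CreatedBy: `/bin/sh -c #(nop) CMD ["bash"]`, EmptyLayer: true},
		},
		DiffIDs: []string{"sha256:aaa", "sha256:bbb"},
	}
	steps := []string{
		"/bin/sh -c #(nop) ADD file:abc in /",
		"/bin/sh -c apt-get update",
	}
	fmt.Println(matchCacheFrom(cfg, steps)) // [sha256:aaa sha256:bbb]
}
```

The key point is that only the command strings and layer digests are compared; no parent image chain is needed locally.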

ping @aaronlehmann @dmcgowan

@dmcgowan
Member

dmcgowan commented Aug 8, 2016

I agree with the proposed addition of --cache-from; it is simple, and I think the correct solution here would require an explicit flag.

> Now, my build system has to know tag lineage (`mysuperapp:v0` < `mysuperapp:v1`). Let's modify the meaning of the tagless reference to mean something slightly different:

Not sure diverging from the current expected behavior could be done here. The command should be able to pull the cached-from image, and doing the equivalent of pull -a automatically would not be ideal for the unsuspecting user and has the potential to slow down builds much more. I would like to hear from users what the challenges may be in figuring out the build cache tag, since there is a variety of tagging strategies.

In terms of the POC implementation, as already mentioned, this should pull the cached image if it does not exist, to be consistent with other commands. It would also need to do a trusted reference lookup using Docker Content Trust. Being able to do that makes this method much more secure and reliable than trying to resolve the cache in a daemon-side plugin.

@aaronlehmann
Contributor

I like the --cache-from proposal. Definitely in favor of putting together a proof of concept.

I agree with @dmcgowan that pulling all tags as a result of a --cache-from argument may be unexpected behavior that slows down builds more than it helps. It may be better to only pull the latest tag in when no tag is given, and allow users to explicitly use pull -a to warm the build cache when appropriate.

@KJTsanaktsidis
Author

@stevvooe - regarding the idempotency of shell commands, that's already an issue with local build caches. People already have to deal with this by, for example, explicitly specifying versions they want, or running every nth build with --no-cache

@tonistiigi - I definitely agree that having to implement an alternate distribution scheme for images is a bit of a drag, but I also didn't want to turn the registry into something it's not. If we can come to a solution that leverages the registry as-is, then that would definitely involve substantially less code overall (between Docker & any hypothetical plugins) and I'd be all for it.

@tonistiigi - regarding your plan for implementing --cache-from: Can I see if I've understood this correctly?

  • At the moment, Docker finds cached images by finding images whose parent is the previous build step and which share the same runconfig as the next step
  • The registry does not store parent image chains any more, but it does store a list of sha256 hashes of all the filesystem layers built up to create the image which has been tagged in the registry
  • You propose that instead of finding images as part of cache resolution, we try and find layers. To do this, we would compare the parent layer hash to the previous build step, and the Dockerfile command that generated the layer with the currently building Dockerfile.

What I don't understand is how we would get the Dockerfile command from the registry to compare. I poked around at the HTTP response you get from GET /v2/image_name/manifests/latest, and the Dockerfile information is only available in a field called "v1Compatibility", which doesn't seem promising. How would we get that information?

@tonistiigi
Member

@KJTsanaktsidis I guess at least initially we would use both methods so we don't break people. We would only use the new method if --cache-from is set and it points to an image with no parents. The Dockerfile command is listed in the v1Compatibility field if you look at the schema1 compatibility manifest. Since v1.10/registry 2.3 there is a history array in the image configuration that includes the Dockerfile commands. You can pull the manifest with this config using the Accept: application/vnd.docker.distribution.manifest.v2+json header and then fetch the config using the blob endpoints.
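For illustration, here's a rough sketch of pulling the Dockerfile commands out of such an image configuration blob once it has been fetched (the helper name is made up and the JSON is heavily abbreviated; the `created_by`/`empty_layer` field names come from the v1.10+ image config format):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal slice of the v1.10+ image configuration: we only care about
// the history array, which records one entry per Dockerfile command.
type imageConfig struct {
	History []struct {
		CreatedBy  string `json:"created_by"`
		EmptyLayer bool   `json:"empty_layer,omitempty"`
	} `json:"history"`
}

// commandsFromConfig extracts the stringified Dockerfile commands that a
// cache implementation would compare against the current build's steps.
func commandsFromConfig(raw []byte) ([]string, error) {
	var cfg imageConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	cmds := make([]string, 0, len(cfg.History))
	for _, h := range cfg.History {
		cmds = append(cmds, h.CreatedBy)
	}
	return cmds, nil
}

func main() {
	// Abbreviated example config blob, as fetched from the registry's
	// blob endpoint after requesting the schema2 manifest.
	raw := []byte(`{"history":[
		{"created_by":"/bin/sh -c #(nop) ADD file:deadbeef in /"},
		{"created_by":"/bin/sh -c apt-get update"},
		{"created_by":"/bin/sh -c #(nop) CMD [\"bash\"]","empty_layer":true}
	]}`)
	cmds, err := commandsFromConfig(raw)
	if err != nil {
		panic(err)
	}
	for _, c := range cmds {
		fmt.Println(c)
	}
}
```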

@KJTsanaktsidis
Author

Right - I wasn't sure if access to the v1Compatibility field would be frowned upon/deprecated. If that's data we're allowed to access then that all sounds good to me.

I'll try and find some time to update the proposal this week based on this discussion. Thanks a bunch for all your input!

@tonistiigi tonistiigi mentioned this pull request Aug 17, 2016
@KJTsanaktsidis
Author

Hey, just an update- I am still planning to work on this but I didn't manage to squeeze it in before I went away. I'm on holidays now for a couple of weeks but I will keep working on this proposal when I get back. Sorry to break the momentum!

@GordonTheTurtle GordonTheTurtle added the dco/no Automatically set by a bot when one of the commits lacks proper signature label Aug 20, 2016
@KJTsanaktsidis KJTsanaktsidis force-pushed the image_cache_plugin_proposal branch from 5aa0eef to 40eef94 Compare August 20, 2016 09:30
@GordonTheTurtle GordonTheTurtle removed the dco/no Automatically set by a bot when one of the commits lacks proper signature label Aug 20, 2016
@KJTsanaktsidis
Author

@tonistiigi @stevvooe PTAL - I've rewritten the proposal to be a proposal on how to implement --cache-from. I'll try and get a prototype done this week but we'll see how we go.

@KJTsanaktsidis KJTsanaktsidis changed the title Proposal: plugin mechanism for sharing image cache between machines Proposal: using registry images for build caching Aug 20, 2016
@icecrime
Contributor

@KJTsanaktsidis Thanks for the help here! The --cache-from proposal seems really valuable, and feels like something we should really try to get in for 1.13.0. Can you please let us know if you find time working on it? If not, we might ask one of the maintainers to take it over, and we'll ping you for review/feedback.

@KJTsanaktsidis
Author

@icecrime I've been hacking at a prototype this week while I've been away, and have mostly got all the tentacles in the right place and had a good go at the core algorithm too. I'll try and finish it on the plane this weekend and push it up here for review :)

@KJTsanaktsidis
Author

I think I can commit to implementing it and getting it merged though - I'll discuss with work on Monday and confirm this.

}

// Pluck out the newest layer from the potential matches
// Pointer so it can be nil
Member

Instead of newest we should probably use the order specified by the user in cache-from.

@tonistiigi
Member

ping @tianon I remember you had an issue with how we stringify the command and potentially lose information during it. If you still see it as a problem then now seems a good time to address it before we make the build cache rely on this stringification.

@tianon
Member

tianon commented Aug 29, 2016

Yeah! The discussion over in #22436 has some of the details of what I find funky about it. The way "build args" are stringified is definitely a pretty massive issue, since the formatted array we used to output (prefixed with |N followed by N number of array items) is now flattened and thus no longer parses correctly (ie, ["|2", "a=b c=d", "e=f g=h"] now stringifies the same as ["|2", "a=b", "c=d e=f g=h"] -- ie, a build-arg whose name is a and value is b c=d, etc, as one concrete example of where the current processing breaks down).

Additionally, I think it's kind of strange that we still use the # (nop) syntax when we could just re-create the raw Dockerfile instructions as-is. For example, RUN echo hi ought to translate directly to a "comment" for that line as RUN echo hi IMO (and the same for commands which only modify metadata without creating a container, such as CMD ["a", "b"] translating to CMD ["a","b"] or something instead of sh -c # (nop) CMD [a b] or whatever it does currently).

// ImageCacheForBuild represents a stateful image cache that is wound forward
// at each step of the build process, until either there is a cache miss or
// the image is built.
type ImageCacheForBuild interface {
Contributor

Might be best to retain the naming of this interface and then add a ImageCacheBuilder or something.

@stevvooe
Contributor

All in all, this is looking promising.

#24711 (comment) has me concerned. Images should not be created as a result of using --cache-from. The cache items satisfying --cache-from should only be valid for the duration of the build. Using that parameter should not result in any new images being created.

@tonistiigi
Member

@stevvooe Unless we diverge from the "every Dockerfile command creates an image / every command executes and then commits" rule of our builder, we have to create intermediate images because they just don't exist after pull. The problem at the moment is that the cache function itself creates this image, which isn't very clean. Moving this creation to the builder pkg, where it would happen on cache miss, would add quite a lot of complexity though.

@KJTsanaktsidis
Author

KJTsanaktsidis commented Aug 30, 2016

In general I kind of like the idea of not creating images that will serve no actual purpose after the build. Unfortunately, the way the builder is written at the moment kind of depends on the whole "Image + Dockerfile step -> Image" workflow.

I can see an argument that building a Dockerfile should not create an intermediate image at each step. However that change seems pretty orthogonal to how this works, especially once I factor out the relevant code between this and daemon.Commit. Both the build and the cache steps will then be using it, and we could make it do something else later.

@KJTsanaktsidis
Author

Thanks everybody for the really great feedback. I'll come around at this again in the next couple of days with an improved version taking all of the above into account.

@KJTsanaktsidis
Author

@tianon If we plan to make the CreatedBy in the history just the raw Dockerfile command, I can drill that through and check both?

@aaronlehmann
Contributor

> Unless we diverge from the "every Dockerfile command creates an image / every command executes and then commits" rule of our builder, we have to create intermediate images because they just don't exist after pull.

It does seem like a problem because these images would be seen by other builds that don't use --cache-from, unless I'm missing something. Altering the global build cache seems like an unexpected side effect, with potential security implications.

@tonistiigi
Member

@aaronlehmann What do you expect to happen when you use --cache-from and it doesn't match anything? I'd expect the build to behave the same way as it would if I didn't specify --cache-from.

@aaronlehmann
Contributor

That seems fine, but I'm more concerned about the case where it does match something. That would create "fake" images that get matched by other builds that aren't using --cache-from.

@tonistiigi
Member

@aaronlehmann I'm not sure I get why this is different or why these images are "fake". In both cases you get intermediate images. The only difference is whether the layer inside an intermediate image is shared with some other image or not.

@stevvooe
Contributor

@tonistiigi Let's put it this way: does --cache-from change the state of the engine, such that images that weren't previously in the cache are now present? Will subsequent calls to build not require --cache-from to get the same effect?

If we can't avoid this, we'll need to have a separate endpoint that populates the build cache and modifies the entire engine, such that builds after that call will use the new cache entries.

@tonistiigi
Member

> does --cache-from change the state of the engine, such that images that weren't previously in the cache are now present?

Every build creates cache entries for all the Dockerfile commands, independently of whether --cache-from was specified or whether any commands got a cache hit. At the moment we don't seem to be talking about modifying our current build cache methods or removing the concept of the parent chain. Maybe this is also something we want to do in the future, but I see these as totally independent questions. The current build cache method is that any locally built image always matches the cache if --no-cache wasn't specified. --cache-from itself doesn't change any engine state; building a new image does.

@tonistiigi
Member

@KJTsanaktsidis Are you still working on this? It's ok if you want us to carry this PR as we hope to get it into v1.13.

@KJTsanaktsidis
Author

@tonistiigi Unfortunately not - I tried to hassle management at work the other day for a week or so to get this mergeable, but it didn't fly. Heaps of operations issues to deal with at the moment... So I think I'm going to have to hand this over to you. I would like to thank you guys heaps for talking this over with me, I learned heaps about how Docker is put together.

@tonistiigi
Member

opened #26839

@tonistiigi tonistiigi closed this Sep 22, 2016
@icecrime
Contributor

Thank you very much for your help @KJTsanaktsidis! You didn't pick the easiest problem, and you did wonders ;-) Tonis will carry but preserve your original commits so you still get the credit 👍

tonistiigi pushed a commit to tonistiigi/docker that referenced this pull request Sep 23, 2016
This commit adds a proof-of-concept implementation of the --cache-from
feature outlined in moby#24711. The parts that have been implemented so far
include:

* Accepting `--cache-from` as an argument to `docker build`, and passing
  it through to the builder
* Changing the ImageCache interface to be based on a new context per
  build, so that the cache can track build-specific things like "have I
  pulled this `--cache-from` image yet?"
* Added the ability to use layers from registry images specified as
  `--cache-from` to the Daemon (conforming to the ImageCache interface).
  If a layer is found that can be arrived at by the same sequence of
  commands that the current build has run, it can create a new container
  to form part of the build's parent chain using the existing layer and
  without needing to run the command.

Signed-off-by: KJ Tsanaktsidis <kjtsanaktsidis@gmail.com>