
Conversation


@deitch deitch commented Apr 26, 2022

This is an early draft. As discussed with @tonistiigi and @sipsma, this extends source/containerimage/ so it can support "pulling" from 2 distinct places: an actual registry (as now) and an OCI layout. In addition to the docker-image: scheme, it also recognizes an oci-layout: scheme.

It uses the same source/containerimage/ directory, just adding a typed constant so that it can determine whether the source is an OCI layout in a directory or a registry. It then just selects which Pool.Resolver to use.
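The typed-constant idea could be sketched like this (a minimal stand-in: the names `SourceKind`, `SourceRegistry`, `SourceOCILayout`, and `resolverName` are illustrative, not the PR's actual identifiers):

```go
package main

import "fmt"

// SourceKind distinguishes where the image content comes from.
type SourceKind int

const (
	SourceRegistry  SourceKind = iota // pull from a registry, as today
	SourceOCILayout                   // read from an OCI layout directory
)

// resolverName stands in for the "select which resolver to use" switch.
func resolverName(k SourceKind) string {
	switch k {
	case SourceOCILayout:
		return "oci-layout-resolver"
	default:
		return "registry-resolver"
	}
}

func main() {
	fmt.Println(resolverName(SourceOCILayout))
}
```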

Push() is disabled.

Some things I haven't yet addressed:

  • buildkit often runs inside a container, so how is the provided directory mapped? This already had to be dealt with for the local: scheme, so I will look there as well.
  • it is a little "conflicted" in that it expects a @sha256:<hash> but also knows how to read the OCI layout index.json. I will need to pick between supporting image names with optional hashes, or requiring hashes and eliminating name resolution.

At this point, just ready for overall directional input.

case SourceRegistry:
	p.Puller.Resolver = resolver.DefaultPool.GetRegistryResolver(p.RegistryHosts, p.Ref, "pull", p.SessionManager, g).WithImageStore(p.ImageStore, p.Mode)
case SourceOCILayout:
	p.Puller.Resolver = resolver.DefaultPool.GetOCILayoutResolver(p.layoutPath, p.Ref, "pull", p.SessionManager, g).WithImageStore(p.ImageStore, p.Mode)
Member:

There is no pooling concept for this OCI resolver. It should directly implement an object that implements Resolver with the Fetcher method. No other methods or inputs are needed.

Contributor Author:

Even better. That will make the whole GetOCILayoutResolver concept much simpler. I will update.

Contributor Author:

Done. Completely in parallel here with the source. We can move it to a util, if you want.

SessionManager: sm,
vtx: vtx,
}
case SourceOCILayout:
Member:

Why is this duplicated? Looks very similar.

Contributor Author:

It is, but some of the parameters are different. I planned to make it more concise, but didn't worry about that for a first review of the general direction.

Contributor Author:

I managed to get it a bit more DRY.

}

func NewOCIIdentifier(str string) (*OCIIdentifier, error) {
ref, err := reference.Parse(str)
Member:

This input isn't necessarily a reference. It should be a name that the client exposes and that can be pulled in.

Contributor Author:

Yeah, that is a good point. It should be a path to somewhere, like oci-layout://var/layouts/setup

Contributor Author:

Removed.

@@ -6,4 +6,5 @@ const (
LocalScheme = "local"
HTTPScheme = "http"
HTTPSScheme = "https"
OCIScheme = "oci"
Member:

oci-layout might be better. docker-image can also be OCI mediatype image.

Member:

We already have --output type=oci, so oci might be fine.

Or we might want to consider renaming --output type=oci to --output type=oci-layout too.

Member:

Or oci-archive maybe

Member:

Atm. this string isn't even visible to the user so some verbosity should be fine in here.

Contributor Author:

--output type=oci gives an OCI tarball. It is the same format, except in a tarball. Which leads to the question of whether we should support tarball input as well as directory (and maybe directory layout in addition to tarball).

It is much easier to do tarball, because it is a stream, no need to worry about access to local directories.

I don't think they all need to be solved now, but we should get the nomenclature correct so we have options in the future without breaking things.

We can keep just a single scheme/type oci (or oci-layout, if we prefer) for all of it, with the below:

  • source directory: oci://path/to/layout/dir
  • source tar file: oci://path/to/tarball.tar (future, would depend on if it is a file of type tar or not)
  • source tar stream: oci:// or oci://- or even oci://stdin
  • output directory: --output type=oci,dest=path/to/layout/dir
  • output tar file: --output type=oci,dest=path/to/output.tar
  • output stdout: --output type=oci

The only really weak point there is output directory, as it might not exist, so how do we know if they want a tar file or a layout directory? We could say: if dest ends in .tar, we will write a tar file, else a directory; and if you really want to write a tar file that does not end in .tar, write to stdout and save the output as you wish.
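The dest heuristic described above could be sketched as a small helper (hypothetical, not part of the PR; the return strings are just labels for the three cases):

```go
package main

import (
	"fmt"
	"strings"
)

// decideOCIOutput sketches the naming heuristic under discussion: an empty
// dest means "stream a tarball to stdout", a dest ending in .tar means
// "write a tar file", and anything else means "write a layout directory".
func decideOCIOutput(dest string) string {
	switch {
	case dest == "":
		return "tar-stream"
	case strings.HasSuffix(dest, ".tar"):
		return "tar-file"
	default:
		return "layout-dir"
	}
}

func main() {
	fmt.Println(decideOCIOutput("path/to/output.tar"))
	fmt.Println(decideOCIOutput("path/to/layout/dir"))
}
```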

@@ -27,7 +27,7 @@ import (

type Puller struct {
ContentStore content.Store
Resolver *resolver.Resolver
Resolver resolver.Resolver
Member:

Might need to change it to remotes.Resolver. Or if image requires resolver.Resolver (not sure atm where) then can add remotes.Resolver as a second option and make oci-layout use that.

Contributor Author:

As an interim step, I created an interface in the same package that both the util/resolver/pool.Resolver struct and the source/containerimage/OCILayoutResolver struct satisfy. It should not be final, but it lets us focus this in one place (and out of the pool resolver package), from where we can clean it up.

The only difference between what our implementations of Resolver do and remotes.Resolver is:

type Resolver interface {
	remotes.Resolver
	HostsFunc(host string) ([]docker.RegistryHost, error)
	// WithSession returns a new resolver that works with new session group
	WithSession(s session.Group) Resolver

	// WithImageStore returns new resolver that can also resolve from local images store
	WithImageStore(is images.Store, mode source.ResolveMode) Resolver
}

I have not figured out yet where/how that gets used. If we can get around it, this whole thing does simplify to remotes.Resolver.

Contributor Author:

And it doesn't quite work, because those funcs return *util/resolver.Resolver (a pointer to a struct) and not util/pull.Resolver (an interface). It satisfies the interface requirements, but that doesn't make the signatures identical.

Contributor Author:

I have not figured out yet where/how that gets used. If we can get around it, this whole thing does simplify to remotes.Resolver.

So:

  • HostsFunc is not used externally, so that is easy
  • WithImageStore is only ever used by something that knows it is calling resolver.Resolver or resolver.OCILayoutResolver, so that is easy
  • WithSession is the one that gets called from within util/pull.

I still had to change it a bit, including util/resolver, but minimally to comply.

Member:

WithSession is used for attaching authentication. You do not need it, as you are not providing credentials for accessing an OCI layout.

Contributor Author:

That should make it easier. But it is called here, from within util/pull/pull.go, as part of PullManifests().

Actually, it is used inside a func as a property of PulledManifests:

	return &PulledManifests{
		Ref:              p.ref,
		MainManifestDesc: p.desc,
		ConfigDesc:       p.configDesc,
		Nonlayers:        p.nonlayers,
		Descriptors:      p.layers,
		Provider: func(g session.Group) content.Provider {
			return &provider{puller: p, resolver: p.Resolver.WithSession(g)}
		},
	}, nil

I cannot quite figure out where that is consumed. Either way, is it not easier to just have a WithSession() that the remote registry puller cares about and the local one ignores?

Member (@tonistiigi, Apr 28, 2022):

You should be able to get the provider directly from content. You might need to pass in "registry-mode" bool or smth like this so you know which type to use.

Contributor Author:

get the provider directly from content

I don't understand.

Member:

When you have an implementation that implements Resolver (or Fetcher or Provider) based on client-side contents, that is all you need to get a content.Provider out of it.

There is an additional question here about whether we want the lazy pull behavior for these layers. For registry images we avoid pulling the layers until the content is really needed (this is why there is this callback here). For the OCI case I think the lazy behavior doesn't quite make sense and may cause more harm than benefit. Still, I think this pullmanifest may remain as is, with just a different content.Provider callback implementation, but we might want to just call Unlazy() for this reference directly before returning from Snapshot() @sipsma

Contributor Author:

@tonistiigi I think I get what you are after. Take a look at the latest. I made Puller reference just a remotes.Resolver:

type Puller struct {
	ContentStore content.Store
	Resolver     remotes.Resolver
	Src          reference.Spec
...
}

and then PullManifests() takes an argument for getting a resolver, passing it a session. This allows us to keep it all within containerimage/pull.go, which knows whether it is OCI or registry, and therefore can call WithSession() for registry or ignore it for OCI.

Still a bit convoluted, but much cleaner than before.

Is this what you were suggesting?
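The factory-argument shape described above could be sketched like this (all names here are hypothetical stand-ins, not buildkit's actual types; the point is that the caller decides whether the session matters):

```go
package main

import "fmt"

type sessionGroup struct{ id string }

type resolver interface{ name() string }

// registryResolver attaches the session (auth), mirroring WithSession.
type registryResolver struct{ session string }

func (r registryResolver) name() string { return "registry(session=" + r.session + ")" }

// ociLayoutResolver has no credentials to attach, so sessions are irrelevant.
type ociLayoutResolver struct{}

func (ociLayoutResolver) name() string { return "oci-layout" }

// pullManifests stands in for PullManifests: it asks the callback for a
// resolver only once it has the session group in hand.
func pullManifests(getResolver func(g sessionGroup) resolver) string {
	g := sessionGroup{id: "g1"}
	return getResolver(g).name()
}

func main() {
	// Registry path: the callback applies the session.
	fmt.Println(pullManifests(func(g sessionGroup) resolver {
		return registryResolver{session: g.id}
	}))
	// OCI-layout path: the callback simply ignores the session group.
	fmt.Println(pullManifests(func(sessionGroup) resolver {
		return ociLayoutResolver{}
	}))
}
```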

@@ -2,23 +2,36 @@ package resolver

import (
"context"
"encoding/json"
Member:

Do not make changes to this file. This package is for registry connection pooling. It doesn't have anything to do with OCI layout parsing.

Contributor Author:

Sure, once you made it clear above that pooling is only for that, no big deal.

@@ -115,8 +135,33 @@ func newResolver(hosts docker.RegistryHosts, handler *authHandlerNS, sm *session
return r
}

func newOCILayoutResolver(layoutPath string, sm *session.Manager, g session.Group) *OCILayoutResolver {
fileStore, err := local.NewStore(layoutPath)
Member:

No idea what this is supposed to be. I can't understand what is passed in here and the PR does not seem to even build. You need to pull in the blobs from the client side through a session API that is exposed by the client. Not allowed to read any files from host disk based on user input. Also local.NewStore is not OCI layout but containerd contentstore boltdb layout.

Contributor Author:

the PR does not seem to even build

I didn't expect it to. This is just for design review, to see that we are moving in the right direction and take course corrections.

No idea what this is supposed to be. I can't understand what is passed in here

I largely copied what the existing resolver from the registry does. Instead of a ref, I passed where the expected oci-layout directory is.

Also local.NewStore is not OCI layout but containerd contentstore boltdb layout.

I know. But it happens to be oci-layout compliant except for the stuff that goes in boltdb (image to hash mapping, etc.). This way, we inherit the reading of hashes from containerd's library. I could write it from scratch, but why bother? It does all of the reading, validation, etc. And buildkit already imports containerd packages.

You need to pull in the blobs from the client side through a session API that is exposed by the client. Not allowed to read any files from host disk based on user input.

This is precisely what I punted on (see the PR opening comment). I wasn't sure how we do that, as buildkit runs potentially on a different host than the client.

How does local source do that? Does the client recognize the option, mount the directory, and stream it to buildkitd? Can you share a link to consuming that session API?

Member:

Client exposes service on session with s.Allow https://github.com/moby/buildkit/blob/master/client/solve.go#L127 . Daemon side gets the caller for it https://github.com/moby/buildkit/blob/master/source/local/local.go#L98-L101 .

For the contentstore this already exists for local cache import https://github.com/moby/buildkit/blob/master/client/solve.go#L161 so basically you need to attach your contentstore to this same map. Then pass the sessionID and digest with the LLB definition so the daemon side can provide it as input when accessing the session service.

Contributor Author:

Not sure I get this, let's try.

  • when the client sets up grpc to the daemon, the internal structure that represents that communication is a session
  • the client "allows" various client-side services to be accessed by the daemon via s.Allow(service), where the filesystem sync service (aka "get things from client filesystem") is "allowed" here
  • anything daemon side that wants to consume one of those services, e.g. local or our new OCI needing access to client-side filesystems, needs to get a caller for that session and service, e.g. local gets the specific session based on ID here, and then calls ls.snapshot() using that session here, which appears to do all of the syncing up.

A lot still isn't clear to me:

  • why do we have both this s.Allow() and this one? One uses a specific map, another syncedDirs, which seems to come from opt.LocalDirs. Is one a general list of dirs mapped in (like docker run -v ...) and the other ones that come from options (like --build-context= or --cache-export)?
  • somewhere, it must parse the options (e.g. --build-context), find which are local (right now just filesystem path, but also oci) and add them to the cacheOpt.contentStores; where does it do that, so I can add the build-context oci://?
  • once this is in place, then the OCIResolver is just going to need to handle the session and resolve content from there? Does the filesync just allow daemon-side processes use session to list files, walk directories, and read content via the session almost as if it was local?

Contributor Author:

I think I got some of the daemon-side. I haven't figured out how to read the data, but the setting up of the session I just took from local and adapted.

I still need to figure out how to get an io.ReadCloser for the file I want from the session. It looks like it has to do with filesync.FSSync() like here for local? Not sure.

If so, that looks like it syncs everything from the remote to the local in some tmp dir. Is there any way to ask it, "here is a path, stream me the file contents"? Is there an example of that?

Contributor Author:

Can you help explain the above @tonistiigi ? I hit another wall.

Member:

Do I just need to create a content.Store that accesses the local filesystem path and pass it?

yes

I don't see documented anywhere how the docker buildx concept of build-context translates into something that buildkit understands.

It's in Dockerfile frontend. Read #2827 (comment)

what does the --build-context foo=oci-layout://... get translated to in terms of buildkit options? What does it look like?

Client converts the local directory to a contentstore (and finds the target digest). Add OCIArchives map[string]content.Store to https://pkg.go.dev/github.com/moby/buildkit@v0.10.2/client#SolveOpt . In FrontendOpts/FrontendAttr buildx would set context:foo=oci-layout://contentstoreID/image-digest. Handler for the FrontendOpts in Dockerfile should be in a later PR (and --build-context is in buildx). Only LLB support for now.

where does that get parsed, such that I can extract the path and pass the local content store into contentStores?

Only buildx implements this flag. If we want buildctl support then the flag for that would be something like --oci-archive digest=./path/to/store, similar to how buildctl exposes --local. But for me just Go LLB library support for this is ok for now.

where does the image name side of it

No idea what you mean in here. Maybe answered above.

how does it translate the image name into the root manifest? An OCI layout can have many manifests.

This is for the client side to decide when it parses --build-context value. For the daemon, the only inputs are contentStoreID and digest. Multiple digests could come from same contentstore.
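The daemon-side input described above (just a contentStoreID and a digest) implies a small amount of parsing on receipt. A hypothetical sketch, assuming the `oci-layout://<contentStoreID>@<digest>` form that later parts of this thread use (the exact separator was still in flux at this point; `splitOCILayoutRef` is an illustrative name, not the PR's code):

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitOCILayoutRef splits "oci-layout://<contentStoreID>@<digest>" into the
// content-store key and the digest of the root object within that store.
func splitOCILayoutRef(ref string) (storeID, dgst string, err error) {
	const prefix = "oci-layout://"
	if !strings.HasPrefix(ref, prefix) {
		return "", "", errors.New("not an oci-layout reference: " + ref)
	}
	rest := strings.TrimPrefix(ref, prefix)
	var ok bool
	storeID, dgst, ok = strings.Cut(rest, "@")
	if !ok || storeID == "" || dgst == "" {
		return "", "", errors.New("expected <contentStoreID>@<digest> in: " + ref)
	}
	return storeID, dgst, nil
}

func main() {
	id, d, err := splitOCILayoutRef("oci-layout://mystore@sha256:d548508b")
	if err != nil {
		panic(err)
	}
	fmt.Println(id, d)
}
```

Multiple digests pointing into the same store ID then fall out naturally: the store ID selects the client-side content.Store, and each digest selects a root object within it.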

Contributor Author:

Going to give this a shot @tonistiigi ; tell me what I got right and wrong.

  • the daemon is responsible for the build itself. It doesn't understand dockerfile or other frontends; it gets an LLB via a SolveRequest and processes that.
  • the client is responsible for parsing the frontend (e.g. Dockerfile) into an LLB, creating that SolveRequest and sending it to the daemon (as well as awaiting status updates and opening the channel to send requested info by the daemon).

The process looks like:

  1. client starts a build with Dockerfile (or other frontend) and context(s) and options
  2. client (built-in or via some other frontend processor) processes the frontend, getting to an LLB, which includes some sources
  3. sends the SolveRequest to the daemon, which includes each source's, well, "source"

In practical terms, this means that if I have the following:

FROM golang:latest as builder
RUN something
FROM alpine:3.15
COPY --from=builder /else /here

Determining that builder is based on the image golang:latest happens on the client end; buildkitd doesn't know or understand any of that. The client, using a frontend processor, determines "I need to resolve golang:latest and alpine:3.15" and figures out how to resolve each of those. It just passes each of those as a source. In the case of an image, that source will be a registry reference; in the case of local, it will give it some reference that says, "go request this thing over the channel I opened with you."

Translating all of that to OCI, the client:

  1. calls the frontend processor (unchanged)
  2. sees that one of the sources is golang:latest (unchanged)
  3. knows (based on its CLI flags or whatever) that golang:latest should be converted to a hash in a local content.Store (new)
  4. sends the LLB to the daemon, along with the source for that particular element as "go request this reference thing from the channel I opened with you."

For that to work:

  • the daemon needs to know that it has a new kind of source and how to get it (similar to local: it just reaches back on the channel to the client; similar to registry: everything is hashes and manifests)
  • the client needs to know, when its frontend processor has an image that matches some indication (i.e. new context given), that it should find the local directory, open a local content.Store and make it available via the channel

Everything we did in this PR to date was about the first half of that: getting the daemon to recognize a new source reference that the client will pass it in its LLB; everything in the follow-on here is about getting the client to know how to find those source and create the correct reference to pass to the daemon.

If that is correct, let me know and I will try to interpret your above comments in that light.

Member:

Clients sends a Solve or Build request to build using a Dockerfile frontend. Dockerfile frontend runs inside a container, parses the Dockerfile, generates LLB and issues LLB Solve requests.

Contributor Author:

Buildkitd runs in one container; frontend can be in another. So client issues solve requests to frontend container, gets response as a whole bunch of LLB Solve requests, and then passes those to buildkitd?


deitch commented Apr 27, 2022

Interim update:

  • need decision on the scheme name
  • need some direction on sourcing a path from the client via session API (similar to local source)
  • does client via session API require changes to the client CLI part?
  • waiting for feedback on the above issues inline
  • other comments addressed

@deitch deitch force-pushed the oci-context branch 3 times, most recently from c2eb2d2 to b2289ea Compare May 3, 2022 09:18

deitch commented May 12, 2022

@tonistiigi definitely getting closer, with a nice assist from @giggsoff, who made some really strong suggestions. I updated this branch to show it. Take a look.

We now have the s.Allow() passing the oci layout map[string]content.Store, and we have it loading those stores.

But I still don't get where some of this is passed to the daemon-side builder, and how --opt context:foo=oci-layout:/some/path/to/layout@sha256:d548508ba9e090d5685a02cfde85cb4915604bd18de65937cae962bbc15ffa96 gets processed. It looks like it has to do with contextByName(), which is part of the dockerfile frontend.

That would mean that it is the frontend that handles these aliasing conversions ("I got foo, so I give the builder something that says, 'take it from the client with this identifier oci-layout://some/path/to/layout@sha256:d548508ba9e090d5685a02cfde85cb4915604bd18de65937cae962bbc15ffa96'").

Is it that identifier that gets passed from client to buildkitd, and that buildkitd then passes back to the client on the channel?

For the daemon, the only inputs are contentStoreID and digest. Multiple digests could come from same contentstore.

I didn't see an actual contentStoreID anywhere.

There is some glue here still missing.

@@ -452,6 +452,45 @@ func Differ(t DiffType, required bool) LocalOption {
})
}

func OCILayout(name string, opts ...OCILayoutOption) State {
Member:

This signature is wrong #2827 (comment)

Contributor Author:

I'll check.

Contributor Author:

Is it? I looked at your comment above; you suggested:

func OCILayout(contentStoreID, digest string, opts ...OCILayoutOptions)

But in the end, all it does is return a call to NewSource, whose signature is:

func NewSource(id string, attrs map[string]string, c Constraints) *SourceOp {

So it is just going to combine the contentStoreID and hash anyway, isn't it? What do I gain by:

func OCILayout(contentStoreID, digest string, opts ...OCILayoutOptions)
...
	source := NewSource("oci-layout://"+contentStoreID+"@"+digest, attrs, gi.Constraints)

Over just:

func OCILayout(ref string, opts ...OCILayoutOptions)
...
	source := NewSource("oci-layout://"+ref, attrs, gi.Constraints)

?

client/solve.go Outdated
@@ -178,6 +179,11 @@ func (c *Client) solve(ctx context.Context, def *llb.Definition, runGateway runG
opt.FrontendAttrs[k] = v
}

// if OCILayout content.Store options are defined, make them available to the builder
if len(opt.OCILayouts) > 0 {
s.Allow(sessioncontent.NewAttachable(opt.OCILayouts))
Member:

This needs to be connected with the store map used for local cache. Otherwise you are registering the same service twice.

Contributor Author (@deitch, May 13, 2022):

Ah, does s.Allow() only allow one service of each type?

Contributor Author:

Think I got that. Have a look.

)

// ParseOCILayouts parses --oci-layout
func ParseOCILayouts(layoutDirs []string) (map[string]content.Store, error) {
Member:

Might be better to skip this for now and concentrate on LLB tests. This value can't be just a dir name anyway. You need to find a specific image and its digest from that dir.

Contributor Author:

Done and removed. Now I wonder, we added opt.OCILayouts as you suggested, but how do we then populate it?

Contributor Author:

OK, completely ignored the --oci-layout option, focused on the --opt context:foo= path for now.

@@ -874,6 +874,13 @@ func contextByName(ctx context.Context, c client.Client, name string, platform *
st = &httpst
}
return st, nil, nil, nil
case "oci-layout":
ref := strings.TrimPrefix(vv[1], "//")
st := llb.OCILayout(ref,
Member:

No, this is not ref. ContentStoreID and digest.

Contributor Author:

I don't get that. What do you mean by contentstoreID?

From the end user perspective, all I do is pass "alias image foo to an oci layout at path x with hash abc123", using opt context.

Internally, the client passes this to the frontend, which translates "foo" to a reference to an OCI layout path and hash (the root manifest for that image).

Where does this become an ID? How is it referenced, both in passing to remote components (frontend and buildkitd), and when buildkitd passes it back up the channel to the client?

Member:

contentstoreID is the key in the map[string]content.Store that is exposed on the client side. Unique key that is used in the session request https://pkg.go.dev/github.com/moby/buildkit@v0.10.3/session/content#NewCallerStore. Client side session handler receives the request and by ContentStoreID finds the correct content.Store and by digest the correct root object from that content.Store.

Contributor Author:

So when client side sees, eg, 3 opts with context, each pointing to a different source, it gives each one some unique ID, which is passed on to the builder.

Is there an example of that being done? I figure that local must have it?

Member:

Contributor Author:

Oh ok, now I see it. So the contentStoreID in the local case is just local:/path/to/store. This is what gets passed to the frontend to use in place of whatever image it aliases (e.g. foo), and what is sent to buildkitd, and in turn is then sent back to the client via the channel.

In the local case, a directory is enough, as it is just one thing: local: (this is a local path) and <dir> (where to get it).
In the oci-layout case, we would need to indicate oci-layout: (this is an OCI layout), <dir> (where to get it), and <hash> (what to extract from that layout). Roughly similar in concept to docker-image: followed by registry (where to get it) and ref (which to pull).

Contributor Author:

Not sure I got it entirely. See above with the signature.

@deitch deitch force-pushed the oci-context branch 7 times, most recently from 7bd1c29 to 190d329 Compare May 18, 2022 23:38

deitch commented May 18, 2022

Hey @tonistiigi made some real progress here. I now am able to give it an --opt context:foo=oci-layout:///some/path@sha256:<hash> and it actually parses it correctly, creates the content.Store, goes to the client to request it, etc.

Then it gets stuck.

Client-side output, until I hit Ctrl-C:

[+] Building 70.6s (2/3)
 => [internal] load build definition from Dockerfile                       0.0s
 => => transferring dockerfile: 70B                                        0.0s
 => [internal] load .dockerignore                                          0.0s
 => => transferring context: 2B                                            0.0s
 => [context foo] OCI load from client                                    70.5s
 => => resolve oci-layout/dummy@sha256:61113c18488f3e4bc28b0ad4674ab39b8  70.5s
received SIGINT, stopping process (will not forward signal)

Daemon-side output:

DEBU[2022-05-18T23:36:47Z] session started
DEBU[2022-05-18T23:36:47Z] reusing ref for local: mgkpeas7x1kz6n6poyabke7bw  span="[internal] load build definition from Dockerfile"
DEBU[2022-05-18T23:36:47Z] reusing ref for local: ge6vpz70whdv9akg7i38dzs5s  span="[internal] load .dockerignore"
DEBU[2022-05-18T23:36:47Z] diffcopy took: 12.688042ms                    span="[internal] load .dockerignore"
DEBU[2022-05-18T23:36:47Z] diffcopy took: 17.947666ms                    span="[internal] load build definition from Dockerfile"
DEBU[2022-05-18T23:37:11Z] fetch                                         digest="sha256:61113c18488f3e4bc28b0ad4674ab39b8fce930e6fb2fe9a1a1f8f09aa2b3c6d" mediatype= size=1052
WARN[2022-05-18T23:37:11Z] reference for unknown type:                   digest="sha256:61113c18488f3e4bc28b0ad4674ab39b8fce930e6fb2fe9a1a1f8f09aa2b3c6d" mediatype= size=1052
DEBU[2022-05-18T23:37:22Z] fetch                                         digest="sha256:61113c18488f3e4bc28b0ad4674ab39b8fce930e6fb2fe9a1a1f8f09aa2b3c6d" mediatype= size=1052
WARN[2022-05-18T23:37:22Z] reference for unknown type:                   digest="sha256:61113c18488f3e4bc28b0ad4674ab39b8fce930e6fb2fe9a1a1f8f09aa2b3c6d" mediatype= size=1052
DEBU[2022-05-18T23:37:34Z] fetch                                         digest="sha256:61113c18488f3e4bc28b0ad4674ab39b8fce930e6fb2fe9a1a1f8f09aa2b3c6d" mediatype= size=1052
WARN[2022-05-18T23:37:34Z] reference for unknown type:                   digest="sha256:61113c18488f3e4bc28b0ad4674ab39b8fce930e6fb2fe9a1a1f8f09aa2b3c6d" mediatype= size=1052
DEBU[2022-05-18T23:37:48Z] fetch                                         digest="sha256:61113c18488f3e4bc28b0ad4674ab39b8fce930e6fb2fe9a1a1f8f09aa2b3c6d" mediatype= size=1052
WARN[2022-05-18T23:37:48Z] reference for unknown type:                   digest="sha256:61113c18488f3e4bc28b0ad4674ab39b8fce930e6fb2fe9a1a1f8f09aa2b3c6d" mediatype= size=1052
ERRO[2022-05-18T23:37:58Z] /moby.buildkit.v1.Control/Solve returned error: rpc error: code = DeadlineExceeded desc = failed to load cache key: no active session for jzfxz2gbk3wc9sdx1obhhtek3: context deadline exceeded
DEBU[2022-05-18T23:37:59Z] session finished: <nil>

It keeps trying to fetch the same one and timing out.

I am unsure if the issue is how I created fetchWithSession in source/containerimage/ocilayout.go, or how I created the client-side store that awaits the read request in client/solve.go.

Some pointers?
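For reference, the index.json at the root of an OCI layout (per the OCI image-layout spec) is an ordinary OCI image index, and the empty mediatype= in the daemon logs above suggests the descriptor being fetched lacks a MediaType. A minimal, standalone sketch of decoding the manifest descriptors out of an index.json, independent of buildkit's actual implementation:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// descriptor and index mirror the fields of the OCI image index that a
// layout's index.json carries (per the OCI image-layout spec).
type descriptor struct {
	MediaType string `json:"mediaType"`
	Digest    string `json:"digest"`
	Size      int64  `json:"size"`
}

type index struct {
	SchemaVersion int          `json:"schemaVersion"`
	Manifests     []descriptor `json:"manifests"`
}

// parseIndex decodes the contents of <layout-dir>/index.json.
func parseIndex(raw []byte) (index, error) {
	var idx index
	err := json.Unmarshal(raw, &idx)
	return idx, err
}

func main() {
	// Example index.json contents; the digest is a placeholder.
	raw := []byte(`{
	  "schemaVersion": 2,
	  "manifests": [{
	    "mediaType": "application/vnd.oci.image.manifest.v1+json",
	    "digest": "sha256:0000000000000000000000000000000000000000000000000000000000000000",
	    "size": 1052
	  }]
	}`)
	idx, err := parseIndex(raw)
	if err != nil {
		panic(err)
	}
	for _, m := range idx.Manifests {
		// A descriptor with an empty MediaType here would reproduce the
		// "reference for unknown type" warning shown in the logs above.
		fmt.Println(m.MediaType, m.Digest)
	}
}
```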

@deitch
Contributor Author

deitch commented May 18, 2022

OK, some progress...

@deitch deitch force-pushed the oci-context branch 3 times, most recently from c556a5b to 05acc28 Compare May 19, 2022 00:20
@deitch
Contributor Author

deitch commented May 19, 2022

I actually think this might be working!

@deitch deitch marked this pull request as ready for review May 19, 2022 00:20
@deitch
Contributor Author

deitch commented May 25, 2022

@tonistiigi it appears to be working. I moved this from "Draft" to regular PR status. Can you review?

@tonistiigi
Member

I should have time tomorrow to review, but if you think it is functional then please start looking into tests. We will need integration tests in the client package, and because you do seem to have changed the Dockerfile in this PR as well, Dockerfile tests are also needed. As I said before, I'm fine with leaving the Dockerfile part out of the first PR.

@deitch
Contributor Author

deitch commented May 26, 2022

I think I got the lint errors now.

@deitch
Contributor Author

deitch commented May 26, 2022

please start looking into tests

Makes sense. Do we have existing ones that I can extend or duplicate, as relevant?

We will need integration tests in client package

Can you point me to existing ones that I can extend?

and because you do seem to have changed the Dockerfile also in this PR then Dockerfile tests are also needed.

By "dockerfile", do you mean the dockerfile frontend processor? I am happy to add those as well, please point me in the direction of existing ones I can extend/duplicate-and-modify?

@deitch
Contributor Author

deitch commented May 26, 2022

Hmm, I do not understand what failed in the CI. It looks like it has something to do with how it is pulling something from GHA cache?

@tonistiigi
Member

I can do that, but that just gives the entire stack trace on the client side.

No, there is a trace from buildkitd as well as bin/dockerfile-frontend in there.

@deitch
Contributor Author

deitch commented Jun 11, 2022

OK, down to 3 tests failing. Slowly, slowly.

@deitch
Contributor Author

deitch commented Jun 11, 2022

Found that one too. Ready to kick off CI again.

@deitch
Contributor Author

deitch commented Jun 11, 2022

Whoa! All CI passed! 🥳

@deitch
Contributor Author

deitch commented Jun 14, 2022

@tonistiigi where do we stand with this? Does it need anything else?

-if p.id.RecordType != "" && current.GetRecordType() == "" {
-	if err := current.SetRecordType(p.id.RecordType); err != nil {
+if p.RecordType != "" && current.GetRecordType() == "" {
+	if err := current.SetRecordType(p.RecordType); err != nil {
 		return nil, err
 	}
 }
Member

@tonistiigi tonistiigi Jun 15, 2022

@sipsma PTAL #2827 (comment)

@deitch Not a blocker for this PR. We can do this in follow-up.

@tonistiigi tonistiigi requested review from AkihiroSuda and sipsma June 15, 2022 05:14
@deitch deitch force-pushed the oci-context branch 3 times, most recently from d8a901d to 603f10d Compare June 15, 2022 08:31
@deitch
Contributor Author

deitch commented Jun 15, 2022

Back at you @tonistiigi

@deitch
Contributor Author

deitch commented Jun 15, 2022

Well, well. CI still is green (if I sound surprised, I am; I was sure I would break something!) 😄

@deitch
Contributor Author

deitch commented Jun 15, 2022

Updated the last thing you requested, @tonistiigi; letting CI run and see.

Signed-off-by: Avi Deitcher <avi@deitcher.net>
@tonistiigi tonistiigi merged commit b1720cf into moby:master Jun 16, 2022
"github.com/pkg/errors"
)

// ParseOCILayout parses --oci-layout
Member

Could you update README.md to explain the usage?

Member

CC @deitch 👀

Sorry to bump this PR, but was trying to use this earlier and struggled to work out the right flags 🎉

Contributor Author

Don't be sorry @jedevc; if it needs addressing, it should be raised. FWIW, I wrote this PR, and I still need to look back to recall what flags should be used where.

@AkihiroSuda I am fine opening a PR to update the README, but I am not sure where to put it. I don't see a section in the README that shows "here is where you add source contexts".

I also (despite having written this) don't remember the magical invocation. I think it was something like this. For Dockerfile:

FROM akihiro/abc:123
buildctl build --opt context:foo=oci-layout:///dir/to/layout

but something doesn't look right to me. Something is missing about mapping akihiro/abc:123 to context foo, so that foo can map to the oci-layout. Same thing if I were to replace it with docker-image://library/alpine:3.16 or something else.

Help me remember how to do this, and where it goes, and I am happy to add it to the README.

Member

I am not sure where to put it

Doesn't look like we have anywhere for it atm, ideally we'd have a new section on "contexts" maybe right before Outputs? (so we'd cover local, docker-container, oci-layout, git, etc - I'm happy to take a look at the others if you could do the oci-layout).

Help me remember how to do this

No idea 😄 I'm at a bit of a loss.

I imagine something like this to work: --opt context:foo=oci-layout://path/to/oci@sha256:6f8bc6b53cc1f29d3ae7542bf7935762ccdff29ab8866e5d3fd85f6b190748f5 - but this gives:

error: failed to solve: unable to get info about digest: Unimplemented: unknown service containerd.services.content.v1.Content: unknown

I imagine this isn't working because it doesn't look like solveOpt.OCIStores is even populated: the --oci-layout option isn't parsed (and --oci-layout is never even registered as a valid option). I'm kinda confused; it seems like something is definitely broken here (I'm not sure if it ever worked with just buildctl; it only seems to have been tested using buildx, which has the right oci-layout parsing logic).

Contributor Author

I'm happy to take a look at the others if you could do the oci-layout).

I don't mind doing that. Why don't you get the others in, and then I can add the oci-layout one, maybe on your PR?

No idea 😄 I'm at a bit of a loss.

Haha! That makes both of us.

since the --oci-layout option isn't parsed (but --oci-layout is also never even created as a valid option).

I don't think it ever was; wasn't it just for --opt and never a full-fledged --oci-layout? Or maybe it was? @tonistiigi walked me through the implementation step-by-step, as I really struggled to get my head around it.

The docker buildx extensions have a straight mapping of "image name" -> context, e.g. here, so I can do:

FROM akihiro/abc:123

and then

docker buildx build --build-context akihiro/abc:123=oci-layout:///path/to/layout@sha256:abcdefg11111222

But as I recall, the buildctl part was always two steps:

  1. Map the image name or AS alias in the dockerfile to a named context, e.g. akihiro/abc:123 --> foo
  2. Map the named context to a source, e.g. foo ---> oci-layout:///path/to/layout@sha256:abcdefg11111222

The --opt context:foo=... solves the second part, but not the first. I have no memory of how to do that from the CLI.

I'm not sure if it ever worked with just buildctl; it only seems to have been tested using buildx, which has the right oci-layout parsing logic

I really do not recall. Tonis had us do the server-side logic first along with the frontend, then buildx. I had thought buildctl had the support, since it doesn't need to be explicitly listed (part 1 already works, and part 2 just gets passed to buildkitd), but maybe I am missing something?

I really thought we had run it back when this went in.

I imagine something like this to work... but this gives:

Can you post a sample? Here or on a gist? Let's have a common base to work with?
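The two-step mapping described a few comments up (an opt key of context:<name>, matched against the image name or AS alias, with the source as the value) can be sketched as follows; splitContextOpt is a hypothetical helper, not buildkit's actual parsing code:

```go
package main

import (
	"fmt"
	"strings"
)

// splitContextOpt illustrates step 2 from the discussion above: taking
// an --opt context:<name>=<source> pair and splitting it into the
// context name (which the frontend matches against the Dockerfile's
// image name or AS alias) and the source it maps to.
func splitContextOpt(key, value string) (name, source string, ok bool) {
	if !strings.HasPrefix(key, "context:") {
		return "", "", false
	}
	return strings.TrimPrefix(key, "context:"), value, true
}

func main() {
	name, src, ok := splitContextOpt("context:foo", "oci-layout:///path/to/layout@sha256:abcdefg11111222")
	fmt.Println(ok, name, src)
}
```

Opts whose key does not start with context: (e.g. platform) fall through untouched, which matches how the frontend only treats context:-prefixed opts as named contexts.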

Member

@jedevc jedevc Sep 14, 2022

To create the OCI folder:

$ rm -rf dump; echo "FROM alpine:3.14" | buildx build -f - --output type=oci,dest=./out.tar .; mkdir dump; (cd dump; tar xf ../out.tar)

With the following Dockerfile:

FROM foo
RUN apk add git

And the following command with the appropriate sha (assuming a running buildkit built off of master listening on port 1234):

$ buildctl --addr tcp://localhost:1234 build --frontend=dockerfile.v0 --local context=. --local dockerfile=. --opt context:foo=oci-layout://dump@sha256:79a06bdbfbf27b365be61ca38646b07b451681475c284ad79deed506ab192917
...
------
 > [context foo] load metadata for dump@sha256:79a06bdbfbf27b365be61ca38646b07b451681475c284ad79deed506ab192917:
------
Dockerfile:1
--------------------
   1 | >>> FROM foo
   2 |     RUN apk add git
   3 |     
--------------------
error: failed to solve: unable to get info about digest: Unimplemented: unknown service containerd.services.content.v1.Content: unknown

Reading through the PR, it looks like we would need to pass --oci-layout here, but that logic isn't plumbed through to cobra properly.

Contributor Author

cool, will comment there.
