-
Notifications
You must be signed in to change notification settings - Fork 1.3k
git url: support query form + support specifying checksum in query #5974
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
921c77c
to
12fb4fa
Compare
@AkihiroSuda Needs rebase after #5975 |
This comment was marked as resolved.
This comment was marked as resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing that is weird atm is the URL parsing in LLB (client lib and identifier) vs Dockerfile. The query form should really be only for Dockerfile/dockerui. In LLB the parameters are sent with attribute values instead. For historical reasons, the current LLB is kind of a mix where #ref
is still part of the URL from initial implementation, but for example protocol of the URL is always git://
and the real protocol is sent as an attribute.
Currently, it looks like in LLB client we parse a full URL (with querystring etc.) but then we drop everything but Host/Path
and send the original URL as attr (that I think will container querystring). Also #ref
is not even part of the URL the user passes in the client function, but is glued to the identifier internally.
Then on LLB receiver side, this now handles querystring in this PR in the URL in NewGitIdentifier (although it looks that it would require client side to error for the querystring to be passed at all there), but actually the checksum gets set later from pb.AttrGitChecksum
.
In order to fix this, I think we should separate these concepts. Assuming that we do need some level of parser in the LLB side, I think having two separate packages might be too much so maybe just two parser functions where the Dockerfile one would parse the querystring and other Dockerfile specific things and then internally call the more basic parser. Looking at the current return type, there is already some weird things there, eg. IndistinguishableFromLocal
seems to be only used in Dockerfile, and UnencryptedTCP
doesn't seem to be used anywhere at all.
I'm also ok with changing the signature for llb.Git
, eg. adding GitSubdir()
(maybe even GitRef()
) so it is the same as other git properties instead of requiring microformat and URL parsing as input that gets converted anyway. On the identifier parser side, we do need to keep backwards compatibility though (but could deprecate the old behavior).
repo.Subdir = u.Fragment.Subdir | ||
if u.Opts != nil { | ||
repo.Ref = u.Opts.Ref | ||
repo.Checksum = u.Opts.Checksum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the place where it is getting checksum from querystring in LLB while it really comes from pb.AttrGitChecksum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there is still a wrong type in here after parsing as the checksum should only come from props (I don't think this is some backwards compatibility?).
@AkihiroSuda Any update? |
@AkihiroSuda Do you have time to look at it? We plan to do first v0.24 RC next week. |
Moving these weirdo to |
f978441
to
7de593a
Compare
Added refactoring commits
|
7de593a
to
e4eb28e
Compare
This comment was marked as resolved.
This comment was marked as resolved.
Fix issue 4905, but the syntax differs from the original proposal. The document will be added to https://github.com/docker/docs/blob/main/content/manuals/build/concepts/context.md#url-fragments Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Prior to this commit, `ADD https://github.com/moby/example.git?invalid=42` was fetching a single HTML file without causing an error. Now it returns an error `unexpected query "invalid"`. Hint: use `git show --ignore-all-space` to review this commit. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
ParseURLBasic is a simplified variant of ParseURL. Basically, the full ParseURL should be used only in the LLB frontends such as `dockerfile`. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
f06fdae
to
3b48201
Compare
Similar to `Git`, but does not parse git url fragments and queries. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
3b48201
to
35db37f
Compare
// TODO: consider allowing this, when the tag actually exists on the branch | ||
url: "https://github.com/moby/buildkit?tag=v1.0.0&branch=v1.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is similar to Checksum
but could potentially be a validation of both branch
and commit
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was testing these changes and so far so good (with also vendoring in buildx client docker/buildx#3380) but I found a use case that would be needed related to checksum validation.
This relates to how the Docker GitHub Actions currently handle Git context. Currently we always checkout by commit sha if ref is not a pull request: https://github.com/docker/actions-toolkit/blob/633bcf1936ffff0e05a4bbe2b23c71abf6f4111e/src/context.ts#L45-L60
That can be be problematic because then git tag is not pulled similar to https://github.com/crazy-max/buildkit/actions/runs/16967780276/job/48096258733#step:7:451. (see also docker/actions-toolkit#677)
But with query form we could use ref=refs/tags/v0.23.0&commit=28c90eadc4c12cc78155ad59ca5f486220241d2a
so it would fetch the tag and validate commit sha. That's good!
But what should we do for a another ref like default branch? I don't think having ref=refs/heads/master&commit=28c90eadc4c12cc78155ad59ca5f486220241d2a
with strict checksum validation makes a lot of sense as there might be race condition when build runs on CI.
WDYT if we have another var to opt-out for checksum validation while keeping ref and commit like:
ref=refs/heads/master&commit=28c90eadc4c12cc78155ad59ca5f486220241d2a&checksum=false
We could have another option like |
} | ||
|
||
// Git2 is similar to Git but takes [gitutil.GitURLBase] as the argument. | ||
func Git2(remote gitutil.GitURLBase, opts ...GitOption) State { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we want to go with numeric functions in here. GitFromRemote
?
}) | ||
} | ||
|
||
func GitIDSuffix(v string) GitOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to make this public if only used internally. Same for prop.
if gu.Remote != "" { | ||
fullURL = gu.Remote | ||
} | ||
gitOptions := []llb.GitOption{llb.WithCustomName(pgName), llb.GitChecksum(gitRef.Checksum), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite get why do we need to call ParseGitRef
and then still pass a bunch of the options manually. Why can't we convert to a structure that already can be connected to llb
. There are so many duplicates in here: user needs to pass parsed GitURLBase
structure, but then still needs to pass GitFullURL
as well. Why is the commit passed to both GitChecksum
and GitIDSuffix
. These look like internal behaviors what the user should not know about. If we parse the input URL we should have remote(full), ref, commit, subdir and we should be able to pass them without duplicates.
repo.Subdir = u.Fragment.Subdir | ||
if u.Opts != nil { | ||
repo.Ref = u.Opts.Ref | ||
repo.Checksum = u.Opts.Checksum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there is still a wrong type in here after parsing as the checksum should only come from props (I don't think this is some backwards compatibility?).
func parseOpts(fragment string) *GitURLOpts { | ||
if fragment == "" { | ||
return nil | ||
func parseOpts(fragment string, query url.Values) (*GitURLOpts, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Querystring is only part of the new dockerfile-style URLs so I don't understand why this parsing is in this pacakge.
Carried and merged as: Thanks @tonistiigi 🙇 |
Split from:
e.g.,
Fix #4905, but the syntax differs from the original proposal.
The document will be added to
https://github.com/docker/docs/blob/main/content/manuals/build/concepts/context.md#url-fragments