-
-
Notifications
You must be signed in to change notification settings - Fork 10.4k
Fix cached download file extension for certain URLs #3657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
PHP URLs have the downloadable file in the middle of the pathname. If no extension is detected, continue up the pathname.
My concern is this is a widely generic change that will affect the behaviour of a bunch of other formulae in a way that may not be correct. The downloaded filename is based off the formula and version and not the upstream name. |
Right, I should have used a better example. I can certainly change it to make it fix the very specific case of php's file suffix, but I don't think it will change any other formulae that don't also exhibit the same problem. That is, tarballs in the cache directory that don't have any extension. I could also crawl through hombrew-core to get a better representation of downloaded resources and put in more tests. |
@@ -303,7 +303,11 @@ def ext | |||
# We can't use basename_without_params, because given a URL like | |||
# https://example.com/download.php?file=foo-1.0.tar.gz | |||
# the extension we want is ".tar.gz", not ".php". | |||
Pathname.new(@url).extname[/[^?]+/] | |||
url_pathname = Pathname.new(@url) | |||
until ext = url_pathname.extname[/[^?]+/] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably handle the case where the extname never matches and/or dirname
fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kabel Ping on this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I addressed this in 0254605 by checking the return value of dirname.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kabel That's not sufficient. For example, if url_pathname
ends up as /
then this will be an infinite loop. I think it's worth changing the approach so this loop can never be infinite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did test it out through irb. I could add the root /
as another escape condition, but these are going to be URLs that we're dealing with. So the top-level domain would (in most cases where it going that far) be returned as the extension. If the URL didn't have a domain (file:///
), dirname would still default to .
. I guess it can't hurt to be even more robust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kabel I'm sorry if I'm not being explicit enough but I'm just not happy with a change that adds an infinite loop if the return values are ever something you do not expect. Instead of making this loop indefinitely until there's an escape condition instead consider splitting on /
and iterating over that or another approach that makes this a bounded loop.
Ok. I think it's fine then if better error handling is added. |
Should never reach this as most URL's will have a domain name that will be caught as an extname.
@@ -303,7 +303,11 @@ def ext | |||
# We can't use basename_without_params, because given a URL like | |||
# https://example.com/download.php?file=foo-1.0.tar.gz | |||
# the extension we want is ".tar.gz", not ".php". | |||
Pathname.new(@url).extname[/[^?]+/] | |||
url_pathname = Pathname.new(@url) | |||
until ext = url_pathname.extname[/[^?]+/] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kabel That's not sufficient. For example, if url_pathname
ends up as /
then this will be an infinite loop. I think it's worth changing the approach so this loop can never be infinite.
@@ -303,7 +303,11 @@ def ext | |||
# We can't use basename_without_params, because given a URL like | |||
# https://example.com/download.php?file=foo-1.0.tar.gz | |||
# the extension we want is ".tar.gz", not ".php". | |||
Pathname.new(@url).extname[/[^?]+/] | |||
url_pathname = Pathname.new(@url) | |||
until ext = url_pathname.extname[/[^?]+/] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kabel I'm sorry if I'm not being explicit enough but I'm just not happy with a change that adds an infinite loop if the return values are ever something you do not expect. Instead of making this loop indefinitely until there's an escape condition instead consider splitting on /
and iterating over that or another approach that makes this a bounded loop.
@MikeMcQuaid I understand the displeasure with the potential unbounded loop. Should be better now with |
Nice find on |
brew tests
with your changes locally?PHP URLs have the downloadable file in the middle of the pathname. If no extension is detected, continue up the pathname. Ex:
https://php.net/get/php-7.2.1.tar.xz/from/this/mirror
should cache tophp-7.2.1.tar.xz
.