Skip to content

Conversation

simi
Copy link
Contributor

@simi simi commented Apr 15, 2022

Using file with enormous file extension size can raise an exception like:

Errno::ENAMETOOLONG: Filename too long @ rb_sysopen - /tmp/RackMultipart20220414-6-lm7dab.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

It should be safe to cut the file extension size to 20 (we can pick also bigger number if needed). That way tempfile name size is predictable.

I have checked few online lists of file extensions and it seems limit of 20 is ok. The longest common file extension seems to be crdownload (10 characters).

ℹ️ I would like to backport this to 2-2-stable if approved.

request = make_request Rack::MockRequest.env_for("/", options)
request.POST

refute_includes(file.path, "ThisIsVeryLongExtension")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fragile (what if we change 20 to 32 or 16)? Since this is being written to avoid the ENAMETOOLONG, in general we should write the spec that fails with ENAMETOOLONG, and then check that the change to the multipart parser fixes it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review. I have find out also one problem I have introduced (missing . in stripped extension). Code changed. Test now fails with Errno::ENAMETOOLONG without change in parser.

@simi simi force-pushed the cut-long-extension branch from ce97521 to 055bd04 Compare April 15, 2022 21:26
Copy link
Contributor

@jeremyevans jeremyevans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor change requested, then this looks good to me.

@simi simi force-pushed the cut-long-extension branch from 055bd04 to 84509dd Compare April 15, 2022 21:43
@simi
Copy link
Contributor Author

simi commented Apr 15, 2022

@jeremyevans are you ok with backport?

@jeremyevans
Copy link
Contributor

@jeremyevans are you ok with backport?

I think this is OK to backport.

@ioquatix
Copy link
Member

On what OS is this a problem?

@simi
Copy link
Contributor Author

simi commented Apr 15, 2022

On what OS is this a problem?

official Ruby docker Alpine image -> Alpine Linux 3.15

@simi
Copy link
Contributor Author

simi commented Apr 15, 2022

ℹ️ CI error seems unrelated (known JRuby CI problem).

@ioquatix
Copy link
Member

ioquatix commented Apr 15, 2022

After discussing it, here is my suggestion/advice.

I don't think any user data, file extension, or otherwise, should make it to the file system directory list by default. So I think we agree here, and the matter is now just compatibility.

Including the file extension has potentially become an ossified standard. I think this can be a potential security issue but initial investigation looks okay - i.e. it looks like File.extname protects us.

I would be much more comfortable with us doing something like what Jeremy has done in the past. Add a configuration option which we can eventually deprecate/default to a sane/secure default.

Rack::Multipart.file_name_extension_limit = 20 # file name extension is limited to this many characters - better than current
Rack::Multipart.file_name_extension_limit = nil # no file name extension - best/secure by default.
# or maybe Rack::Mutlipart::Uploadedfile.file_name_extension_limit
Rack::Multipart.file_name = "RackMultipart" # the name of the file?

I would suggest we have a mechanism for warning/informing people about these defaults, because it seems like right now we want to introduce a default (e.g. 20), but utimately we'd prefer to have no file extension at all.

This isn't an area I've spent much time in the code so it would be up to yourself to propose the best mechanism for configuration. The above is just an idea.

@simi
Copy link
Contributor Author

simi commented Apr 15, 2022

I don't see any big benefit in making this configurable. As an alternative approach, we can just skip the extension if threshold is reached or even we can just rescue Errno::ENAMETOOLONG and try without extension.

@ioquatix
Copy link
Member

ioquatix commented Apr 15, 2022

I don't see any big benefit in making this configurable.

The benefit is preserving backwards compatibility while providing a way forward with the new and improved code path. Whether that matters or not really depends on the implementation.

@jeremyevans
Copy link
Contributor

I'm not sure if trying to avoid the use of an extension for the temporary file is a good long term goal. Seems like that would break cases where the user is taking the extension of the temporary file and using Rack::Mime to get the mime type? I suppose you can tell them to get the extension some other way, but that seems like pointlessly breaking their code. I checked and the no extension supported by Rack::Mime is longer than 10 characters.

@ioquatix
Copy link
Member

Can't someone use the file extension of the provided file name, rather than the tempfile path name?

@jeremyevans
Copy link
Contributor

Can't someone use the file extension of the provided file name, rather than the tempfile path name?

They could. That's what I was discussing when I said "I suppose you can tell them to get the extension some other way, but that seems like pointlessly breaking their code."

@ioquatix
Copy link
Member

"I suppose you can tell them to get the extension some other way, but that seems like pointlessly breaking their code."

That's assuming that anyone with a 20 character file extension expects to do anything meaningful with it. Remember we are talking about cases which are already exceeding anything longer than what is in the mime table.

Also, Are people actually using tempfile.path rather than the filename given in the params itself? I can't imagine anyone is doing that. Do you know any examples?

@simi
Copy link
Contributor Author

simi commented Apr 16, 2022

I got also probably final idea. What about to provide something like "rack_invalid_extension" or some similar constant for extension failing the sanity check? That way it would be clear if inspecting the problem manually, extension is not missing (so it is clear extension was provided) and there is again no need to modify the tempfile name based on original extension.

@simi
Copy link
Contributor Author

simi commented Apr 26, 2022

I was thinking about this again and I'm definitelly sold on my latest comment. I'll update PR. 🙏 If welcomed, I can make extension size configurable.

@ioquatix
Copy link
Member

I think it's totally reasonable.

@simi
Copy link
Contributor Author

simi commented Apr 26, 2022

ℹ️ I got some alternative ideas, I'm going to explore more on this and open new PR later.

@simi simi closed this Apr 26, 2022
@simi simi deleted the cut-long-extension branch April 26, 2022 22:50
@ioquatix
Copy link
Member

@simi did we ever fix this?

@simi
Copy link
Contributor Author

simi commented Sep 22, 2022

Nope @ioquatix, still problem. I was thinking about introducing lazy multipart parsing in Rails instead to be able to react in "user land" to similar problems. There is a lot of potential problems in here. For example . isn't extension separator every time, it can be just part of the filename (I have seen filenames sanitized by replacing spaces with dots 😨).

If we would like to fix this still in Rack, I would go for random tempfile names and keeping original name around to be able to do checks (mime type?) on that instead of tempfile name.

@ioquatix
Copy link
Member

This issue cropped up again in our production environment.

I think you are right, the file name on disk should not be part of the "content type" or any other user supplied data = security risk. Let's have a think about how to deal with this. I like your proposal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants