This repository was archived by the owner on Jul 13, 2023. It is now read-only.

Paperclip copies too many files to the file system #1642

@amilligan

I don't want to depend on the file system. Providers like Heroku like to pull the file system out from under running servers at inconvenient times, and those errors are just a load of fun to track down. So, I try to use IO streams whenever possible. In particular, if I save some content I've generated (say, in a background task), I feed that content into Paperclip using a StringIO object, or something similar.
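For readers unfamiliar with the pattern, here is a minimal sketch of building generated content as an in-memory stream. The helper `build_report_io`, the `Report` model, and its `document` attachment are hypothetical names used only for illustration. Defining `original_filename` on the stream is a common idiom, on the assumption that Paperclip's StringIO adapter uses it when the object responds to it.

```ruby
require "stringio"

# Build CSV-like content in memory (e.g. inside a background job) and wrap
# it in a StringIO, so nothing has to touch the file system.
# `build_report_io` is a hypothetical helper, not part of Paperclip.
def build_report_io(rows, filename)
  body = rows.map { |row| row.join(",") }.join("\n") + "\n"
  io = StringIO.new(body)
  # Give the stream a filename so the attachment does not fall back to a
  # generic name.
  io.define_singleton_method(:original_filename) { filename }
  io
end

io = build_report_io([%w[name total], ["widgets", 3]], "report.csv")

# With a hypothetical model declaring `has_attached_file :document`:
#   report = Report.new
#   report.document = io
#   report.save!
```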

Up until recently this worked great. Now, with spoofing detection and content type validation (which seem like good ideas, btw, just maybe not so great in practice yet), Paperclip feels the need to copy my stream to the file system. Multiple times.

First, the StringioAdapter creates a temporary file so it can run the content type check using a file command. Two problems with this: first, if it's content I created I want to tell Paperclip to trust me and skip this step; second, the file command seems to be wrong about 60% of the time, creating validation errors for perfectly valid files. For instance, it returns 'text/plain' for a CSV file, and 'application/zip' for an XLSX file; not great for anyone generating spreadsheet reports.
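Those two results are what you would expect from any detector that looks only at the bytes: an XLSX file is a ZIP container, so it starts with the ZIP signature, and a CSV file has no signature at all. The sketch below is a deliberately simplified illustration of that behavior, not how the file command is implemented. `sniff_mime`, `trusted_mime`, and `TRUSTED_TYPES` are hypothetical names; the second helper shows the alternative of trusting the declared extension when the caller vouches for the content.

```ruby
# The four bytes that open every ZIP archive (and therefore every XLSX file).
ZIP_MAGIC = "PK\x03\x04".b

# Guess a MIME type from the leading bytes only, ignoring any filename.
def sniff_mime(bytes)
  data = bytes.b
  return "application/zip" if data.start_with?(ZIP_MAGIC)
  # No signature: if the bytes are valid UTF-8, all we can say is "text".
  return "text/plain" if data.force_encoding("UTF-8").valid_encoding?
  "application/octet-stream"
end

# Hypothetical lookup for content the application generated itself.
TRUSTED_TYPES = {
  ".csv"  => "text/csv",
  ".xlsx" => "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
}.freeze

# Derive the MIME type from the extension instead of the content.
def trusted_mime(filename)
  TRUSTED_TYPES.fetch(File.extname(filename).downcase, "application/octet-stream")
end
```

Content sniffing alone cannot tell a CSV from any other text, nor an XLSX from any other ZIP archive, so a validation that compares the sniffed type against the declared one will reject both.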

Later, in order to do the spoof detection validation, Paperclip wraps my attachment in another adapter (AttachmentAdapter), which copies the content from the first temporary file to another temporary file. Let's hope that first temporary file is still there. And, again, if I generated the content myself then I'm fairly certain there's no spoofing going on. Again, Paperclip has no option to turn off this validation.

Finally, when saving to the file system, Paperclip assumes that temporary files already exist and that it can simply do a file copy from one place to another; sadly, those temporary files don't exist because I patched that nonsense out. Now, no sensible person stores content on the file system in a deployed application, and the S3 storage adapter (appropriately) treats the content as a stream. So this is just an annoyance for development. But an annoyance all the same.
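A file system writer does not need a source file on disk to do its job. The sketch below copies from any readable IO-like object using Ruby's `IO.copy_stream`. It is an illustration of the stream-based approach under the assumption that the source responds to `read`, not Paperclip's actual storage code, and `write_stream` is a hypothetical name.

```ruby
require "stringio"
require "fileutils"
require "tmpdir"

# Write any readable IO-like object to dest_path without assuming the
# source is backed by a temporary file.
def write_stream(source, dest_path)
  FileUtils.mkdir_p(File.dirname(dest_path))
  # Start from the beginning in case something already read the stream.
  source.rewind if source.respond_to?(:rewind)
  File.open(dest_path, "wb") { |out| IO.copy_stream(source, out) }
  dest_path
end

dest = File.join(Dir.mktmpdir, "attachments", "report.csv")
write_stream(StringIO.new("name,total\nwidgets,3\n"), dest)
```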

I've solved these problems in a little library here: https://github.com/buildgroundwork/paperclip-trusted-io

I'm happy to turn these changes into pull requests, if they have any chance of being a) considered in some sort of timely fashion, and b) accepted.
