-
Notifications
You must be signed in to change notification settings - Fork 654
[Storage Cleaner] Add option to get full paths when listing entries #379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Storage Cleaner] Add option to get full paths when listing entries #379
Conversation
1a4e300
to
9c6cd51
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but would resolve_path(s)
be a better argument name that full_path
?
If I called |
Fair enough! |
# The unarchived file could have a redundant top-level directory. If the top-level | ||
# directory has only a directory, we should return that directory's entries instead. | ||
# We do not pass max_file_size to avoid accidentally skipping files. | ||
entry_paths = storage.list_entries(run_archive_path, full_path=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I call this with full_path=True
, it will download the archive (if necessary), and then return a local path? And it always does this, so you can assert
on it later? Why is it called full_path
then? I would expect full_path
to just return completely absolute paths. Maybe it should be called full_local_path
or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems weird from an API perspective that list_entries
would be a function that can force a full download, and whether it does is controlled with a flag called full_path
?
This PR adds the option to get full file/directory paths when listing entries of an archive file or directory. For archive files, this may change the path from a cloud URL to a local file path. This PR has multiple benefits:
list_entries
no longer need to worry about whether the path corresponds to an archive file or a directory when they are using the results. They just need to be mindful that the entries could be in a different storage system._get_archive_run_entry_paths
clearly points out this issue and deals with both the presence and absence of the extra nested dir.PR Train: