-
Notifications
You must be signed in to change notification settings - Fork 262
store: new Dedup() API #2176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
store: new Dedup() API #2176
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
f9b9aea
to
9e346ac
Compare
34a423c
to
f9b8af2
Compare
f9b8af2
to
0f7962c
Compare
0f7962c
to
533f1ae
Compare
533f1ae
to
5ad4ae9
Compare
1aba6ba
to
ecf545a
Compare
ecf545a
to
8784214
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only a brief drive-by skim, I didn’t read the PR carefully I’m afraid.
2d59a54
to
bb6967b
Compare
what do you think of the new version? It adds some duplication in the Alternatively, we have |
441a354
to
8f831dc
Compare
Great, works for me.
(I’m not that worried about these simple types with no dependencies and no runtime cost; we can always add aliases from other packages to make the API more convenient — as long as the types/fields make sense to commit to.) Here the caller is certainly going to import the top-level |
8f831dc
to
f5fae66
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d prefer addressing also at least #2176 (comment) and #2176 (comment) , while we still remember this context.
f5fae66
to
375785e
Compare
documented that the dedup result accounts for previous runs, and changed the code to deal with multiple file systems |
c1952d2
to
f895bde
Compare
@containers/podman-maintainers PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really a full review as I am not familiar with the code base and intentions here but error handling could need some work I believe
offers a way to iterate a list of directories and find duplicate files. The duplicate files are deduplicated using reflinks where supported. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
expose a new API to deduplicate files in the images. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
introduce a new `dedup` command to the `containers-storage` tool to deduplicate similar files in image layers. Reflinks support from the underlying file system is needed. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
f895bde
to
fb7bf39
Compare
ACK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for the error parts I looked at least. For the actual storage details I defer to others
/lgtm |
offer a way to deduplicate files in the local storage using reflinks when possible.
For now, it deduplicates only identical files, even if reflinks support deduplication only of a block aligned portion of a file