-
Notifications
You must be signed in to change notification settings - Fork 261
Improve error handling in archive creation #2289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
8eeb935
to
994c408
Compare
pkg/archive/archive.go
Outdated
// a whiteout header | ||
if wo != nil { | ||
if hdr.Typeflag != tar.TypeReg || hdr.Size == 0 { | ||
if err := ta.TarWriter.WriteHeader(hdr); err != nil { | ||
return err | ||
} | ||
if hdr.Typeflag == tar.TypeReg && hdr.Size > 0 { | ||
return fmt.Errorf("tar: cannot use whiteout for non-empty file") | ||
} | ||
hdr = wo | ||
hdr = headers.whiteout | ||
} else { | ||
logrus.Infof("tar: cannot use whiteout for non-empty file %s", hdr.Name) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don‘t understand the “pkg/archive: remove "whiteout for non-empty file" error handling” commit, I think the original code was correct.
If the converter returns a new whiteout header, the quoted code calls WriteHeader
on the original hdr
, and then immediately follows WriteHeader(theNewWhiteout)
. In that case, the Size > 0
situation (as you say) would cause an invalid archive to be created — so why does that commit turns that problematic situation from an error into a warning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://github.com/containers/storage/blob/main/pkg/archive/archive.go#L634,
if err := ta.TarWriter.WriteHeader(hdr); err != nil {
return err
}
if hdr.Typeflag == tar.TypeReg && hdr.Size > 0 {
return fmt.Errorf("tar: cannot use whiteout for non-empty file")
}
hdr = wo
the check flags a problematic situation, but it should have been made before writing the header with Size > 0, because the current approach breaks the entire archive without flagging an error to the caller. I don't know why it was written this way; given the state of this code, I suspect carelessness. Removing the check doesn't make a test fail, and moby/moby has this erroneous bit until today.
It was introduced in moby/moby@daa7019 . It already ignore the error return from addFile.
for us, we have to decide between:
- write the header, write file, then write whiteout
- ignore the whiteout if file.size >0
- ignore the file + whiteout if file.size > 0
- fail the archive creation
I can't tell which one is the right solution , but clearly the current code is broken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking a bit further, the only WhiteoutConverter is the one in archive_linux.go. Its ConvertWrite
only takes action on directories and char devices, so this condition can never trigger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are in agreement that writing a whiteout header with .Size > 0
would cause a corrupt file to be created. And that detecting that only after the WriteHeader
is not ideal (but I think it doesn’t really matter as long as the overall work reports a failure.)
Where the disconnect seems to be is that previously the code reported a failure, and now it fails and produces incorrect output. I think that’s a move in the wrong direction?!
the only
WhiteoutConverter
is the one inarchive_linux.go
. ItsConvertWrite
only takes action on directories and char devices, so this condition can never trigger.
Yes. But, sadly, it’s a public API, so we can’t change WhiteoutConverter
itself.
Anyway, I think I’d be fine with just code comments (on the caller, noting that the only possible implementation never sets Size > 0
, and on the whiteout implementation, reminding that the implementation must preserve that property) — but keeping the check with the hard failure would provide a stronger guarantee, and the code ~already exists…
So I think just moving the error check before the WriteHeader
might be best here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. But, sadly, it’s a public API, so we can’t change WhiteoutConverter itself.
the converter is a public API, but it is only selected through TarOptions.WhiteoutFormat, whose interpretation in GetWhiteoutConverter we control.
I did some more reading. The AUFS and overlayfs file systems have different ways of indicating deletions. This functionality is converts between them (where overlayfs -> aufs may introduce 2 entries for a single directory). The assumption therefore is that deletions are never marked by a non-empty files, and that a single whiteout in one format may only expand to 2 entries (without content) in the other format.
I'll add a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where the disconnect seems to be is that previously the code reported a failure,
it returned an error, but then that error only resulted in a log message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Reading this line by line, this is fair enough, but I’m afraid that with this as is, future readers/maintainers would be more confused than with the current code.
Can you add documentation to addFileData
? Primarily answering why is the “add file” functionality split into two functions (and, if adding new features in the future, in which of the two functions those features should go), and maybe anything else that maintainers should know.
Looking at #2261 (comment) , I guess the answer is “we have existing tests that require some failures to be ignored, so this is a way to structure the code to allow that, without ignoring others”. That could be a fair answer as far as it goes, but now it’s not obvious from the code without such a documentation…
… but, also, I’m inclined to push for details on the “require write failures to be ignored” part. I’m not at all saying this is incorrect, but someone needs to understand that this is the case, and that we should preserve that behavior — that it is not a bug in the tests where we are now doing a lot of work to reinforce that bug. And I’ll fully admit that I didn’t look into this: I assume you did, so please write down what you found and where we are. (On this part, I’m completely ignorant and I’d happily defer to maintainers more familiar with the code.)
Oh and the lint task is failing, with some inscrutable failure… is that an infrastructure problem, or are we somehow throwing away the logs in CI? |
these functions run on live containers, so another process may be mutating the file we're trying to copy out, so there can always be errors, eg. the path we're trying to archive was just removed by another process, so os.Lstat returns ENOENT. This code could be much more resistant to races (eg. using openat(2) and friends to fix a file entry to export in just one place.), but it can't be fundamentally fixed: the container may truncate a file while we're writing it to the tar, causing a short read, and there is no way to recover from this. With this change, we can be sure that we notice before handing back a corrupted file to the caller. |
I added this as a comment to TarWithOptions. |
with some local tweaks:
Looks like it's taking 6 seconds on my 16-core machine to check if I have a signed-off line in my commits. 🤷♂️ note that my 'main' branch is not upstream's main branch. |
994c408
to
15d58f1
Compare
I like this, as well as the comments on the two functions around |
66f477f
to
b55f0dd
Compare
PTAL |
b55f0dd
to
9f1bca0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@mtrmac ping? |
@mtrmac - would you be able to approve this? Or is there someone else who can give the 2nd LGTM? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
65043ee
to
d46957c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left some minor nits; SGTM overall
pkg/archive/archive.go
Outdated
@@ -850,13 +850,19 @@ func extractTarFileEntry(path, extractDir string, hdr *tar.Header, reader io.Rea | |||
} | |||
|
|||
// Tar creates an archive from the directory at `path`, and returns it as a | |||
// stream of bytes. | |||
// stream of bytes. This is a convenience wrapper for TarWithOptions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: may use markup while at it, i.e.
// stream of bytes. This is a convenience wrapper for TarWithOptions. | |
// stream of bytes. This is a convenience wrapper for [TarWithOptions]. |
so that go doc will link to this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
pkg/archive/archive.go
Outdated
// TarWithOptions creates an archive from the directory at `path`, | ||
// only including files whose relative paths are included in | ||
// `options.IncludeFiles` (if non-nil) or not in | ||
// `options.ExcludePatterns`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally speaking, it makes no sense to use backticks
in docstrings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed. (I was following conventions from rest of file.)
pkg/archive/archive.go
Outdated
// addFile adds a file from `path` as `name` to the tar archive. | ||
func (ta *tarWriter) addFile(path, name string) error { | ||
type addFileData struct { | ||
// path from which to read contents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: missing period
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
pkg/archive/archive.go
Outdated
// path from which to read contents | ||
path string | ||
|
||
// os.Stat for the above |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: missing period
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
pkg/archive/archive_test.go
Outdated
func TestTarErrorHandling(t *testing.T) { | ||
dir := t.TempDir() | ||
|
||
for i := 0; i < 10; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can use for i := range 10
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
This routine does no I/O, so it cannot fail. Signed-off-by: Han-Wen Nienhuys <hanwen@engflow.com>
Signed-off-by: Han-Wen Nienhuys <hanwen@engflow.com>
Signed-off-by: Han-Wen Nienhuys <hanwen@engflow.com>
Otherwise, if addFile errors out before writing the entry for a file with hardlink, subsequent files with hardlinks would reference a file that is not in the tar file. Use hdr.Name (which is passed through CanonicalTarNameForPath) instead of the 'name' parameter. Signed-off-by: Han-Wen Nienhuys <hanwen@engflow.com>
This code was imported in commit f39066f (Sep 7, 2017, "Update packages to match latest code in moby/pkg") without any further comments. The error check cannot trigger, because there is only whiteout converter (overlayfs => AUFS), and that only acts on char devices and directories. Document in detail what is going on. Signed-off-by: Han-Wen Nienhuys <hanwen@engflow.com>
This allows disinguishing between inherent fatal error (writing to the tar archive failed, eg. disk full) and non-fatal error (tarWithOptionsTo is racing with another process mutating the file system.) Signed-off-by: Han-Wen Nienhuys <hanwen@engflow.com>
d46957c
to
39fc1cb
Compare
@kolyshkin ping? |
@giuseppe - seems to need an LGTM (again?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe, hanwen-flow, kolyshkin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
No description provided.