Skip to content

Conversation

giuseppe
Copy link
Member

@giuseppe giuseppe commented Jun 1, 2023

change the file format to store the tar-split as part of the zstd:chunked image. This will allow clients to rebuild the entire tarball without having to download it fully.

also store the uncompressed digest for the tarball, so that it can be stored into the storage database.

Needs: containers/image#1976

Signed-off-by: Giuseppe Scrivano gscrivan@redhat.com

@giuseppe giuseppe force-pushed the chunked-store-tarsplit branch 6 times, most recently from da1b8ad to 561ef29 Compare June 1, 2023 14:40
@giuseppe giuseppe force-pushed the chunked-store-tarsplit branch 4 times, most recently from 5408a8f to c4b7edb Compare June 5, 2023 10:34
@giuseppe giuseppe force-pushed the chunked-store-tarsplit branch 2 times, most recently from 1704458 to 62f24ea Compare June 5, 2023 13:05
@giuseppe giuseppe force-pushed the chunked-store-tarsplit branch 2 times, most recently from 02d9ae3 to c65dd10 Compare June 6, 2023 19:00
@rhatdan
Copy link
Member

rhatdan commented Jun 6, 2023

LGTM
@mtrmac PTAL

@giuseppe giuseppe force-pushed the chunked-store-tarsplit branch from c65dd10 to 91a1fe9 Compare June 7, 2023 09:21
@giuseppe
Copy link
Member Author

giuseppe commented Jun 7, 2023

these fixes are independent of the work in c/image, except for chunked: add function to retrieve TOC digest

@rhatdan
Copy link
Member

rhatdan commented Jun 8, 2023

LGTM
@mtrmac It is up to you

@giuseppe giuseppe force-pushed the chunked-store-tarsplit branch from 0fd208b to 3919680 Compare June 12, 2023 20:16
@giuseppe
Copy link
Member Author

rebased

@giuseppe
Copy link
Member Author

I'd like to use these fixes for the new composefs integration work. Could we move forward and if there is anything to change/fix we could address it later?

@giuseppe giuseppe force-pushed the chunked-store-tarsplit branch from 45e6d1f to 1c356e2 Compare June 15, 2023 16:00
@giuseppe
Copy link
Member Author

all comments adressed, and the CI is green

@@ -383,7 +426,20 @@ func writeZstdChunkedStream(destFile io.Writer, outMetadata map[string]string, r
}
zstdWriter = nil

return internal.WriteZstdChunkedManifest(dest, outMetadata, uint64(dest.Count), metadata, level)
if err := tarSplitData.zstd.Flush(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch doesn't close tarSplitData.zstd, and a number of branches above it don't close zstdWriter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a func for zstdWriter to clean it up if it is not closed (and thus set to nil afterwards):

	defer func() {
		if zstdWriter != nil {
			zstdWriter.Close()
		}
	}()

is that sufficient?

I've added the same pattern for tarSplitData

@nalind
Copy link
Member

nalind commented Jun 16, 2023

Mostly LGTM, just nits. restartCompression() in compressor.go is calling zstdWriter.Flush() after zstdWriter.Close(), which looks redundant.

@TomSweeneyRedHat
Copy link
Member

LGTM other than @nalind's comments

giuseppe added 13 commits June 17, 2023 00:31
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
change the file format to store the tar-split as part of the
zstd:chunked image.  This will allow clients to rebuild the entire
tarball without having to download it fully.

also store the uncompressed digest for the tarball, so that it can be
stored into the storage database.

Needs: containers/image#1976

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@giuseppe
Copy link
Member Author

Mostly LGTM, just nits. restartCompression() in compressor.go is calling zstdWriter.Flush() after zstdWriter.Close(), which looks redundant.

thanks, I've added a separate patch to drop it since it was already this way (so in case of regressions, it is easier to bisect it)

@giuseppe giuseppe force-pushed the chunked-store-tarsplit branch from 1c356e2 to 032aae3 Compare June 16, 2023 22:34
@rhatdan
Copy link
Member

rhatdan commented Jun 17, 2023

LGTM

@rhatdan rhatdan merged commit c989f8f into containers:main Jun 17, 2023
Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catching up for containers/image#1980 … quite possibly the questions have been resolved since. Follow containers/image#1980 for progress.

}
if err := ioutils.AtomicWriteFile(r.tspath(layer.ID), tsdata.Bytes(), 0o600); err != nil {
return err
}
}
for k, v := range diffOutput.BigData {
if err := r.SetBigData(id, k, bytes.NewReader(v)); err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetBigData already calls saveFor(layer), so moving the saveFor in this function past this loop can not reliably achieve anything unusual, and to me only seems confusing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#1844 proposes reverting the move.

}

offset = binary.LittleEndian.Uint64(footer[0:8])
length = binary.LittleEndian.Uint64(footer[8:16])
lengthUncompressed = binary.LittleEndian.Uint64(footer[16:24])
manifestType = binary.LittleEndian.Uint64(footer[24:32])
if !isZstdChunkedFrameMagic(footer[32:40]) {
return nil, 0, errors.New("invalid magic number")
if !isZstdChunkedFrameMagic(footer[48:56]) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICS WriteZstdChunkedManifest silte writes the magic at 32:40, and the other semantics (offset/length/…) match the offsets in that method.

So how does this work?!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for spotting this.

I've opened a PR to fix it and add some tests: #1697

I am not 100% sure what to do with the footer data, this code path is not used by us at the moment, since we store this information in the OCI annotations for the layer (and if the OCI annotations are missing we don't even attempt a partial pull).
On the other hand it seems nice to have this information somewhere in the layer file itself, considering it is not a lot of data anyway.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think plausible alternatives are:

  • Keep it as a frozen artifact, supporting only the old features. If the old features are sufficient to pull/push useful images.
  • Design some kind of forward-compatibility mechanism. Maybe a new magic value, and now including a version identifier. And the design would need to allow adding more fields, ie.e. the magic/version ID would need to be at a predictable place regardless of version. And it wouldn’t be a replacement for the annotations because we need the layer’s metadata to be authenticated without having to read all of the layer. If we are not consuming the data at all, all of this seems to me to be very likely to keep breaking, and just not worth the effort.
  • Redesign so that this data does not need extending: only include data about the manifest in here, and add all new fields in the manifest instead. That seems to me to be the cleanest way to have the layer blob be ~self-describing… OTOH in principle it would be possible for an attacker to point at one manifest in this footer and another in the annotation, so I’m not sure there’s that much of a benefit to being self-describing, either.
  • Completely drop support, and rely on annotations only? I didn’t look into the history, whether we can do that.


d, err := digest.Parse(manifestChecksumAnnotation)
func decodeAndValidateBlob(blob []byte, lengthUncompressed uint64, expectedUncompressedChecksum string) ([]byte, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The digest is actually a compressed digest.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #1844.

Comment on lines +144 to +151
if tocDigest, ok := annotations[estargz.TOCJSONDigestAnnotation]; ok {
d, err := digest.Parse(tocDigest)
if err != nil {
return nil, err
}
return &d, nil
}
if tocDigest, ok := annotations[internal.ManifestChecksumKey]; ok {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetDiffer processes the digests in the opposite order. Doesn’t that end up using an arbitrary layer ID which differs from the manifest we actually process?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was fixed here; #1844 also makes GetTOCDigest unambiguous.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants