Skip to content

A discrepancy between github.com/klauspost/compress/gzip and compress/gzip with a partial read followed by io.Copy? #803

@adamroyjones

Description

@adamroyjones

Firstly: I'm sorry for the floundering and imperfect code below. Things are very cloudy to me.

The readme says that:

[t]he packages are drop-in replacements for standard libraries. Simply replace the import path to use them[.]

I noticed a discrepancy between github.com/klauspost/compress/gzip and compress/gzip that I don't really understand. With this test file

package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"io"
	"testing"

	kgzip "github.com/klauspost/compress/gzip"
	"golang.org/x/exp/slices"
)

var (
	in           = []byte("hello\nworld")
	compressedIn []byte
)

func init() {
	var buf bytes.Buffer
	gz := gzip.NewWriter(&buf)
	if _, err := gz.Write(in); err != nil {
		panic(err)
	}
	if err := gz.Flush(); err != nil {
		panic(err)
	}
	if err := gz.Close(); err != nil {
		panic(err)
	}
	compressedIn = buf.Bytes()
}

func TestBufferedPartialCopyGzip(t *testing.T) {
	gz, err := gzip.NewReader(bytes.NewReader(compressedIn))
	if err != nil {
		t.Errorf("constructing a reader: %v", err)
	}

	br := bufio.NewReader(gz)
	if _, err := br.ReadBytes('\n'); err != nil {
		t.Errorf("reading to the first newline: %v", err)
	}

	var out bytes.Buffer
	_, err = io.Copy(&out, br)
	if !slices.Equal(out.Bytes(), []byte("world")) {
		t.Errorf("unexpected output when reading the remainder")
	}
	if err != nil {
		t.Errorf("reading the remainder: %v", err)
	}
}

func TestBufferedPartialCopyKgzip(t *testing.T) {
	gz, err := kgzip.NewReader(bytes.NewReader(compressedIn))
	if err != nil {
		t.Errorf("constructing a reader: %v", err)
	}

	br := bufio.NewReader(gz)
	if _, err := br.ReadBytes('\n'); err != nil {
		t.Errorf("reading to the first newline: %v", err)
	}

	var out bytes.Buffer
	_, err = io.Copy(&out, br)
	if !slices.Equal(out.Bytes(), []byte("world")) {
		t.Errorf("unexpected output when reading the remainder")
	}
	if err != nil {
		t.Errorf("reading the remainder: %v", err)
	}
}

and with v1.16.4, I find that the test for compress/zip passes, but the test for github.com/klauspost/compress/gzip does not. The error is:

main_test.go:72: reading the remainder: gzip: invalid checksum

There are three interesting things.

  • The error checking was put after the slice equality check deliberately to illustrate that the io.Copy call copies all of the data correctly, but an error's produced nonetheless.
  • Removing the partial read (the call to br.ReadBytes) resolves the error.
  • Replacing io.Copy with something like io.ReadAll resolves the error.

That is, there's something specific about the combination of br.ReadBytes, io.Copy, and github.com/klauspost/compress/gzip which triggers this error.

Does anyone have any ideas?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions