blob: S3 downloads don't take advantage of the S3 Manager - so performance is slower than it could be #3600

@alexeiser

Describe the bug

The S3 Manager in the AWS SDK for Go v2 accelerates downloads by fetching an object in multiple parts concurrently (5 concurrent goroutines by default; the concurrency is configurable). The regular download path that the blob reader uses makes only a single connection, so on AWS EC2 instances or other high-throughput networks it doesn't take advantage of the available bandwidth.

The blob upload path already uses the manager, so it already benefits from concurrent transfers.

To Reproduce

gocloud code to download an object and throw it away:

	reader, err := bucket.NewReader(ctx, objectKey, nil)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error creating reader for object: %v\n", err)
		return
	}
	defer reader.Close()

	// Discard the downloaded content instead of writing it to a local file.
	if _, err := io.Copy(io.Discard, reader); err != nil {
		fmt.Fprintf(os.Stderr, "Error reading object: %v\n", err)
		return
	}

AWS SDK for Go v2 S3 Manager code:

// DiscardWriterAt wraps io.Discard to implement io.WriterAt
type DiscardWriterAt struct{}

func (d DiscardWriterAt) WriteAt(p []byte, off int64) (n int, err error) {
	return io.Discard.Write(p)
}


	s3Client := s3.NewFromConfig(cfg, func(o *s3.Options) {
		o.Region = "us-west-2"
		o.DisableLogOutputChecksumValidationSkipped = true
	})
	downloader := manager.NewDownloader(s3Client)

	_, err = downloader.Download(ctx, DiscardWriterAt{}, &s3.GetObjectInput{
		Bucket: aws.String(bucketName),
		Key:    aws.String(objectKey),
	})

Experiments were run in AWS on a t3a.large instance.
For a large file in S3 (e.g. 2 GB), the gocloud method takes 20+ seconds to download the file, while the S3 Manager method takes 7 seconds.

Expected behavior

Upload and download performance should be symmetrical if the network is symmetrical.

Version

Latest, v0.41.0, v0.42.0 and v0.39.0

Additional context

See previous discussion in #3596
