Description
Describe the bug
The S3 manager in the AWS SDK for Go v2 uses concurrent multipart downloads to speed up data transfer (5 concurrent parts by default; the concurrency is configurable). The regular download method that the blob reader uses opens only a single connection, so on AWS EC2 instances or other high-throughput networks it does not take advantage of the available bandwidth.
Blob uploads already go through the manager, so they already use the concurrent network features.
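For readers unfamiliar with the technique, here is a minimal, stdlib-only sketch of a concurrent ranged download into an io.WriterAt. It is not the SDK's implementation: memBuffer and parallelDownload are hypothetical names, and the in-memory source stands in for ranged GET requests against S3.

```go
package main

import (
	"fmt"
	"io"
	"sync"
)

// memBuffer is a hypothetical fixed-size in-memory buffer that implements
// io.ReaderAt and io.WriterAt. As a source it stands in for ranged GETs.
type memBuffer struct{ buf []byte }

func (m *memBuffer) ReadAt(p []byte, off int64) (int, error) {
	if off < 0 || off >= int64(len(m.buf)) {
		return 0, io.EOF
	}
	n := copy(p, m.buf[off:])
	if n < len(p) {
		return n, io.EOF
	}
	return n, nil
}

// WriteAt is called concurrently, but each part targets a disjoint range,
// so no locking is required.
func (m *memBuffer) WriteAt(p []byte, off int64) (int, error) {
	if off < 0 || off+int64(len(p)) > int64(len(m.buf)) {
		return 0, fmt.Errorf("write at offset %d is out of range", off)
	}
	return copy(m.buf[off:], p), nil
}

// parallelDownload copies size bytes from src to dst in parts of partSize
// bytes, with at most concurrency parts in flight at any time. It returns
// the first error seen (it does not cancel parts already started).
func parallelDownload(dst io.WriterAt, src io.ReaderAt, size, partSize int64, concurrency int) error {
	if partSize <= 0 || concurrency <= 0 {
		return fmt.Errorf("partSize and concurrency must be positive")
	}
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	setErr := func(err error) {
		mu.Lock()
		defer mu.Unlock()
		if firstErr == nil {
			firstErr = err
		}
	}
	sem := make(chan struct{}, concurrency) // bounds the in-flight parts
	for off := int64(0); off < size; off += partSize {
		length := partSize
		if off+length > size {
			length = size - off
		}
		wg.Add(1)
		sem <- struct{}{}
		go func(off, length int64) {
			defer wg.Done()
			defer func() { <-sem }()
			part := make([]byte, length)
			// A short read always comes with a non-nil error.
			if n, err := src.ReadAt(part, off); n < len(part) {
				setErr(err)
				return
			}
			if _, err := dst.WriteAt(part, off); err != nil {
				setErr(err)
			}
		}(off, length)
	}
	wg.Wait()
	return firstErr
}

func main() {
	src := &memBuffer{buf: make([]byte, 1<<20)}
	for i := range src.buf {
		src.buf[i] = byte(i % 251)
	}
	dst := &memBuffer{buf: make([]byte, len(src.buf))}
	if err := parallelDownload(dst, src, int64(len(src.buf)), 64<<10, 5); err != nil {
		fmt.Println("download failed:", err)
		return
	}
	fmt.Println("identical:", string(dst.buf) == string(src.buf))
}
```

The destination has to be an io.WriterAt rather than an io.Writer because parts can complete out of order, which is why a streaming reader cannot be parallelized this way without buffering.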
To Reproduce
Go CDK (gocloud.dev) code that downloads a file and discards its content:
reader, err := bucket.NewReader(ctx, objectKey, nil)
if err != nil {
	fmt.Fprintf(os.Stderr, "Error creating reader for object: %v\n", err)
	return
}
defer reader.Close()
// Discard the downloaded content instead of writing it to a local file
var Discard io.Writer = io.Discard
// Copy the content of the S3 object to the discard writer
if _, err := io.Copy(Discard, reader); err != nil {
	fmt.Fprintf(os.Stderr, "Error copying object content: %v\n", err)
	return
}
AWS v2 SDK S3 Manager function:
// DiscardWriterAt wraps io.Discard to implement io.WriterAt
type DiscardWriterAt struct{}

func (d DiscardWriterAt) WriteAt(p []byte, off int64) (n int, err error) {
	return io.Discard.Write(p)
}

s3Client := s3.NewFromConfig(cfg, func(o *s3.Options) {
	o.Region = "us-west-2"
	o.DisableLogOutputChecksumValidationSkipped = true
})
downloader := manager.NewDownloader(s3Client)
_, err = downloader.Download(ctx, DiscardWriterAt{}, &s3.GetObjectInput{
	Bucket: aws.String(bucketName),
	Key:    aws.String(objectKey),
})
The experiments were run in AWS on a t3a.large instance.
For a large file in S3 (e.g. 2 GB), the gocloud method takes 20+ seconds to download the file, while the S3 manager method takes 7 seconds.
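To make timings like these easier to compare, the following sketch counts the bytes a downloader writes and converts the result to throughput. countingWriterAt and throughputMBps are hypothetical helpers, not part of gocloud.dev or the AWS SDK; the five concurrent writes in main only simulate download parts, and atomic.Int64 assumes Go 1.19 or later.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// countingWriterAt discards data like DiscardWriterAt, but tallies the
// number of bytes written. It is safe for concurrent WriteAt calls.
type countingWriterAt struct{ n atomic.Int64 }

func (c *countingWriterAt) WriteAt(p []byte, off int64) (int, error) {
	c.n.Add(int64(len(p)))
	return len(p), nil
}

// throughputMBps converts a byte count and a duration to MB/s,
// taking 1 MB = 1e6 bytes.
func throughputMBps(n int64, elapsed time.Duration) float64 {
	if elapsed <= 0 {
		return 0
	}
	return float64(n) / 1e6 / elapsed.Seconds()
}

func main() {
	w := &countingWriterAt{}
	start := time.Now()

	// Simulate five parts arriving concurrently, 1 MiB each.
	var wg sync.WaitGroup
	for part := 0; part < 5; part++ {
		wg.Add(1)
		go func(part int) {
			defer wg.Done()
			_, _ = w.WriteAt(make([]byte, 1<<20), int64(part)<<20)
		}(part)
	}
	wg.Wait()

	elapsed := time.Since(start)
	fmt.Printf("wrote %d bytes, %.1f MB/s\n", w.n.Load(), throughputMBps(w.n.Load(), elapsed))
}
```

Taking 1 GB as 1e9 bytes, the figures above correspond to at most about 100 MB/s for the single-connection reader (2 GB in 20+ seconds) and about 286 MB/s for the manager (2 GB in 7 seconds).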
Expected behavior
Upload and download performance should be symmetrical if the network is symmetrical.
Version
Latest, v0.41.0, v0.42.0 and v0.39.0
Additional context
See previous discussion in #3596