Conversation

unclejack
Contributor

This PR refactors archive to add buffering, helping fix #7291. It is the first of the two PRs required to fix #7291.

The first commit (move some io related utils to pkg/ioutils) moves some Readers and Writers to pkg/ioutils. I've moved these first so that I could avoid adding WriteCloseWrapper to utils/utils.go. I've moved only the io utils which had the fewest dependencies on other things (e.g. no HTTP-related utils were moved).

This PR adds pkg/pools to help reuse buffers for reading and for writing.

The PR also adds buffering to archive writing, thus helping speed up archive writing and a few other operations.

Please focus on stress testing this with pulls, pushes, builds and anything you can think of.
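For readers unfamiliar with the technique, the buffered archive writing mentioned above boils down to putting a bufio.Writer between the tar writer and the destination, so tar's many small writes are coalesced into fewer large ones. A minimal sketch, with illustrative names and an assumed 32KB buffer (not the PR's exact code):

```go
package main

import (
	"archive/tar"
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// writeTarBuffered writes a single-entry tar stream through a bufio.Writer,
// coalescing tar's many small writes into fewer large ones. The function
// name and the 32KB buffer size are illustrative.
func writeTarBuffered(dst io.Writer, name string, data []byte) error {
	bw := bufio.NewWriterSize(dst, 32*1024)
	tw := tar.NewWriter(bw)
	hdr := &tar.Header{Name: name, Mode: 0644, Size: int64(len(data))}
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}
	if _, err := tw.Write(data); err != nil {
		return err
	}
	if err := tw.Close(); err != nil {
		return err
	}
	// Flush pushes any bytes still sitting in the buffer out to dst.
	return bw.Flush()
}

// roundTrip archives a payload and reads it back, to show the stream is intact.
func roundTrip() (string, error) {
	var buf bytes.Buffer
	if err := writeTarBuffered(&buf, "hello.txt", []byte("hello")); err != nil {
		return "", err
	}
	tr := tar.NewReader(&buf)
	if _, err := tr.Next(); err != nil {
		return "", err
	}
	contents, err := io.ReadAll(tr)
	return string(contents), err
}

func main() {
	s, err := roundTrip()
	fmt.Println(s, err) // prints "hello <nil>"
}
```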

@@ -78,7 +78,9 @@ func xzDecompress(archive io.Reader) (io.ReadCloser, error) {
 }

 func DecompressStream(archive io.Reader) (io.ReadCloser, error) {
-	buf := bufio.NewReader(archive)
+	p := bufioReader512KPool
+	buf := p.Get().(*bufio.Reader)
Contributor

Can we wrap the pool in a data structure which returns *bufio.Reader and bufio.Writer directly?
Ah, and maybe one that does Reset too.

@LK4D4
Contributor

LK4D4 commented Aug 12, 2014

I think we can even write generic functions:

ReadBufferPool(size int) *BytePool
WriteBufferPool(size int) *BytePool

which return pools for the given size. So archive/pool.go should be something like pkg/bytepool.go, because I'd definitely want to use this somewhere :)

@dmp42 dmp42 mentioned this pull request Aug 13, 2014
@jwilder

jwilder commented Aug 13, 2014

Tried this w/ docker push to a private repo w/ two images (391MB, 944MB). Slight performance improvement of 6-7% for both images.

391MB (40.8s -> 38.4s)
944MB (96.7s -> 90.2s)

@ross-w

ross-w commented Aug 14, 2014

Not seeing a massive difference here (1min38s vs 1min47s):
https://gist.github.com/ross-w/43d37883fb0747696451

@vieux
Contributor

vieux commented Aug 14, 2014

@tianon should we remove the 1.2 retrocompat in Travis?

@phemmer
Contributor

phemmer commented Aug 14, 2014

I tested this by doing a docker load on a 1.3GB image:

2m58s -> 2m42s (average over several runs)

@shin-
Contributor

shin- commented Aug 15, 2014

Tested, also tried upping the buffer size from 512 to 2048, but I haven't seen a significant difference with my test case.

@unclejack unclejack force-pushed the refactor_archive_buffering branch from 0d4c19e to d04d4bb Compare August 20, 2014 22:58
@unclejack
Contributor Author

I've pushed an update. I've been able to pull fedora:latest from the Hub in 12 seconds with this PR. Pushing and pulling should both be faster.

This PR is now using the external gzip to decompress archives. This is only for testing right now, but we can improve and expand this to compression as well.

Please feel free to test again and provide results.

@ross-w

ross-w commented Aug 20, 2014

Thanks @unclejack . That one shaved another 10s off (comparison between your last PR and the new one):
https://gist.github.com/ross-w/0ec92c98c81140c691c1

It's still only ~20s faster than the original though.

"github.com/docker/docker/pkg/ioutils"
)

var BufioReader32KPool = sync.Pool{
Contributor

Why not do something like:

type ReaderPool struct {
    p sync.Pool
}

func NewReaderPool(size int) *ReaderPool {
    return &ReaderPool{
        p: sync.Pool{
            New: func() interface{} { return bufio.NewReaderSize(nil, size) },
        },
    }
}

and let the client decide what size is needed?

Contributor Author

@LK4D4 We need these to be shared across packages and source files. I've considered doing something like that, but it's not going to help as much.

Contributor

Then pkg is not the place for this, I think. And without a constructor this package is pretty hard to use.

Contributor Author

Why is it hard to use? This is how this is done in other packages in the stdlib. This is merely a package which puts all of these 8K, 32K and 512K bufio.Readers/Writers in one place so they can be shared across packages.

I've already commented below to explain and show how this is used. If you take a look at the changes made to archive/diff.go and archive/changes.go, it should be obvious how easy these are to use.

That being said, we can add convenience functions and functions which return wrapped bufio.Readers to return them to the pools automatically.

@unclejack
Contributor Author

@ross-w Could you describe your environment in the gist and provide the sizes of the layers, please?

The pull/push performance depends a lot on your disk IO both on the private registry and on your docker host, on your CPU and on your network. If your server doesn't have support for AES-NI, docker push might be a bit slow during the TarSum and that's most likely what's slowing down the push in your case.

The layers are already buffered to disk before a push, so there isn't a big overhead there that could be slowing down the upload over HTTP like there is on pull.

Depending on the sizes of your images, it might also take a while because there is a lot of data to push.

Once I'm able to take a look at your environment's description, I can figure out what other improvements can be made to improve performance even further.

@unclejack unclejack force-pushed the refactor_archive_buffering branch from d04d4bb to 4cbf0dc Compare August 21, 2014 08:08
@ross-w

ross-w commented Aug 21, 2014

@unclejack cat /proc/cpuinfo here:
https://gist.github.com/ross-w/4e31f6caf16b600061a4

"aes" flag is shown.

Network, as mentioned, is not a bottleneck. Our disks aren't the fastest, but as the curl benchmarks show, we can curl the layers from the registry very quickly.

Push is also pretty slow, by the way.

This particular image does have a fair amount of data, as do most of our images (they're our production applications), but as mentioned these layers can be transferred with curl extremely quickly. It's only once they go through Docker that things become slower.

@unclejack
Contributor Author

@ross-w Downloading the layers with curl doesn't extract them on the file system, it's simply downloading them and that's what makes it so fast. We can change the code so that it buffers them to disk before extraction, but that won't make the extraction any faster.

The extraction of the layers is the part which has some overhead because a lot of syscalls have to be made to create files, directories and a lot of files need to be written. This can be improved by having faster disks. Improving this extraction code will be difficult.

If the extraction phase is the really slow part of a pull for a particular system, buffering to disk will only make the download really fast, but it won't make the extraction phase that follows afterwards much faster. If downloading the layer takes 10 seconds at 1gbps and extracting it to disk takes 40 seconds, buffering the layer to disk first won't make it that much faster.

You can do some benchmarks by using docker save, docker load, docker export and docker import. This should give you an idea of how long it'll take when layers get buffered to disk first.

Adding an option to buffer layers to disk on pull is the next step.

@thaJeztah
Member

@unclejack would there be situations where layers are not directly needed and extraction can be performed lazily (on first access)?

For example, if a user does docker pull ubuntu, docker pulls all tags for the ubuntu repository. Chances are that not all tags are required directly. To speed up the user experience, docker could download all layers and write them to disk.

The first time a user performs a build using one of those images, docker will uncompress the layers required for that image.

This would have the additional advantage that unused layers/images would take up less space and be easier to clean up (only removing a single archive instead of a whole directory tree).

@ross-w

ross-w commented Aug 21, 2014

@unclejack Sorry yes, every time I tell our story I leave more and more details out. I am aware Docker does more than just download files when pulling.

However, I have modified my script to do just that. After running the script I have the full container's filesystem in the current directory, and its run is significantly faster than docker pull:
https://gist.github.com/ross-w/83a677e17ab3ed8b31ac
~15s vs ~1min30s

If it were really our disks, should I be able to extract that fast? Or am I missing out a major step?

By the way, thank you for your efforts so far towards solving this - it is much appreciated (even if I might seem grumpy in these updates)

@ross-w

ross-w commented Aug 21, 2014

@unclejack additional to the above comment, I've just done a test with /var/lib/docker on tmpfs and I get the same pull time - 1min33s.

It would appear our disks are not the bottleneck in our case.

@unclejack
Contributor Author

@ross-w Based on what you've written, I've decided to add buffering to disk for the layers on pull. That will help speed up pull.

@ross-w

ross-w commented Aug 21, 2014

@unclejack thanks, let me know when you've got something we can test!

@unclejack
Contributor Author

@ross-w Please take a look at and test #7704.

})
}

var BufioReader32KPool *bufioReaderPool
Contributor

Can we move this to top?

@LK4D4
Contributor

LK4D4 commented Aug 27, 2014

Here are benchmark results compared to master:

benchmark             old ns/op     new ns/op     delta       
BenchmarkTarUntar     16610542      8518127       -48.72%     

benchmark             old MB/s     new MB/s     speedup     
BenchmarkTarUntar     0.02         0.05         2.50x       

benchmark             old allocs     new allocs     delta      
BenchmarkTarUntar     7487           7536           +0.65%     

benchmark             old bytes     new bytes     delta       
BenchmarkTarUntar     355268        425864        +19.87%     

Here we see +20% allocated memory. Maybe we should write more precise benchmarks for Tar and Untar separately. Maybe we'll see ways to improve this.

@unclejack unclejack force-pushed the refactor_archive_buffering branch from abb3ce3 to 8c2fb68 Compare August 27, 2014 15:56
@unclejack
Contributor Author

I've updated this PR to make some of the discussed changes. I've also added detection for gzip to make sure we only attempt to shell out to gzip when it's present.

@unclejack unclejack force-pushed the refactor_archive_buffering branch 4 times, most recently from 11696ab to 6bf3b8d Compare September 2, 2014 18:33
@unclejack
Contributor Author

@crosbymichael @LK4D4 @tiborvass @vieux This is ready for review. Please review and let me know what you think.

@vieux
Contributor

vieux commented Sep 2, 2014

LGTM

})
}

func init() {
Contributor

Can we move this to top of file?

@LK4D4
Contributor

LK4D4 commented Sep 3, 2014

Also, here are new awesome benchmark results:

benchmark             old ns/op     new ns/op     delta       
BenchmarkTarUntar     17145885      7601314       -55.67%     

benchmark             old MB/s     new MB/s     speedup     
BenchmarkTarUntar     0.02         0.05         2.50x       

benchmark             old allocs     new allocs     delta      
BenchmarkTarUntar     7537           7487           -0.66%     

benchmark             old bytes     new bytes     delta       
BenchmarkTarUntar     425687        355816        -16.41%     

@unclejack unclejack force-pushed the refactor_archive_buffering branch from 6bf3b8d to d76b9c0 Compare September 3, 2014 08:19
@unclejack
Contributor Author

@LK4D4 PTAL, I've moved the init function.

Docker-DCO-1.1-Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com> (github: unclejack)
@unclejack unclejack force-pushed the refactor_archive_buffering branch from d76b9c0 to 84d76e5 Compare September 3, 2014 08:36
@LK4D4
Contributor

LK4D4 commented Sep 3, 2014

LGTM

Successfully merging this pull request may close these issues.

docker pull / push slow
8 participants