Conversation

unclejack
Contributor

This PR refactors archive to add buffering, helping fix #7291. It is the first of the two PRs required to fix #7291.

The first commit (move some io related utils to pkg/ioutils) moves some Readers and Writers to pkg/ioutils. I've moved these first so that I could avoid adding WriteCloseWrapper to utils/utils.go. I've moved only the io utils which had the fewest dependencies on other things (e.g. no HTTP-related utils were moved).

This PR adds pkg/pools to help reuse buffers for reading and for writing.

The PR also adds buffering to archive writing, thus helping speed up archive writing and a few other operations.

Please focus on stress testing this with pulls, pushes, builds and anything you can think of.
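For readers unfamiliar with the technique, the buffered archive writing mentioned above boils down to putting a bufio.Writer between the tar writer and the destination, so tar's many small writes are coalesced into fewer large ones. A minimal sketch, with illustrative names and an assumed 32KB buffer (not the PR's exact code):

```go
package main

import (
	"archive/tar"
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// writeTarBuffered writes a single-entry tar stream through a bufio.Writer,
// coalescing tar's many small writes into fewer large ones. The function
// name and the 32KB buffer size are illustrative.
func writeTarBuffered(dst io.Writer, name string, data []byte) error {
	bw := bufio.NewWriterSize(dst, 32*1024)
	tw := tar.NewWriter(bw)
	hdr := &tar.Header{Name: name, Mode: 0644, Size: int64(len(data))}
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}
	if _, err := tw.Write(data); err != nil {
		return err
	}
	if err := tw.Close(); err != nil {
		return err
	}
	// Flush pushes any bytes still sitting in the buffer out to dst.
	return bw.Flush()
}

// roundTrip archives a payload and reads it back, to show the stream is intact.
func roundTrip() (string, error) {
	var buf bytes.Buffer
	if err := writeTarBuffered(&buf, "hello.txt", []byte("hello")); err != nil {
		return "", err
	}
	tr := tar.NewReader(&buf)
	if _, err := tr.Next(); err != nil {
		return "", err
	}
	contents, err := io.ReadAll(tr)
	return string(contents), err
}

func main() {
	s, err := roundTrip()
	fmt.Println(s, err) // prints "hello <nil>"
}
```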

@@ -78,7 +78,9 @@ func xzDecompress(archive io.Reader) (io.ReadCloser, error) {
 }

 func DecompressStream(archive io.Reader) (io.ReadCloser, error) {
-	buf := bufio.NewReader(archive)
+	p := bufioReader512KPool
+	buf := p.Get().(*bufio.Reader)
Contributor

Can we wrap the pool in a data structure which returns *bufio.Reader and bufio.Writer directly?
Ah, and maybe one that does Reset too.

@LK4D4
Contributor

LK4D4 commented Aug 12, 2014

I think we can even write generic functions:

ReadBufferPool(size int) *BytePool
WriteBufferPool(size int) *BytePool

which return pools for the given size. So archive/pool.go should be something like pkg/bytepool.go, because I'd definitely want to use this somewhere :)

@dmp42 dmp42 mentioned this pull request Aug 13, 2014
@jwilder

jwilder commented Aug 13, 2014

Tried this w/ docker push to a private repo w/ two images (391MB, 944MB). Slight performance improvement of 6-7% for both images.

391MB (40.8s -> 38.4s)
944MB (96.7s -> 90.2s)

@ross-w

ross-w commented Aug 14, 2014

Not seeing a massive difference here (1min38s vs 1min47s):
https://gist.github.com/ross-w/43d37883fb0747696451

@vieux
Contributor

vieux commented Aug 14, 2014

@tianon should we remove the 1.2 retrocompat in Travis?

@phemmer
Contributor

phemmer commented Aug 14, 2014

I tested this by doing a docker load on a 1.3GB image:

2m58s -> 2m42s (average over several runs)

@shin-
Contributor

shin- commented Aug 15, 2014

Tested, also tried upping the buffer size from 512 to 2048, but I haven't seen a significant difference with my test case.

@unclejack unclejack force-pushed the refactor_archive_buffering branch from 0d4c19e to d04d4bb Compare August 20, 2014 22:58
@unclejack
Contributor Author

I've pushed an update. I've been able to pull fedora:latest from the Hub in 12 seconds with this PR. Pushing and pulling should both be faster.

This PR is now using the external gzip to decompress archives. This is only for testing right now, but we can improve and expand this to compression as well.

Please feel free to test again and provide results.

@ross-w

ross-w commented Aug 20, 2014

Thanks @unclejack . That one shaved another 10s off (comparison between your last PR and the new one):
https://gist.github.com/ross-w/0ec92c98c81140c691c1

It's still only ~20s faster than the original though.

"github.com/docker/docker/pkg/ioutils"
)

var BufioReader32KPool = sync.Pool{
Contributor

Why not do something like:

type ReaderPool struct {
    p sync.Pool
}

func NewReaderPool(size int) *ReaderPool {
    return &ReaderPool{
        p: sync.Pool{
            New: func() interface{} { return bufio.NewReaderSize(nil, size) },
        },
    }
}

and let the client decide what size is needed?

Contributor Author

@LK4D4 We need these to be shared across packages and source files. I've considered doing something like that, but it's not going to help as much.

Contributor

Then pkg is not the place for this, I think. And without a constructor this package is pretty hard to use.

Contributor Author

Why is it hard to use? This is how this is done in other packages in the stdlib. This is merely a package which puts all of these 8K, 32K and 512K bufio.Readers/Writers in one place so they can be shared across packages.

I've already commented below to explain and show how this is used. If you take a look at the changes made to archive/diff.go and archive/changes.go, it should be obvious how easy these are to use.

That being said, we can add convenience functions and functions which return wrapped bufio.Readers to return them to the pools automatically.

@unclejack
Contributor Author

@ross-w Could you describe your environment in the gist and provide the sizes of the layers, please?

The pull/push performance depends a lot on your disk IO both on the private registry and on your docker host, on your CPU and on your network. If your server doesn't have support for AES-NI, docker push might be a bit slow during the TarSum and that's most likely what's slowing down the push in your case.

The layers are already buffered to disk before a push, so there isn't a big overhead there that could be slowing down the upload over HTTP like there is on pull.

Depending on the sizes of your images, it might also take a while because there is a lot of data to push.

Once I'm able to take a look at your environment's description, I can figure out what other improvements can be made to improve performance even further.

@unclejack unclejack force-pushed the refactor_archive_buffering branch from d04d4bb to 4cbf0dc Compare August 21, 2014 08:08
@ross-w

ross-w commented Aug 21, 2014

@unclejack cat /proc/cpuinfo here:
https://gist.github.com/ross-w/4e31f6caf16b600061a4

"aes" flag is shown.

Network, as mentioned, is not a bottleneck. Our disks aren't the fastest, but as the curl benchmarks show, we can curl the layers from the registry very quickly.

Push is also pretty slow, by the way.

This particular image does have a fair amount of data, as do most of our images (they're our production applications), but as mentioned these layers can be transferred with curl extremely quickly. It's only once they go through Docker that things become slower.

@unclejack
Contributor Author

@ross-w Downloading the layers with curl doesn't extract them on the file system, it's simply downloading them and that's what makes it so fast. We can change the code so that it buffers them to disk before extraction, but that won't make the extraction any faster.

The extraction of the layers is the part which has some overhead because a lot of syscalls have to be made to create files, directories and a lot of files need to be written. This can be improved by having faster disks. Improving this extraction code will be difficult.

If the extraction phase is the really slow part of a pull for a particular system, buffering to disk will only make the download really fast, but it won't make the extraction phase that follows afterwards much faster. If downloading the layer takes 10 seconds at 1gbps and extracting it to disk takes 40 seconds, buffering the layer to disk first won't make it that much faster.

You can do some benchmarks by using docker save, docker load, docker export and docker import. This should give you an idea of how long it'll take when layers get buffered to disk first.

Adding an option to buffer layers to disk on pull is the next step.

@thaJeztah
Member

@unclejack would there be situations where layers are not directly needed and extraction can be performed lazily (on first access)?

For example, if a user does docker pull ubuntu, docker pulls all tags for the ubuntu repository. Chances are that not all tags are required directly. To speed up the user experience, docker could download all layers and write them to disk.

The first time a user performs a build using one of those images, docker will uncompress the layers required for that image.

This would have the additional advantage that unused layers/images would take up less space and be easier to clean up (only removing a single archive instead of a whole directory tree).

@ross-w

ross-w commented Aug 21, 2014

@unclejack Sorry yes, every time I tell our story I leave more and more details out. I am aware Docker does more than just download files when pulling.

However, I have modified my script to do just that. After running the script I have the full container's filesystem in the current directory, and its run is significantly faster than docker pull:
https://gist.github.com/ross-w/83a677e17ab3ed8b31ac
~15s vs ~1min30s

If it were really our disks, should I be able to extract that fast? Or am I missing out a major step?

By the way, thank you for your efforts so far towards solving this - it is much appreciated (even if I might seem grumpy in these updates)

@ross-w

ross-w commented Aug 21, 2014

@unclejack additional to the above comment, I've just done a test with /var/lib/docker on tmpfs and I get the same pull time - 1min33s.

It would appear our disks are not the bottleneck in our case.

@unclejack
Contributor Author

@ross-w Based on what you've written, I've decided to add buffering to disk for the layers on pull. That will help speed up pull.

@ross-w

ross-w commented Aug 21, 2014

@unclejack thanks, let me know when you've got something we can test!

@unclejack
Contributor Author

@ross-w Please take a look at and test #7704.

})
}

var BufioReader32KPool *bufioReaderPool
Contributor

Can we move this to top?

@LK4D4
Contributor

LK4D4 commented Aug 27, 2014

Here are benchmark results compared to master:

benchmark             old ns/op     new ns/op     delta       
BenchmarkTarUntar     16610542      8518127       -48.72%     

benchmark             old MB/s     new MB/s     speedup     
BenchmarkTarUntar     0.02         0.05         2.50x       

benchmark             old allocs     new allocs     delta      
BenchmarkTarUntar     7487           7536           +0.65%     

benchmark             old bytes     new bytes     delta       
BenchmarkTarUntar     355268        425864        +19.87%     

Here we see +20% allocated memory. Maybe we should write more precise benchmarks for Tar and Untar separately. Maybe we'll see ways to improve this.

@unclejack unclejack force-pushed the refactor_archive_buffering branch from abb3ce3 to 8c2fb68 Compare August 27, 2014 15:56
@unclejack
Contributor Author

I've updated this PR to make some of the discussed changes. I've also added detection for gzip to make sure we only attempt to shell out to gzip when it's present.

@unclejack unclejack force-pushed the refactor_archive_buffering branch 4 times, most recently from 11696ab to 6bf3b8d Compare September 2, 2014 18:33
@unclejack
Contributor Author

@crosbymichael @LK4D4 @tiborvass @vieux This is ready for review. Please review and let me know what you think.

@vieux
Contributor

vieux commented Sep 2, 2014

LGTM

})
}

func init() {
Contributor

Can we move this to top of file?

@LK4D4
Contributor

LK4D4 commented Sep 3, 2014

Also, here are new awesome benchmark results:

benchmark             old ns/op     new ns/op     delta       
BenchmarkTarUntar     17145885      7601314       -55.67%     

benchmark             old MB/s     new MB/s     speedup     
BenchmarkTarUntar     0.02         0.05         2.50x       

benchmark             old allocs     new allocs     delta      
BenchmarkTarUntar     7537           7487           -0.66%     

benchmark             old bytes     new bytes     delta       
BenchmarkTarUntar     425687        355816        -16.41%     

@unclejack unclejack force-pushed the refactor_archive_buffering branch from 6bf3b8d to d76b9c0 Compare September 3, 2014 08:19
@unclejack
Contributor Author

@LK4D4 PTAL, I've moved the init function.

Docker-DCO-1.1-Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com> (github: unclejack)
@unclejack unclejack force-pushed the refactor_archive_buffering branch from d76b9c0 to 84d76e5 Compare September 3, 2014 08:36
@LK4D4
Contributor

LK4D4 commented Sep 3, 2014

LGTM

Successfully merging this pull request may close these issues.

docker pull / push slow
8 participants