-
Notifications
You must be signed in to change notification settings - Fork 102
Description
As discussed in this comment (with thanks to @shizhMSFT for shepherding it through).
Summary
- Request: Add support for ongoing progress updates when running
oras.Copy()
- Context: if the downloaded blobs/manifests are large (and sometimes even if not), being able to know status updates, and that things are not stuck, is very useful
- Boundaries: this should not be concerned with UI or other aspects, except, potentially, for oras CLI, and that is out of scope at this time. The entire interface should be one that lets updates be sent by code to some consumer, which handles UI (if any)
Proposed design
The signature for oras.Copy includes oras.CopyOptions as the last parameter. If this were variadic, I would suggest adding another WithProgressUpdate()
or similar, but we use a single CopyOptions
, so I propose adding another property: CopyOptions.Progress
. The type depends on the design choice.
There are two ways to do this:
- function:
CopyOptions.Progress func(Update)
. With each update,Copy()
(or its underlying functions) would call the passed function, passing it theUpdate
(see below). - channel:
CopyOptions.Progress chan<- Update
. With each updateCopy()
would send anUpdate
to the channel
If Progress
is nil, then this functionality is bypassed.
Despite my linked PR using channels, I have no strong preference for channel over function.
Preventing Blocking
With both functions and channels, there is a concern that it might block. In principle, I think that if someone calls Copy()
and passes it a blocking function or unbuffered channel, that is their issue. However, it can cause us headaches to support them.
I would consider having each call that sends to the channel or calls func()
to be in a separate short-lived goroutine, which calls fund or sends to the channel, wrapped in a timeout.
Frequency of Update
My initial design is to rely on the underlying io.CopyN()
. Whatever we use for that, we use for writing updates. However, that can overwhelm if the default io.Copy()
is used. If I recall correctly, io.Copy()
defaults to 32K. With a 100MB blob, that is ~3000 updates. That may or may not be good.
However we control the update frequency, I think it should be byte-based, not time-based. I.e. "updates every x KB" instead of "updates every y seconds." That is more useful, and also easier to implement.
In terms of controlling the update frequency, the simplest way is CopyOption.ProgresssFrequency uint
. If it is 0, stick to the default.
An alternative is to have CopyOption.Progress
be a struct with both the channel/func (whichever is chosen) and an update frequency property.
A third method - and probably the simplest - is not to control it at all, but rather have it be part of CopyOption.Progress
. Our call Copy()
calls that / sends to channel, and it buffers as often as it wants. This is the simplest, but is subject to making our "blocking control", i.e. goroutines, being overwhelmed.
Open to ideas.
Structure of update message
The oras.Update
should be simple and contain only 2 properties:
type Update struct {
Copied int64
Descriptor descriptor
}
The descriptor is important for 2 reasons:
- To know the total expected size (which lets the consumer calculate the percentage complete)
- To know what downloading blob/manifest this is for. The func can be called / channel can receive messages multiple times, even in parallel. This lets the consumer know exactly what the update is for.
Sample
Channel:
ch := make(chan oras.Update, 1000)
opts := oras.CopyOptions{
ProgressChannel: ch
}
desc, err := oras.Copy(ctx, src, tagName, dst, tagName, opts)
go func(ch <- chan CopyUpdate) {
for msg := range ch {
fmt.Printf("copied %d bytes out of %d total for %s\n", msg.Copied, msg.Descriptor.Size, msg.Descriptor.Digest)
}
}(ch)
Func:
f := func(msg oras.Update) {
fmt.Printf("copied %d bytes out of %d total for %s\n", msg.Copied, msg.Descriptor.Size, msg.Descriptor.Digest)
}
opts := oras.CopyOptions{
ProgressHandler: f
}
desc, err := oras.Copy(ctx, src, tagName, dst, tagName, opts)