Skip to content

Conversation

sudo-bmitch
Copy link
Contributor

@sudo-bmitch sudo-bmitch commented Aug 11, 2024

Fixes issue

Copying images has a delayed start and regclient throttles itself even on a local network.

Describe the change

This is a significant refactor of the networking stack to remove the default rate limit and reorder the copy of layers to speed up the start of the copy and prioritize the copy of large layers to reduce the typical time to copy an image.

  • Update scheme to use pqueue instead of throttle. This is a breaking change but library users are unlikely to encounter issues with the scheme change since both throttle and pqueue are internal packages, so it shouldn't be possible to call their methods directly or specify their types in variables.
  • Remove multiple API support. This removes an undocumented API for deleting images from Hub. Users that depend on that functionality should consider a Docker Hub alternative like hub-tool.
  • Remove digest calculation from reghttp. This is being done directly by blob and manifest types.
  • Remove reghttp.Resp interface. Replace it with a pointer to a struct with private values.
  • Remove time.Ticker for rate limiting. time.Ticker leaks goroutines. A simple time for the next request is good enough.
  • Cleanup reghttp.Resp methods:
    • Do not export next method.
    • Document exported methods.
    • Arbitrary seek is allowed since digest is not calculated here.
  • Configure priority queue algorithm and reorder image copy steps:
    • Priority queue algorithm prefers a small entry (non-blob API), and 50/50 split of largest and oldest queued entries.
    • Image reorder starts the blob copies sooner to avoid blocking on the tag listing for referrers or digest tags.
    • pqueue.AcquireMulti releases queues in reverse order to minimize risk of an acquire blocked by a soon to release queue.
    • Include type and size in the request for priority queue.
    • Use expected request size to validate response.
  • Move logging into transport and rework backoff:
    • Logging in transport allows better debugging of each request, including redirects.
    • Backoff redesign better handles failed requests and slows down all requests.
  • Remove default rate limit.

How to verify it

The copy command should start faster and copy large blobs earlier in the process to maximize concurrency:

regctl image copy $src $tgt

Changelog text

  • Feat: Significant refactor of http APIs to speed up image copies.
  • Feat: Add a priority queue for network requests.
  • Breaking: Update scheme to use pqueue instead of throttle.
  • Breaking: Removes an undocumented API for deleting images from Hub.
  • Refactor: Remove digest calculation from reghttp.
  • Feat: Add priority queue algorithm and reorder image copy steps.
  • Feat: Move logging into transport and rework backoff.
  • Feat: Remove default rate limit.

Please verify and check that the pull request fulfills the following requirements

  • Tests have been added or not applicable
  • Documentation has been added, updated, or not applicable
  • Changes have been rebased to main
  • Multiple commits to the same code have been squashed

@sudo-bmitch sudo-bmitch marked this pull request as draft August 11, 2024 14:12
Breaking: Update scheme to use pqueue instead of throttle.

This is the first step to redesign the network requests for more efficiency.

Library users are unlikely to encounter issues with the scheme change since
both throttle and pqueue are internal packages, so it shouldn't be possible
to call their methods directly or specify their types in variables.

Signed-off-by: Brandon Mitchell <git@bmitch.net>
Breaking: This removes an undocumented API for deleting images from Hub.
Users that depend on that functionality should consider a Docker Hub alternative like hub-tool.

Signed-off-by: Brandon Mitchell <git@bmitch.net>
This is being done directly by blob and manifest types.

Signed-off-by: Brandon Mitchell <git@bmitch.net>
Replace it with a pointer to a struct with private values.

Signed-off-by: Brandon Mitchell <git@bmitch.net>
time.Ticker leaks goroutines.
A simple time for the next request is good enough.

Signed-off-by: Brandon Mitchell <git@bmitch.net>
- Do not export next method.
- Document exported methods.
- Arbitrary seek is allowed since digest is not calculated here.

Signed-off-by: Brandon Mitchell <git@bmitch.net>
- Priority queue algorithm prefers a small entry (non-blob API), and 50/50 split of largest and oldest queued entries.
- Image reorder starts the blob copies sooner to avoid blocking on the tag listing for referrers or digest tags.
- pqueue.AcquireMulti releases queues in reverse order to minimize risk of an acquire blocked by a soon to release queue.
- Include type and size in the request for priority queue.
- Use expected request size to validate response.

Signed-off-by: Brandon Mitchell <git@bmitch.net>
Signed-off-by: Brandon Mitchell <git@bmitch.net>
- Logging in transport allows better debugging of each request, including redirects
- Backoff redesign better handles failed requests and slows down all requests
- Default rate limit is now removed

Signed-off-by: Brandon Mitchell <git@bmitch.net>
@sudo-bmitch sudo-bmitch marked this pull request as ready for review August 17, 2024 20:52
@sudo-bmitch sudo-bmitch merged commit 983d0a0 into regclient:main Aug 17, 2024
5 checks passed
@sudo-bmitch sudo-bmitch deleted the pr-reghttp-refactor branch August 17, 2024 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant