Description
In the past week(s) or so, the CI has been increasingly unstable.
While the root cause is unclear, I suspect GitHub networking is under pressure, or maybe we are in a tier that is somehow throttled?
Specifically:
- Ubuntu servers are timing out a lot (unable to connect): https://github.com/containerd/nerdctl/actions/runs/13511118022/job/37751406679?pr=3920#step:7:2624 (never happened before)
- golangci fails to reach its server to retrieve its schema: https://github.com/containerd/nerdctl/actions/runs/13476112452/job/37655514638?pr=3890#step:5:38 (never happened before)
- GitHub cache retrieval has been increasingly sluggish: instead of mere seconds to retrieve the manifest, it can now take up to 10 minutes
- Docker Hub increasingly times out or gives us the dreaded 429 (somewhat orthogonal to the rest)
Maybe these are all related (e.g., GitHub networking is degraded), or maybe they are not, and it is a coincidence.
One way or the other, we are getting to the point where it is hard to get a green build on the first try (on top of our test flakiness, which is a long-fought battle of its own).
While some easy/localized actions can be taken to reduce outbound traffic for routine operations (#3915), I only see two possibilities moving forward:
a. we get some help / information from GitHub about network quality
b. we take a serious, hard look at how we do things and significantly reduce our outbound dependencies
I do not have an insider contact for a. Does anyone have one?
For b.:
- we could consider dropping our reliance on Docker Hub entirely and hosting everything we need on ghcr instead (on the assumption it will be better), and systematically hunt down and remove unneeded outbound traffic (e.g., golangci)
- we rethink both the way we build and use our base image and the way we use the GitHub cache, which has been a growing PITA (quota, sluggishness)
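To make the first option a bit more concrete, here is a rough sketch of how Docker Hub image references could be rewritten to point at a ghcr mirror. Everything in it is an assumption for illustration: the `MIRROR` prefix is a hypothetical location, the function names are made up, and nothing like this exists in the repository today. It only rewrites reference strings; copying the images to the mirror would be a separate step.

```python
from __future__ import annotations

# Hypothetical mirror location, chosen for illustration only.
MIRROR = "ghcr.io/example-org/mirror"

DOCKER_HUB_DOMAINS = ("docker.io", "index.docker.io")


def split_domain(ref: str) -> tuple[str, str]:
    """Split an image reference into (registry domain, remainder)."""
    first, sep, rest = ref.partition("/")
    # Usual Docker convention: the first path component is a registry
    # only if it looks like a host (has "." or ":") or is "localhost".
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    # No explicit registry means Docker Hub.
    return "docker.io", ref


def to_mirror(ref: str, mirror: str = MIRROR) -> str:
    """Rewrite a Docker Hub reference to the mirror; leave others alone."""
    domain, remainder = split_domain(ref)
    if domain not in DOCKER_HUB_DOMAINS:
        return ref
    # Official images such as "alpine" live under "library/" on Docker Hub.
    if "/" not in remainder:
        remainder = "library/" + remainder
    return f"{mirror}/{remainder}"


if __name__ == "__main__":
    for image in ("alpine:3.19", "docker.io/library/nginx:latest", "ghcr.io/foo/bar:1"):
        print(image, "->", to_mirror(image))
```

References that already point at another registry are returned unchanged, so a helper like this could be applied to every image used in the tests without special-casing.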
Do people have thoughts about all this?
Steps to reproduce the issue
Describe the results you received and expected
na
What version of nerdctl are you using?
main
Are you using a variant of nerdctl? (e.g., Rancher Desktop)
None
Host information
No response