Registry keeps crashing after starting "purgeuploads.go" #4358

@MXGong

Description

We are running a Docker registry on ECS with Fargate, using S3 as backend storage.

The registry container keeps crashing after the log shows "PurgeUploads starting".

Related debug log:

===========

Timestamp (UTC+10:00) Message Container
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage.PurgeUploads({0x1a383a0, 0xc000718780}, {0x1a403b8?, 0xc000708950?}, {0xc1887a9faa43c1bf, 0xfffddcd46f22bfb4, 0x24f3e60}, 0x1) registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage/purgeuploads.go:34 +0x12d registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/handlers.startUploadPurger.func1() registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/handlers/app.go:1032 +0x33f registry
20 May 2024 at 11:55 (UTC+10:00) created by github.com/distribution/distribution/v3/registry/handlers.startUploadPurger in goroutine 1 registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/handlers/app.go:1020 +0x329 registry
20 May 2024 at 11:55 (UTC+10:00) panic: runtime error: index out of range [0] with length 0 registry
20 May 2024 at 11:55 (UTC+10:00) goroutine 21 [running]: registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage.getOutstandingUploads.func1({0x1a384b8, 0xc001862c80}) registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage/purgeuploads.go:73 +0x54c registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage/driver/s3-aws.(*driver).doWalk.func1(0xc00015cc60, 0x0?) registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage/driver/s3-aws/s3.go:1159 +0x45c registry
20 May 2024 at 11:55 (UTC+10:00) github.com/aws/aws-sdk-go/service/s3.(*S3).ListObjectsV2PagesWithContext(0xc00049a690, {0x1a38870?, 0xc00046b570}, 0xc00046b500, 0xc001563a18, {0x0, 0x0, 0x0}) registry
20 May 2024 at 11:55 (UTC+10:00) github.com/aws/aws-sdk-go@v1.48.10/service/s3/api.go:7629 +0x1d0 registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage/driver/s3-aws.(*driver).doWalk(0xc0004bb170, {0x1a38870, 0xc00046b490}, 0xc000583ac0, {0xc00009a660?, 0x20?}, {0x0, 0x0}, 0xc00074e480) registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage/driver/s3-aws/s3.go:1123 +0x447 registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage/driver/s3-aws.(*driver).Walk(0xc00009a660?, {0x1a38870, 0xc00046b490}, {0xc00009a660, 0x20}, 0x155a420?, {0x0, 0x0, 0x7779f8?}) registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage/driver/s3-aws/s3.go:1077 +0xdc registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage/driver/base.(*Base).Walk(0xc000708950, {0x1a383a0?, 0xc000718780?}, {0xc00009a660, 0x20}, 0x410885?, {0x0, 0x0, 0x0}) registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage/driver/base/base.go:236 +0x27f registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage.getOutstandingUploads({0x1a383a0, 0xc000718780}, {0x1a403b8?, 0xc000708950}) registry
20 May 2024 at 11:55 (UTC+10:00) github.com/distribution/distribution/v3/registry/storage/purgeuploads.go:70 +0x202 registry
20 May 2024 at 11:55 (UTC+10:00) time="2024-05-20T01:55:18.997854099Z" level=debug msg="s3aws.ListObjectsV2PagesWithContext({\n Bucket: "s3-backend-storage",\n MaxKeys: 1000,\n Prefix: "docker/registry/v2/repositories/",\n StartAfter: ""\n})" go.version=go1.21.5 instance.id=03bd67a0-3f45-446a-8b0b-bb33c8c1e548 service=registry trace.duration=8.288609547s trace.file=github.com/distribution/distribution/v3/registry/storage/driver/s3-aws/s3.go trace.func="github.com/distribution/distribution/v3/registry/storage/driver/s3-aws.(*driver).doWalk" trace.id=be2f0c8a-e911-4767-bf4a-cde149409695 trace.line=1111 trace.parent.id=64255554-37c4-4dee-aaf1-9aa693390ded version=3.0.0-alpha.1 registry
20 May 2024 at 11:55 (UTC+10:00) time="2024-05-20T01:55:18.997899441Z" level=debug msg="s3aws.Walk("/docker/registry/v2/repositories")" go.version=go1.21.5 instance.id=03bd67a0-3f45-446a-8b0b-bb33c8c1e548 service=registry trace.duration=8.288706088s trace.file=github.com/distribution/distribution/v3/registry/storage/driver/base/base.go trace.func="github.com/distribution/distribution/v3/registry/storage/driver/base.(*Base).Walk" trace.id=64255554-37c4-4dee-aaf1-9aa693390ded trace.line=229 version=3.0.0-alpha.1 registry
20 May 2024 at 11:55 (UTC+10:00) 2024/05/20 01:55:11 traces export: Post "https://localhost:4318/v1/traces": dial tcp 127.0.0.1:4318: connect: connection refused registry
20 May 2024 at 11:55 (UTC+10:00) time="2024-05-20T01:55:10.826146919Z" level=debug msg="authorizing request" go.version=go1.21.5 http.request.host="localhost:5000" http.request.id=0be37451-4e52-4594-bb40-e6b15de5988c http.request.method=GET http.request.remoteaddr="127.0.0.1:35320" http.request.uri=/v2/ http.request.useragent=Wget instance.id=03bd67a0-3f45-446a-8b0b-bb33c8c1e548 service=registry version=3.0.0-alpha.1 registry
20 May 2024 at 11:55 (UTC+10:00) time="2024-05-20T01:55:10.826198132Z" level=info msg="response completed" go.version=go1.21.5 http.request.host="localhost:5000" http.request.id=0be37451-4e52-4594-bb40-e6b15de5988c http.request.method=GET http.request.remoteaddr="127.0.0.1:35320" http.request.uri=/v2/ http.request.useragent=Wget http.response.contenttype=application/json http.response.duration="108.405µs" http.response.status=200 http.response.written=2 instance.id=03bd67a0-3f45-446a-8b0b-bb33c8c1e548 service=registry version=3.0.0-alpha.1 registry
20 May 2024 at 11:55 (UTC+10:00) 127.0.0.1 - - [20/May/2024:01:55:10 +0000] "GET /v2/ HTTP/1.1" 200 2 "" "Wget" registry
20 May 2024 at 11:55 (UTC+10:00) time="2024-05-20T01:55:10.709137946Z" level=info msg="PurgeUploads starting: olderThan=2024-05-13 01:55:10.709083583 +0000 UTC m=-601619.974406220, actuallyDelete=true" registry
20 May 2024 at 11:55 (UTC+10:00) time="2024-05-20T01:55:06.864000795Z" level=info msg="response completed" go.version=go1.21.5 http.request.host="192.168.12.41:5000" http.request.id=0f79f86a-f3c1-4e7a-aa97-7c47270aac43 http.request.method=GET http.request.remoteaddr="192.168.11.226:12120" http.request.uri=/v2/ http.request.useragent=ELB-HealthChecker/2.0 http.response.contenttype=application/json http.response.duration="103.999µs" http.response.status=200 http.response.written=2 instance.id=03bd67a0-3f45-446a-8b0b-bb33c8c1e548 service=registry version=3.0.0-alpha.1 registry
20 May 2024 at 11:55 (UTC+10:00) 192.168.11.226 - - [20/May/2024:01:55:06 +0000] "GET /v2/ HTTP/1.1" 200 2 "" "ELB-HealthChecker/2.0"
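
One possible reading of the trace, offered as a guess rather than something confirmed against the source of purgeuploads.go: the panic message "index out of range [0] with length 0" is what Go reports when code indexes the first element of an empty string or slice. If the walk callback splits each object path and looks at the first byte of the last component, an S3 key ending in "/" would produce an empty component. The sketch below only demonstrates that behavior of `path.Split` from the Go standard library; `lastComponent` and `isUnderscoreEntry` are hypothetical helpers, not functions from the registry.

```go
package main

import (
	"fmt"
	"path"
)

// lastComponent returns the final path component as path.Split sees it.
// Hypothetical helper, used here only for illustration.
func lastComponent(filePath string) string {
	_, file := path.Split(filePath)
	return file
}

// isUnderscoreEntry reports whether the last component starts with '_'.
// The length check guards the index; without it, an empty component
// would panic with "index out of range [0] with length 0".
func isUnderscoreEntry(filePath string) bool {
	file := lastComponent(filePath)
	return len(file) > 0 && file[0] == '_'
}

func main() {
	paths := []string{
		"/docker/registry/v2/repositories/app/_uploads",
		"/docker/registry/v2/repositories/app/", // key with a trailing slash
	}
	for _, p := range paths {
		fmt.Printf("%q: component=%q underscore=%v\n",
			p, lastComponent(p), isUnderscoreEntry(p))
	}
}
```

If this guess is right, checking the bucket for keys under docker/registry/v2/repositories/ that end in "/" (for example, empty "folder" objects) might be a useful next step.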

Version: 3.0.0-alpha.1

Reproduce

  1. ECS starts a task with the Docker registry container.
  2. The container health check passes.
  3. The ELB health check passes.
  4. Pulling images works fine.
  5. The container crashes after the log shows "PurgeUploads starting" (purgeuploads.go).
  6. ECS starts a new task, and steps 1-5 repeat.

The crash happens roughly every 50 minutes.

Expected behavior

The Docker registry should be able to run without crashing.

registry version

Version: 3.0.0-alpha.1

Additional Info

No response
