Data corruption in Azure storage driver when using Writer() interface #4571

@vespian

Description

This is a follow-up to the azure-sdk-go bug report here. The TL;DR is:

When uploading data using the Azure SDK's NewAppendBlobClient.AppendBlock API, data corruption occurs if a timeout occurs. Specifically, when a chunk upload fails with "500 Operation timeout", the Azure driver code does not take into account that the operation might have succeeded, as described in the Azure API docs. It simply retries the upload, which results in duplicate blocks.

Reproduce

This issue usually occurs when throttling is in effect due to an excessive number of requests.

Expected behavior

The code in this situation should:

  • Use AppendPositionAccessConditions in AppendBlockOptions
  • If the retry returns a 412 (precondition not met) error, it means the previous attempt succeeded
  • If not, download the range of the last append to verify that the block contains the expected data
  • Continue with the next block if verification succeeds; otherwise retry the current block
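
The steps above could be sketched roughly as follows. This is only an illustration of one reading of the list, not the driver's actual code: the `appendBlob` interface, the two error values, `appendChunk`, and the in-memory `fakeBlob` are all hypothetical stand-ins. In a real driver, `AppendAt` would presumably map to AppendBlock with AppendPositionAccessConditions set to the expected offset, and the sketch assumes a 412 after a timeout is treated as "probably committed, verify by reading back".

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// Hypothetical errors standing in for the service responses named in this issue.
var (
	errTimeout        = errors.New("500 operation timeout")
	errPositionNotMet = errors.New("412 append position condition not met")
)

// appendBlob is a hypothetical abstraction over an append blob.
type appendBlob interface {
	// AppendAt appends data only if the blob length equals offset,
	// otherwise it returns errPositionNotMet.
	AppendAt(offset int64, data []byte) error
	// ReadRange returns length bytes starting at offset.
	ReadRange(offset, length int64) ([]byte, error)
}

// appendChunk writes chunk at offset without duplicating it when a timeout
// hides the fact that an earlier attempt was actually committed.
func appendChunk(b appendBlob, offset int64, chunk []byte, maxAttempts int) error {
	timedOut := false
	for attempt := 0; attempt < maxAttempts; attempt++ {
		err := b.AppendAt(offset, chunk)
		switch {
		case err == nil:
			return nil
		case errors.Is(err, errTimeout):
			// Ambiguous outcome: the block may or may not have been committed.
			timedOut = true
		case errors.Is(err, errPositionNotMet) && timedOut:
			// The blob grew since our first attempt; verify what was written.
			got, rerr := b.ReadRange(offset, int64(len(chunk)))
			if rerr != nil {
				return rerr
			}
			if bytes.Equal(got, chunk) {
				return nil // earlier attempt succeeded, move to the next block
			}
			return fmt.Errorf("offset %d: blob content does not match chunk", offset)
		default:
			return err
		}
	}
	return fmt.Errorf("offset %d: giving up after %d attempts", offset, maxAttempts)
}

// fakeBlob is an in-memory stand-in used only to exercise the sketch.
// lostAcks is the number of upcoming appends that commit but report a timeout.
type fakeBlob struct {
	data     []byte
	lostAcks int
}

func (f *fakeBlob) AppendAt(offset int64, p []byte) error {
	if offset != int64(len(f.data)) {
		return errPositionNotMet
	}
	f.data = append(f.data, p...)
	if f.lostAcks > 0 {
		f.lostAcks--
		return errTimeout
	}
	return nil
}

func (f *fakeBlob) ReadRange(offset, length int64) ([]byte, error) {
	if offset < 0 || length < 0 || offset+length > int64(len(f.data)) {
		return nil, errors.New("range out of bounds")
	}
	return f.data[offset : offset+length], nil
}
```

With `fakeBlob{lostAcks: 1}`, the first append is committed but reported as a timeout. The retry then hits the position condition, the read-back matches, and the chunk is stored once rather than twice.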

registry version

latest master: 51bdcb7

Additional Info

Fix was implemented here: https://gitlab.com/gitlab-org/container-registry/-/merge_requests/2059/diffs?commit_id=959132477ef719249270b87ce2a7a05abcd6e1ed

Metadata

Labels

priority/P1: Major item. Should definitely have it.
