Description
The distribution-spec now contains details, from the reference types working group, on references between digests within a repository. Based on that, I thought I'd write down some thoughts on how we could implement this in this project.
I wanted to outline how I thought this could work before starting any work on implementation.
There is already some work on an implementation here that takes an extension approach: https://github.com/oci-playground/distribution/
My proposal:
Handling manifests with refers fields
Manifests with refers fields are processed by the registry when they are accepted. This would mean the registry needs to be aware of the new application/vnd.oci.artifact.manifest.v1+json media type and the addition of the refers field to the image spec.
When a manifest referring to subjects is uploaded, we add a link file for every reference. They are stored by type because it is likely that clients will be interested in a specific type of reference and will be filtering the API calls based on type:
v2/repositories/<name>/_manifests/revisions/sha256/<subject>/_referrers/_<artifactType>/sha256/<referrer>/link
The artifact type is prefixed with _ because the artifact type may be missing, and a consistent prefix stops there being an empty path segment. To cope with backend storage restrictions, the artifact type in the path will be encoded using URL encoding.
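As a rough illustration of the layout above, the following Go sketch builds the link path for one referrer. The function name and parameters are hypothetical, and `url.PathEscape` is only one possible reading of "URL encoding"; the proposal does not fix the exact encoding.

```go
package main

import (
	"net/url"
	"path"
)

// referrerLinkPath builds the proposed link-file location for one referrer.
// The name and parameters are illustrative, not part of the registry code.
// Digests are passed as hex strings without the "sha256:" prefix.
func referrerLinkPath(repo, subjectHex, artifactType, referrerHex string) string {
	// The "_" prefix keeps the segment non-empty when artifactType is "".
	// PathEscape (an assumed choice) turns "/" into "%2F", so a media type
	// such as "application/vnd.example.sbom" stays in a single path segment.
	typeSegment := "_" + url.PathEscape(artifactType)
	return path.Join(
		"v2/repositories", repo,
		"_manifests/revisions/sha256", subjectHex,
		"_referrers", typeSegment,
		"sha256", referrerHex, "link",
	)
}
```

With an empty artifact type the segment is just `_`, which is the case the prefix is meant to cover.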
If the subject digest does not exist yet, we still create the folder structure needed for the referrer link. We do not create a link file to the blob store where the content of the subject's manifest is held, because we do not have the content. That link file will be created if/when the subject is uploaded later:
v2/repositories/<name>/_manifests/revisions/sha256/<subject>/link
When a manifest referring to other digests is deleted (API or garbage collection), we perform the same in reverse and clean up references from the filesystem.
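The upload and delete behaviour described above can be modelled with a toy in-memory store. This is a sketch only: `linkStore`, `manifest`, and the shortened paths are hypothetical stand-ins, not the registry's storage driver or its real path layout.

```go
package main

// linkStore is a toy stand-in for the storage backend: the set of link-file
// paths that currently exist.
type linkStore map[string]bool

// manifest carries only the fields this sketch needs.
type manifest struct {
	Digest       string   // digest of this manifest
	ArtifactType string   // may be empty
	Subjects     []string // digests named by the refers field
}

// revisionLink is the (shortened) path of a manifest's own link file.
func revisionLink(digest string) string {
	return "revisions/" + digest + "/link"
}

// referrerLink is the (shortened) path of a referrer link under a subject.
func referrerLink(subject, artifactType, referrer string) string {
	return "revisions/" + subject + "/_referrers/_" + artifactType + "/" + referrer + "/link"
}

// put records the manifest's own revision link plus one referrer link under
// every subject, whether or not that subject has been uploaded yet. It never
// creates the subject's own revision link.
func (s linkStore) put(m manifest) {
	s[revisionLink(m.Digest)] = true
	for _, subject := range m.Subjects {
		s[referrerLink(subject, m.ArtifactType, m.Digest)] = true
	}
}

// remove is put in reverse: it drops the manifest's revision link and every
// referrer link it contributed.
func (s linkStore) remove(m manifest) {
	delete(s, revisionLink(m.Digest))
	for _, subject := range m.Subjects {
		delete(s, referrerLink(subject, m.ArtifactType, m.Digest))
	}
}
```

Uploading a signature before its image leaves a referrer link under the image's digest but no `revisions/<image>/link`, which only appears once the image itself is put.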
Deletion of subjects
When a manifest revision that has _referrers is deleted (API or garbage collection), those _referrers could be checked. If they are untagged and none of the subjects they refer to still exist in the registry, they could be deleted too. It may be better to leave this to garbage collection to avoid extended work when deleting manifests from the registry.
Consistency and bootstrapping
Add a new command-line option clean-references, which can go through the registry, delete any dangling _referrers that point to artifacts that are not in the registry, and create any missing references by processing all manifests for refers fields.
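One way to picture what clean-references would compute is a reconciliation between the referrer links found on disk and the refers fields of the manifests actually stored. The sketch below is an assumption about how that could be structured; the `link` type and `reconcile` function are hypothetical.

```go
package main

import "sort"

// link identifies one referrer link: Referrer points at Subject.
type link struct {
	Subject  string
	Referrer string
}

// sortLinks orders links by subject, then referrer, for stable output.
func sortLinks(ls []link) {
	sort.Slice(ls, func(i, j int) bool {
		if ls[i].Subject != ls[j].Subject {
			return ls[i].Subject < ls[j].Subject
		}
		return ls[i].Referrer < ls[j].Referrer
	})
}

// reconcile compares the links found on disk with the stored manifests.
// manifests maps each stored manifest digest to the subjects in its refers
// field. dangling holds links whose referrer is not in the registry (to be
// deleted); missing holds links implied by a manifest but absent on disk
// (to be created).
func reconcile(onDisk []link, manifests map[string][]string) (dangling, missing []link) {
	seen := map[link]bool{}
	for _, l := range onDisk {
		seen[l] = true
		if _, stored := manifests[l.Referrer]; !stored {
			dangling = append(dangling, l)
		}
	}
	for referrer, subjects := range manifests {
		for _, subject := range subjects {
			l := link{Subject: subject, Referrer: referrer}
			if !seen[l] {
				missing = append(missing, l)
			}
		}
	}
	sortLinks(dangling)
	sortLinks(missing)
	return dangling, missing
}
```

The same routine would cover the bootstrap case: on a registry with no referrer links at all, every implied link comes back in `missing`.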
Upgrade
On upgrade of a registry instance to a version that includes references, we should run the bootstrap. This could be left to the user, or we could have an identifier in the filesystem used to trigger a bootstrap at startup (e.g., if v2/_references_bootstrapped does not exist, bootstrap at startup and create it).
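The marker-file idea could look roughly like this. The `markerStore` interface and `memStore` type are hypothetical, minimal stand-ins for the storage backend, used so the logic can be exercised without a real filesystem; only the marker path comes from the proposal.

```go
package main

// bootstrapMarker is the marker path suggested in the proposal.
const bootstrapMarker = "v2/_references_bootstrapped"

// markerStore is a hypothetical, minimal view of the storage backend.
type markerStore interface {
	Exists(path string) (bool, error)
	Create(path string) error
}

// memStore is an in-memory markerStore used only to exercise the logic.
type memStore map[string]bool

func (m memStore) Exists(path string) (bool, error) { return m[path], nil }
func (m memStore) Create(path string) error         { m[path] = true; return nil }

// bootstrapOnce runs bootstrap only when the marker is absent and creates the
// marker afterwards. It reports whether bootstrap ran. If bootstrap fails the
// marker is not written, so the next startup tries again.
func bootstrapOnce(store markerStore, bootstrap func() error) (bool, error) {
	done, err := store.Exists(bootstrapMarker)
	if err != nil || done {
		return false, err
	}
	if err := bootstrap(); err != nil {
		return false, err
	}
	return true, store.Create(bootstrapMarker)
}
```

Writing the marker only after a successful run is a design choice in this sketch: an interrupted bootstrap is repeated rather than silently skipped.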
Garbage collection
When garbage collecting, the subjects of an object being garbage collected need to be checked for existence. If at least one of them exists, the manifest should not be garbage collected.
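The rule above reduces to a small predicate. This is a sketch under assumptions: `canCollect` and its parameters are hypothetical, and treating a tag as always keeping a manifest alive is assumed here rather than stated in the proposal.

```go
package main

// canCollect reports whether a manifest may be garbage collected.
// tagged is assumed to keep a manifest alive regardless of its subjects.
// subjects are the digests named by the manifest's refers field, and exists
// reports whether a digest is present in the repository.
func canCollect(tagged bool, subjects []string, exists func(digest string) bool) bool {
	if tagged {
		return false
	}
	for _, subject := range subjects {
		if exists(subject) {
			// At least one subject is still present, so keep the referrer.
			return false
		}
	}
	return true
}
```

Note that an untagged manifest with no subjects is collectable under this predicate, since there is nothing left to keep it alive.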
New API
As documented by the reference types spec, we would add the referrers API. Results would be ordered by type and then by digest, following the filesystem layout, which would be walked to produce them. This provides for consistent ordering and paging.
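To show why a fixed (type, digest) order makes paging straightforward, here is a minimal sketch that sorts referrers and returns the page following a given entry. The `referrer` type, `pageReferrers`, and the "last entry seen" cursor are assumptions made for illustration; the proposal does not specify a paging mechanism, and a real implementation would walk the filesystem rather than sort in memory.

```go
package main

import "sort"

// referrer is one entry in a referrers listing.
type referrer struct {
	ArtifactType string
	Digest       string
}

// less orders referrers by artifact type, then by digest.
func less(a, b referrer) bool {
	if a.ArtifactType != b.ArtifactType {
		return a.ArtifactType < b.ArtifactType
	}
	return a.Digest < b.Digest
}

// pageReferrers returns up to n referrers that sort strictly after last.
// A zero-value last starts from the beginning of the listing.
func pageReferrers(all []referrer, last referrer, n int) []referrer {
	if n < 0 {
		n = 0
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]referrer(nil), all...)
	sort.Slice(sorted, func(i, j int) bool { return less(sorted[i], sorted[j]) })
	// Binary search for the first entry after the cursor.
	start := sort.Search(len(sorted), func(i int) bool { return less(last, sorted[i]) })
	end := start + n
	if end > len(sorted) {
		end = len(sorted)
	}
	return sorted[start:end]
}
```

Because the cursor is a position in a total order rather than an offset, pages stay consistent even if referrers are added or removed between requests.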