Skip to content

store: use canonical image mapping when mounting #2285

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

giuseppe
Copy link
Member

@giuseppe giuseppe commented Mar 17, 2025

Ensure that a mapped layer with the canonical mapping is created if one doesn't already exist. Previously, any layer was used regardless of its mapping, which resulted in the mounted image having incorrect file ownership, depending on what mapping was found.

@giuseppe giuseppe force-pushed the store-images-without-mappings branch from e0af89b to 6287869 Compare March 17, 2025 13:00
giuseppe added a commit to giuseppe/libpod that referenced this pull request Mar 17, 2025
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@giuseppe giuseppe force-pushed the store-images-without-mappings branch from 6287869 to a0412cc Compare March 17, 2025 13:41
giuseppe added a commit to giuseppe/libpod that referenced this pull request Mar 17, 2025
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@giuseppe giuseppe added the jira label Mar 17, 2025
@giuseppe
Copy link
Member Author

so this seems to fix the problem: containers/podman#25606

@Luap99

@giuseppe giuseppe marked this pull request as ready for review March 17, 2025 16:03
@giuseppe giuseppe changed the title [RFC] store: store images always in canonical format store: store images always in canonical format Mar 17, 2025
Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand ~nothing about the problem space.

With idmapped mounts, there is no extra cost to change the IDs at runtime when using overlay.

This code has no condition for “using overlay”, so I don’t know how that relates. Doesn’t it break the other graph drivers?

This allows the image to be mounted directly with VFS, avoiding any mismatch that could occur when it was previously created with an explicit ID mapping.

AFAICS for non-writeable == image layers, uidMap, gidMap come from the options passed by c/image; and c/image passes nothing, so the existing condition is ~equivalent to Driver.SupportsShifting. That sounds, intuitively, about right? (See above about “using overlay”).

Also, assuming that uidMap/gidMap are always nil during pulls, the two if cases only differ in Host?IDMapping fields … and those fields are not used anywhere in the callees, AFAICS. So how does this make a difference? [Reusing the same types for different purposes is a mess…]

So… I’m confused. I appreciate that there are test PRs that suggest this does something, but just looking at this PR, I can’t see what, or how it possibly could.

(If other reviewers understand what is going on, that’s fine, this does not need to block on me.)

@giuseppe
Copy link
Member Author

I understand ~nothing about the problem space.

With idmapped mounts, there is no extra cost to change the IDs at runtime when using overlay.

This code has no condition for “using overlay”, so I don’t know how that relates. Doesn’t it break the other graph drivers?

This allows the image to be mounted directly with VFS, avoiding any mismatch that could occur when it was previously created with an explicit ID mapping.

AFAICS for non-writeable == image layers, uidMap, gidMap come from the options passed by c/image; and c/image passes nothing, so the existing condition is ~equivalent to Driver.SupportsShifting. That sounds, intuitively, about right? (See above about “using overlay”).

Also, assuming that uidMap/gidMap are always nil during pulls, the two if cases only differ in Host?IDMapping fields … and those fields are not used anywhere in the callees, AFAICS. So how does this make a difference? [Reusing the same types for different purposes is a mess…]

So… I’m confused. I appreciate that there are test PRs that suggest this does something, but just looking at this PR, I can’t see what, or how it possibly could.

(If other reviewers understand what is going on, that’s fine, this does not need to block on me.)

the reason why we have this code is that if you run something like:

podman run --uidmap A:B:C ... IMAGE

and the IMAGE is not present in the local storage, it is pulled directly with the specified mapping. In this way we don't need to create another copy of the image before running the container.

This is not an issue anymore with overlay, since with idmapped mounts we already store the "canonical" image, since s.canUseShifting(uidMap, gidMap) returns true for overlay.

With VFS it does not matter, because for the container we always need to create a full copy anyway.

The issue we had in Podman is that if you run something like:

podman --storage-driver vfs run --uidmap A:B:C ... IMAGE
podman --storage-driver vfs image mount IMAGE

we would end up mounting the image with the mapping A:B:C, which is of course wrong. An alternative solution would be to create a copy of the image at MountImage() time if there is not already a copy with the "canonical" mapping.

I've picked this alternative because it is easier to implement as we already have the logic to handle it, and because I don't see the value in doing this anymore, since the driver we care the most about (overlay) can do the mapping at runtime without any additional setup cost.

@mtrmac
Copy link
Collaborator

mtrmac commented Mar 17, 2025

Oh, I see what I missed:

if !options.HostUIDMapping && len(options.UIDMap) == 0 {
			uidMap = s.uidMap
		}

in putLayer comes from the Podman CLI, without c/image involvement.

@Luap99
Copy link
Member

Luap99 commented Mar 17, 2025

This is not an issue anymore with overlay, since with idmapped mounts we already store the "canonical" image, since s.canUseShifting(uidMap, gidMap) returns true for overlay.

Is that true for rootless? I thought rootless can never use idmapped mounts (unless we use fuse-overlayfs). As such this optimization would be lost for rootless pulls then?
Not sure if that matters because many ways how the images are pulled may not include the idmapped options via podman anyway (e.g. podman pull or the system service)

@giuseppe
Copy link
Member Author

Is that true for rootless? I thought rootless can never use idmapped mounts (unless we use fuse-overlayfs). As such this optimization would be lost for rootless pulls then?
Not sure if that matters because many ways how the images are pulled may not include the idmapped options via podman anyway (e.g. podman pull or the system service)

that is a good point! No that is not true for rootless and it might impact toolbx

@mtrmac
Copy link
Collaborator

mtrmac commented Mar 17, 2025

The issue we had in Podman is that if you run something like:

podman --storage-driver vfs run --uidmap A:B:C ... IMAGE
podman --storage-driver vfs image mount IMAGE

we would end up mounting the image with the mapping A:B:C, which is of course wrong.

So, approximately, MountImage is missing something like imageTopLayerForMapping?


The current approach ~relies on podman image mount not having an --uidmap option, so that we store layers in the format which podman image mount prefers.

But what about Store.Mount(containerID)? If different containers can have different ID maps, that effectively allows doing … mount --uidmap $arbitrary; so, from just locally reading the code, shouldn’t we be supporting the general case, with arbitrary mappings, anyway?

And I see a Podman caller of Mount(containerID).


I've picked this alternative because it is easier to implement as we already have the logic to handle it, and because I don't see the value in doing this anymore, since the driver we care the most about (overlay) can do the mapping at runtime without any additional setup cost.

The issue I have with this (if I understand things, which is unlikely) is that it is extremely indirect and invisible to readers of the code.

  • This assumes we only have graph drivers like overlay (with trivially cheap ID mapping) or VFS (which do a full copy anyway). What about all the other graph drivers, then? That’s Btrfs, ZFS, AUFS, Windows. I mean, if we absolutely don’t care whether they work, we should just delete them right now.
  • Even for overlay, Driver.SupportsShifting has ~4 conditions that need to be true. What about all of the other cases? Should we delete all of that code, and refuse to start?
  • We have PutLayer nominally accepting ID mapping options… and it’s going to just ignore them because it assumes that they don’t matter. “How does c/storage.Store know?”; at this API layer, it seems to be possible for a caller to do PutLayer(nonDefaultIDMapping); Mount(), and the caller would be legitimate in expecting the ID mapping to be processed.

I’m all for simplifying things and not processing irrelevant options on unreachable code paths. But then the code paths should be recognizably unreachable. E.g. deprecate the public LayerOptions.*IDMap*, replace internal data structures with smaller better-focused structs that only carry the relevant fields, etc.

I suppose that’s a lot of work, quite possibly more work than making the top-level mapping work as expected for this use case.

But then again, I understand very little about all this code.

@mtrmac
Copy link
Collaborator

mtrmac commented Mar 17, 2025

What about all the other graph drivers, then?

… and if either of the graph drivers decided to change these features (maybe VFS grew some kind of “fast copy” that required the UID maps to match), how would anyone notice that this putLayer needs updating? That’s what I mean by “invisible”.

@giuseppe
Copy link
Member Author

  • This assumes we only have graph drivers like overlay (with trivially cheap ID mapping) or VFS (which do a full copy anyway). What about all the other graph drivers, then? That’s Btrfs, ZFS, AUFS, Windows. I mean, if we absolutely don’t care whether they work, we should just delete them right now.

I don't know much about these drivers. If changes in c/storage would depend on these drivers we would just stop doing any.

Your analysis is correct, with containers we use imageTopLayerForMapping but then there is a fallback when the image is in an additional image store, and we can't do a copy. So the remapping is done in the container layer (https://github.com/containers/storage/blob/main/store.go#L1742-L1747). With images we can't do that, since we don't create a new layer, so I am not sure how we can deal with images in an additional image store

@mtrmac
Copy link
Collaborator

mtrmac commented Mar 17, 2025

then there is a fallback when the image is in an additional image store, and we can't do a copy.

Interesting. Making a (potentially very expensive) copy in the primary store on a “mount” would be unexpected… and would we try to delete it again on unmount?


But what about Store.Mount(containerID)? If different containers can have different ID maps, that effectively allows doing … mount --uidmap $arbitrary

On second thought, that should be fine, because we would be mounting not an image layer, but the container’s layer, which is already ID-shifted, isn’t it?


I think the VFS situation can be modeled ~precisely by:

  • Adding a Driver property saying what is special about VFS, ContainerCreationAlwaysExpensivelyCopiesLayer — and with that property, never shifting image layers
  • Checking the ID mapping options on all other APIs (to do: enumerate), and failing if they specify any other ID mapping when using image layers.

And we would let the other graph drivers do what they have always done, blindly hoping that it is correct.

But I don’t see how this is desirable for the overlay cases where shifting is not available.

@giuseppe
Copy link
Member Author

I think the VFS situation can be modeled ~precisely by:

  • Adding a Driver property saying what is special about VFS, ContainerCreationAlwaysExpensivelyCopiesLayer — and with that property, never shifting image layers
  • Checking the ID mapping options on all other APIs (to do: enumerate), and failing if they specify any other ID mapping when using image layers.

And we would let the other graph drivers do what they have always done, blindly hoping that it is correct.

But I don’t see how this is desirable for the overlay cases where shifting is not available.

I don't think we want to keep the current behavior in any case, an optimization makes sense only when the result is correct. If I run podman image mount fedora, I expect the files to be owned by root, not some random ID that was used for a user namespace. That is completely unexpected.

@mtrmac
Copy link
Collaborator

mtrmac commented Mar 17, 2025

I agree “all image layers are always using the default mapping” is a reasonable and consistent design, in general. Now that we have the other one, switching could potentially have performance implications, by requiring the ID-mapped container layer to be now expensively created. I don’t know how much we care and I don’t know how to tell.

(Does not doing any mapping for image layers work for rootless? I guess it does because everything happens inside a namespace, but just to confirm.)

@giuseppe giuseppe force-pushed the store-images-without-mappings branch from a0412cc to c7cf6cf Compare March 18, 2025 00:00
@giuseppe giuseppe changed the title store: store images always in canonical format store: map image when mounting if needed Mar 18, 2025
@giuseppe giuseppe force-pushed the store-images-without-mappings branch 2 times, most recently from 5097732 to 3c6a71a Compare March 18, 2025 07:54
@giuseppe giuseppe changed the title store: map image when mounting if needed store: use canonical image mapping when mounting Mar 18, 2025
@giuseppe
Copy link
Member Author

pushed a new version, now a canonical mapping is created by MountImage if one doesn't already exist. It fails if the image exists only in an additional image store without the canonical mapping, but I think that is a weird configuration anyway.

@Luap99
Copy link
Member

Luap99 commented Mar 18, 2025

pushed a new version, now a canonical mapping is created by MountImage if one doesn't already exist. It fails if the image exists only in an additional image store without the canonical mapping, but I think that is a weird configuration anyway.

Sounds reasonable to me, can you please test this version again in podman to make sure there is no new regression.

giuseppe added a commit to giuseppe/libpod that referenced this pull request Mar 18, 2025
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@mtrmac
Copy link
Collaborator

mtrmac commented Mar 20, 2025

EDIT: we can do that directly in Podman without adding another API.

I’d much rather add a specialized MountImageIsAlwaysCheap API, or something like that.

One reason is the general design aspect: it’s better to make precisely the promise the caller cares about, and not expose implementation details that affect that promise but don’t exactly match, where we wouldn’t notice when changing c/storage that the Podman CLI also needs updating.

An equally important reason is that we have all these conversations about redesigning the layer store backing (bootc), and redesigning layer identities (sha512); both almost certainly require changes to the driver API, so I’m thinking we should make it internal-only and deprecate the public store.GraphDriver() entirely. Last time I’ve looked, there’s ~1 user in each of Podman, c/common, CRI-O. So, I’d prefer not adding any more users.

@giuseppe giuseppe force-pushed the store-images-without-mappings branch from 3bcc709 to b8e65cf Compare March 20, 2025 18:18
@giuseppe
Copy link
Member Author

@mtrmac fine to merge?

@giuseppe
Copy link
Member Author

@nalind ok to merge (together with containers/common#2374)?

store.go Outdated
Comment on lines 2849 to 2850
var imageHomeStore roImageStore // Set if image != ""
var cimage *Image // Set if image != ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Set if image != "" comments are not applicable here.

(Non-blocking nit: The variables could be moved immediately before their first use (or, in case of cimage, use :=`).)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The primary thing I’d like to see fixed here is removing the incorrect/inapplicable comments.

@giuseppe giuseppe force-pushed the store-images-without-mappings branch from b8e65cf to 997b284 Compare March 24, 2025 18:26
for _, layerID := range img.MappedTopLayers {
l, err := rlstore.Get(layerID)
if err != nil {
return false, err
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous version silently ignored ErrLayerUnknown. Is the change intentional?

I don’t have much of an opinion, failing immediately could be argued to be better, I just want to avoid accidents.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before we were running without keeping the lock, so that error was expected. Now, I am not sure we should ignore it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good point.

We are back to loading the image, unlocking, and then looking at layers, so we might again hit a race here. Should this go back to the previous behavior?

@giuseppe giuseppe force-pushed the store-images-without-mappings branch from 997b284 to c79c89f Compare March 24, 2025 21:05
@giuseppe giuseppe marked this pull request as draft March 24, 2025 21:41
@giuseppe giuseppe force-pushed the store-images-without-mappings branch from c79c89f to 13b082b Compare March 24, 2025 21:45
@giuseppe giuseppe force-pushed the store-images-without-mappings branch 2 times, most recently from b315a18 to 0b13549 Compare March 24, 2025 22:04
@giuseppe giuseppe marked this pull request as ready for review March 24, 2025 22:34
@giuseppe
Copy link
Member Author

podman tests pass with the last version

the vfs driver does not create any mount, so it doesn't accept any
argument passed to MountImage.  Drop the arguments so mount image can
be used in the tests suite.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@giuseppe giuseppe force-pushed the store-images-without-mappings branch from 0b13549 to c606848 Compare March 25, 2025 14:49
Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Please consider #2285 (comment) , and #2285 (comment) .

And after looking at that, feel free to merge without another review.

Ensure that a mapped layer with the canonical mapping is created if
one doesn't already exist.  Previously, any layer was used regardless
of its mapping, which resulted in the mounted image having incorrect
file ownership, depending on what mapping was found.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@giuseppe giuseppe force-pushed the store-images-without-mappings branch from c606848 to 360c002 Compare March 25, 2025 16:32
@giuseppe
Copy link
Member Author

Please consider #2285 (comment) , and #2285 (comment) .

And after looking at that, feel free to merge without another review.

thanks, pushed a new version. I'll merge once the CI is green.

Fine to merge containers/common#2374 too?

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Copy link
Contributor

openshift-ci bot commented Mar 25, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe, mtrmac

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 4d1ae4a into containers:main Mar 25, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants