Skip to content

When checking IsDangling make sure image is not in manifest list #2360

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 26, 2025

Conversation

rhatdan
Copy link
Member

@rhatdan rhatdan commented Mar 11, 2025

Currently when we run podman image prune or podman images --filter
dangling

It is pruning images that are in a local manifest. These images are
not dangling because they are currently in use by a named manifest list.

You can create this situation simply by doing

echo "from scratch" > /tmp/Containerfile
id=$(podman build /tmp)
podman manifest create test $id
podman image prune --force
podman image exists $id

Will return an error since the image was pruned. Now the local manifest
test is broken.

Signed-off-by: Daniel J Walsh [email protected]<!--- Please read the contributing guidelines before proceeding --->

@rhatdan
Copy link
Member Author

rhatdan commented Mar 11, 2025

@nalind @Luap99 @mheon @mtrmac PTAL

@ericcurtin This is needed for OCI images in RamaLama. Currently if you convert a model to an OCI Image it creates an manifestlist pointing to the untagged built image, but if the user does a podman image prune the Model image is removed leaving a manifest list that can not be used. It will have a pointer to a sh256:DIGEST that no longer exists.

return false, err
}

for _, image := range allImages {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little concerned by how expensive this is going to make the dangling check. It feels like there should be a better way of checking manifest list membership, but maybe that's a c/storage RFE for when we refactor there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the libimage/manifests code has, sadly?, been implemented here in c/common, in a way that makes it ~invisible to c/storage.


But, yes, this is not good at all. isDangling is called from ListImages. So:

  • With SetListData, this is an infinite recursion.
  • and, anyway, this is O(images^2) at least.

This needs to be implemented in a way where “all images” are only processed once (well, ideally it should be an O(1) query, but I don’t think we have that data), to build a data structure which allows isDangling to be executed in, at worst O(1) I/O if not O(1) CPU.

And then I suspect that removeRecursive might want to call this internal version, with the data structure pre-built, instead of the public IsDangling.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree this is O(n^n) which should not be done, basically the code does a list images then filters (which calls isDangling() on each image which then should not in turn call list again all images for each image)

It should properly build a map of all images used in manifests and then pass that map around so the dangling filter can do a quick check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

… Oops, not an infinite recursion, because the recursive calls don’t have SetListData set.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right n^2 not n^n, my bad.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to comment the same above:

⚠️ listing all images is expensive, especially when doing that for potentially all images. That's why the code uses a layer tree to operate on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably faster to first compute a (reverse) mapping from manifest lists to images and just do look ups as Paul suggested.

return false, err
}

for _, image := range allImages {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the libimage/manifests code has, sadly?, been implemented here in c/common, in a way that makes it ~invisible to c/storage.


But, yes, this is not good at all. isDangling is called from ListImages. So:

  • With SetListData, this is an infinite recursion.
  • and, anyway, this is O(images^2) at least.

This needs to be implemented in a way where “all images” are only processed once (well, ideally it should be an O(1) query, but I don’t think we have that data), to build a data structure which allows isDangling to be executed in, at worst O(1) I/O if not O(1) CPU.

And then I suspect that removeRecursive might want to call this internal version, with the data structure pre-built, instead of the public IsDangling.

@mtrmac
Copy link
Contributor

mtrmac commented Mar 11, 2025

BTW I don’t understand the motivating situation (is there, you know, a reproducer?), I don’t much understand the libimage/manifests code (something to fix on my side, sure) — so the above is only pointing out specific issues, it’s not an endorsement of the approach in general.

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see a more verbose commit message explaining the reasoning. I think the current behavior follows the man pages and docs. I'd love to see tests here and for Podman as well (along with man page changes) if we go forward.

@rhatdan
Copy link
Member Author

rhatdan commented Mar 12, 2025

I planned on adding the tests to podman.

The easiest test is to do pseudo code.

cat "from scratch" > /tmp/Containerfile
id = $(podman build /tmp)
podman manifest create NAME $id
podman image prune --force
podman image exist $id
podman manifest rm NAME
podman image prune --force
podman image exist $id

Last line should fail.

@mtrmac
Copy link
Contributor

mtrmac commented Mar 12, 2025

I think the current behavior follows the man pages and docs.

It might well be the case that we want the “dangling” filter (exposed in the UI) unchanged but podman rm updated. I don’t know. And I also don’t know where that leaves the public IsDangling API, it might go either way, or maybe we should replace it with something better-defined.

@rhatdan rhatdan force-pushed the dangling branch 2 times, most recently from 189032d to 00b7490 Compare March 12, 2025 18:05
Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a quick skim.

@@ -22,6 +23,8 @@ type layerTree struct {
// emptyImages do not have any top-layer so we cannot create a
// *layerNode for them.
emptyImages []*Image

manifestDigests map[digest.Digest]bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document keys/values. And if it is effectively a set, make the values struct{}.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you are after here. This is just a mapping of digests that are used in the manifest lists in containers/storage.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mapping of what, to what?

Yes, you know that. I think I know that, now. Please write that down, to save future readers time so that they can decide just reading the data structure whether this data is relevant to them.

E.g. the keys are not the digests of all manifests of all images in the layer tree. Neither are the keys the digests of all manifest lists that exist in the storage (they are not digests of manifest lists at all ; also, for an ordinary podman pull, we typically store one per-platform manifest, and the parent manifest list, in c/storage, along with the pulled image, and those usually (?) have a top layer).

(And it turns out “to what” has the answer “always exactly true”, so it doesn’t need to be a boolean; a struct{} works just as well, when we are only after the presence of the key in the map.)

@@ -124,6 +128,15 @@ func (r *Runtime) newLayerTreeFromData(images []*Image, layers []storage.Layer)
topLayer := img.TopLayer()
if topLayer == "" {
tree.emptyImages = append(tree.emptyImages, img)
if manifestList, _ := img.IsManifestList(context.TODO()); manifestList {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to do this for every layer tree? Some (e.g. for Image.Parent) are one-time and one-purpose only.

(I don’t know whether this would be an option to layerTree, or a separate data structure parallel to layerTree.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want @vrothberg to comment on this. We can develop a different structure and process it separately or rename this structure to a different name. I read the tree as a place to cache processed data.

Copy link
Member

@vrothberg vrothberg Mar 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment below:

Let's make sure to benchmark on a system with 100,200,400,800 images and if there is a noticeable difference.

I think it makes sense to only compute this data when needed and cache it. Using the layerTree for that purpose looks attractive as it's plumbed to cache/reset data. But the manifest digests could be queried via a function that either computes it fresh or returns the cached map?

@rhatdan rhatdan force-pushed the dangling branch 2 times, most recently from d534dbd to 4cdd485 Compare March 12, 2025 18:20
@rhatdan
Copy link
Member Author

rhatdan commented Mar 12, 2025

@nalind @Luap99 @mheon @mtrmac PTANL

I hopefully took all of your feedback. I will not open a [WIP] PR against podman to use this PR and write tests.

rhatdan added a commit to rhatdan/podman that referenced this pull request Mar 12, 2025
Vendor in latest containers/common

Currently if you create an un tagged image and add it to a manifest
list, podman image prune will remove the image, and leave you with
a broken manifest list. This PR removes the image from dangling in
this situation and prevents the pruning.

We have seen this trigger issues with RamaLama which is creating
manifest lists in this manner, and then users pruning the images.

Verifing: containers/common#2360

Signed-off-by: Daniel J Walsh <[email protected]>
@rhatdan
Copy link
Member Author

rhatdan commented Mar 12, 2025

containers/podman#25562

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit worried about the performance impacts. The tango for computing and operating on the layer tree addressed some serious bottlenecks for image operations.

The manifest-list checks may have a negative impact but I am unable to guess how much.

Let's make sure to benchmark on a system with 100,200,400,800 images and if there is a noticeable difference.

@@ -22,6 +23,8 @@ type layerTree struct {
// emptyImages do not have any top-layer so we cannot create a
// *layerNode for them.
emptyImages []*Image

manifestListDigests map[digest.Digest]bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhatdan, can you add a comment here describing map and how it should be used?

@@ -677,3 +678,12 @@ func (l *list) Instances() []digest.Digest {
}
return instances
}

func (l *list) Digests() []digest.Digest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a comment here as well to make sure callers know which digests are returned.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok Adding comments, but I notice:

// Instances returns the list of image instances mentioned in this list.
func (l *list) Instances() []digest.Digest {
	instances := make([]digest.Digest, 0, len(l.oci.Manifests))
	for _, instance := range l.oci.Manifests {
		instances = append(instances, instance.Digest)
	}
	return instances
}// Digests returns list of all image digests referenced in the manifestList
func (l *list) Digests() []digest.Digest {
	digests := make([]digest.Digest, 0)	for i := range l.docker.Manifests {
		digests = append(digests, l.docker.Manifests[i].Digest)
	}
	return digests
}

I added the second function. Is there a difference between these two? IE a l.oci.Manifests and a l.docker.Manifests. Should my Digests function return both? Or should I just remove my DIgests and use Instances?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nalind pointed out to me that these fields are pretty much the same, so removed Digests and just use instances to get a list of digests.

@@ -124,6 +128,15 @@ func (r *Runtime) newLayerTreeFromData(images []*Image, layers []storage.Layer)
topLayer := img.TopLayer()
if topLayer == "" {
tree.emptyImages = append(tree.emptyImages, img)
if manifestList, _ := img.IsManifestList(ctx); manifestList {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment why we're doing this here.

@rhatdan
Copy link
Member Author

rhatdan commented Mar 13, 2025

@vrothberg added comments

// manifestList keep track of images based on their digest.
// Library will use this map when checking if a image is dangling.
// If an image is used in a manifestList it is NOT dangling
manifestListDigests map[digest.Digest]bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this just be map[digest.Digest]struct{}, @mtrmac comment this above already.
The bool needs an extra allocation AFAIK. With struct{} it is clear that is is a set and the value is ignored. With bool you might can set it to true or false so the meaning is not as obvious.

c/image has https://github.com/containers/image/blob/main/internal/set/set.go to abstract a set type, maybe we should make this public for more users if we are concerned about usability

Copy link
Contributor

@mtrmac mtrmac Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve been betting that the standard library will add a Set type Any Time Now, and that adding non-standard set implementations would add risk that they will appear in the public API and we will have to maintain the set implementation even as the standard library adds a different one.

That bet is looking increasingly dubious over time… maybe something will finally happen in golang/go#69230 .

*shrug* I won’t initiate this, but if someone wanted to add a set type to c/storage, I’d migrate c/image to it.

// When img is a manifest list, cache the lists of
// digests refereenced in manifest list. Digests can
// be used to check for dangling images.
if manifestList, _ := img.IsManifestList(ctx); manifestList {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also if we are concerned about performance impact why are we doing this for every layer tree.
The only caller that would care is ListImages(), but the other callers via newFreshLayerTree() do not need that so why should they pay the cost?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this as a huge cost with the new design, you already read the image into memory and I doubt their is a huge amount of manifestlist in anyone's environment. Not sure if hacking this up for a theoretically performance improvement is worth it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cost of storage operations is all the file-based locking (incl. writes to the lock files); it’s not purely in-memory.

But, also, I haven’t measured anything, and that’s the only answer that counts. (We don’t have a Podman performance regression test, do we?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would have to setup a test with a Huge amount of manifest list images to see any cost, and I am not sure if this is even hitting the disk at all, is this data already being collected in the list of images?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewImageSource[Get]Manifest is definitely doing locking and reading from files. That might be a cache-only operation, “only” costing syscalls…

… but newLayerTreeFromData does not, prior to this PR, read the manifests at all, so locally for this function reading manifests would be truly new I/O.

And then it depends on the caller — the parent/child queries do read manifests of likely images: so, listing parent/child/dangling data about every single image reads all manifests anyway; but, I think, one-image queries like Image.Parent seem to typically only read a very small subset.

@rhatdan rhatdan force-pushed the dangling branch 5 times, most recently from 35af08b to 2242b2e Compare March 18, 2025 13:53
rhatdan added a commit to rhatdan/podman that referenced this pull request Mar 18, 2025
Vendor in latest containers/common

Currently if you create an un tagged image and add it to a manifest
list, podman image prune will remove the image, and leave you with
a broken manifest list. This PR removes the image from dangling in
this situation and prevents the pruning.

We have seen this trigger issues with RamaLama which is creating
manifest lists in this manner, and then users pruning the images.

Verifing: containers/common#2360

Signed-off-by: Daniel J Walsh <[email protected]>
@rhatdan
Copy link
Member Author

rhatdan commented Mar 19, 2025

@Luap99 @vrothberg @mtrmac How about now?

Pulling containers/common into Podman is a HUGE mess right now.

rhatdan added a commit to rhatdan/podman that referenced this pull request Mar 20, 2025
Vendor in latest containers/common

Currently if you create an un tagged image and add it to a manifest
list, podman image prune will remove the image, and leave you with
a broken manifest list. This PR removes the image from dangling in
this situation and prevents the pruning.

We have seen this trigger issues with RamaLama which is creating
manifest lists in this manner, and then users pruning the images.

Verifing: containers/common#2360

Signed-off-by: Daniel J Walsh <[email protected]>
@@ -13,7 +14,7 @@ import (
// Tree generates a tree for the specified image and its layers. Use
// `traverseChildren` to traverse the layers of all children. By default, only
// layers of the image are printed.
func (i *Image) Tree(traverseChildren bool) (string, error) {
func (i *Image) Tree(ctx context.Context, traverseChildren bool) (string, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That breaks the API, we don't care about it in c/common so that is fine but it will need an update in podman.
I think the other breaking changes are fixed with containers/podman#25635 so you shuld be good to vendor that without any other fixes in podman needed.

@@ -634,7 +634,7 @@ func (r *Runtime) ListImages(ctx context.Context, options *ListImagesOptions) ([

var tree *layerTree
if needsLayerTree {
tree, err = r.newLayerTreeFromData(images, snapshot.Layers)
tree, err = r.newLayerTreeFromData(ctx, images, snapshot.Layers, true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only run for options.SetListData which does not seem to be set in RemoveImages() which is called by prune so I don't think the true is correctly populated in that case to fix your issue?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prune calls listimages with a isdangling flag, which ends up calling here.

podman images --filter dangling also calls here.

Currently when we run podman image prune or podman images --filter
dangling

It is pruning images that are in a local manifest. These images are
not dangling because they are currently in use by a named manifest list.

You can create this situation simply by doing

echo "from scratch" > /tmp/Containerfile
id=$(podman build /tmp)
podman manifest create test $id
podman image prune --force
podman image exists $id

Will return an error since the image was pruned.  Now the local manifest
test is broken.

Signed-off-by: Daniel J Walsh <[email protected]>
rhatdan added a commit to rhatdan/podman that referenced this pull request Mar 21, 2025
Vendor in latest containers/common

Currently if you create an un tagged image and add it to a manifest
list, podman image prune will remove the image, and leave you with
a broken manifest list. This PR removes the image from dangling in
this situation and prevents the pruning.

We have seen this trigger issues with RamaLama which is creating
manifest lists in this manner, and then users pruning the images.

Verifing: containers/common#2360

Signed-off-by: Daniel J Walsh <[email protected]>
Copy link
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rhatdan
Copy link
Member Author

rhatdan commented Mar 24, 2025

@vrothberg @mtrmac PTANL

@vrothberg
Copy link
Member

I would love to see some benchmarks. If others feel comfortable merging, please don't block on me.

@rhatdan
Copy link
Member Author

rhatdan commented Mar 25, 2025

I will not be doing any benchmarking. Do not have the time, nor do I believe that this is a common enough case to worry about.

@mtrmac @mheon @nalind @baude @flouthoc PTAL

@flouthoc
Copy link
Collaborator

I saw a prior conversation about perf bottleneck but while reviewing but it seems bool generateManifestDigestList scopes changes only for ListImages(.

so PR LGTM.

I think we can have a separate card to write a benchmark script for various images operations then we can benchmark this again ?

@mtrmac PTAL.

Copy link
Collaborator

@flouthoc flouthoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

openshift-ci bot commented Mar 25, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: flouthoc, Luap99, rhatdan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mtrmac
Copy link
Contributor

mtrmac commented Mar 25, 2025

(I’m afraid I need to move forward with my assigned priorities, this will need to wait a bit if it is blocked on me. But it has 2 LGTMs already, that might be enough.)

@rhatdan rhatdan added the lgtm label Mar 26, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit d55c5a8 into containers:main Mar 26, 2025
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants