What happened:
I do not know exactly how it happened, but I got into a situation where the snapshot controller was processing a VolumeGroupSnapshot and the corresponding VolumeSnapshot already existed. The controller reacts to 409 Already Exists here: external-snapshotter/pkg/common-controller/groupsnapshot_controller_helper.go, lines 626 to 630 in 59d7297.
You can see that the code continues processing when the snapshot already exists. However, the createdVolumeSnapshot variable then has undefined content (in my case it points to an object with an empty namespace and name), and later on, when the controller tries to update it, Patch() fails here: external-snapshotter/pkg/common-controller/groupsnapshot_controller_helper.go, line 649 in 59d7297.
Error: E0310 21:18:31.250917 1 groupsnapshot_controller_helper.go:257] could not sync group snapshot "e2e-volumegroupsnapshottable-8755/group-snapshot-z6w49": createSnapshotsForGroupSnapshotContent: binding volumesnapshot to volumesnapshotcontent resource name may not be empty
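The failure mode can be reproduced in miniature. The sketch below is not the external-snapshotter code: `Snapshot`, `fakeClient`, and `bindIgnoringAlreadyExists` are hypothetical names, and the fake client only assumes that a failed Create() hands back an empty, non-nil object together with the error, which matches the empty namespace / name observed above.

```go
package main

import (
	"errors"
	"fmt"
)

// Snapshot is a hypothetical stand-in for a VolumeSnapshot object.
type Snapshot struct {
	Namespace, Name string
}

var errAlreadyExists = errors.New("already exists")

// fakeClient is a hypothetical in-memory stand-in for an API client.
type fakeClient struct {
	store map[string]Snapshot
}

func key(s Snapshot) string { return s.Namespace + "/" + s.Name }

// Create assumes the client returns an empty object alongside the error.
func (c *fakeClient) Create(s Snapshot) (*Snapshot, error) {
	if _, ok := c.store[key(s)]; ok {
		return &Snapshot{}, errAlreadyExists
	}
	c.store[key(s)] = s
	out := s
	return &out, nil
}

// Patch rejects objects that carry no name.
func (c *fakeClient) Patch(s *Snapshot) error {
	if s.Name == "" {
		return errors.New("resource name may not be empty")
	}
	return nil
}

// bindIgnoringAlreadyExists mirrors the problematic flow: it tolerates
// "already exists" but keeps using the object returned by Create.
func bindIgnoringAlreadyExists(c *fakeClient, want Snapshot) error {
	created, err := c.Create(want)
	if err != nil && !errors.Is(err, errAlreadyExists) {
		return err
	}
	return c.Patch(created)
}

func main() {
	c := &fakeClient{store: map[string]Snapshot{}}
	want := Snapshot{Namespace: "ns", Name: "snap-1"}
	fmt.Println(bindIgnoringAlreadyExists(c, want)) // object is new
	fmt.Println(bindIgnoringAlreadyExists(c, want)) // object already exists
}
```

In this sketch the first call succeeds and the second one fails in Patch(), because the object returned next to the "already exists" error has no name.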
What you expected to happen:
The controller should perhaps Get() the VolumeSnapshot from its informer before creating it. If it already exists at creation time, the controller should fail the sync and expect the VolumeSnapshot to appear in the informer on the next retry.
All IsAlreadyExists checks should then be handled in a similar way, not just the one for VolumeSnapshot.
How to reproduce it:
I don't know. It happened only once in OpenShift e2e tests. I am stress-testing the test, 1024 runs so far, 0 failures.
Logs from OpenShift e2e tests:
snapshot-controller (look for binding volumesnapshot to volumesnapshotcontent resource name may not be empty)
e2e test (look for VolumeSnapshot group-snapshot-z6w49 found but is not ready)
As you can see, it would be helpful to log the names of created VolumeSnapshots, VolumeSnapshotContents, and VolumeGroupSnapshotContents. It is quite hard to map which VolumeSnapshot failed to be patched and what the corresponding VGSC is.
Anything else we need to know?:
Environment:
Driver version: csi-driver-hostpath + e2e tests as in Kubernetes 1.32
Kubernetes version (use kubectl version): 1.32-ish