Skip to content

Accommodate VGS workflows in PVC CSI plugin #9019

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

shubham-pampattiwar
Copy link
Collaborator

Thank you for contributing to Velero!

Please add a summary of your change

PR implements support for VolumeGroupSnapshot (VGS) in the CSI PVC plugin with handling for three scenarios:

  • PVC with VGS label and no existing VS: selects appropriate VGSClass, creates VGS, waits for VS status update, labels and cleans up VS, patches VGSC to Retain, deletes VGS and VGSC.
  • PVC with VGS label and existing VS: reuses VS if backup labels and VGS status match.
  • PVC without VGS label: follows legacy individual VS workflow.

Applies common branching for datamover and non-datamover workflows based on the resolved VS. Adds unit tests for VGS-based logic

Does your change fix a particular issue?

Fixes #8936

Related to sub-task of #8865

Please indicate you've done the following:

Copy link

codecov bot commented Jun 12, 2025

Codecov Report

Attention: Patch coverage is 61.85286% with 140 lines in your changes missing coverage. Please review.

Project coverage is 60.07%. Comparing base (5b29a87) to head (ab34939).

Files with missing lines Patch % Lines
pkg/backup/actions/csi/pvc_action.go 63.58% 108 Missing and 22 partials ⚠️
pkg/client/factory.go 0.00% 4 Missing and 2 partials ⚠️
pkg/cmd/server/server.go 0.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9019      +/-   ##
==========================================
+ Coverage   60.05%   60.07%   +0.02%     
==========================================
  Files         378      378              
  Lines       42883    43249     +366     
==========================================
+ Hits        25754    25983     +229     
- Misses      15582    15696     +114     
- Partials     1547     1570      +23     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@shubham-pampattiwar shubham-pampattiwar force-pushed the vgs-task-3 branch 3 times, most recently from 65c1223 to c388581 Compare June 16, 2025 21:02
@github-actions github-actions bot added the Dependencies Pull requests that update a dependency file label Jun 16, 2025
@shubham-pampattiwar shubham-pampattiwar force-pushed the vgs-task-3 branch 2 times, most recently from 88bbb2f to 7499884 Compare June 16, 2025 21:24
@shubham-pampattiwar shubham-pampattiwar force-pushed the vgs-task-3 branch 5 times, most recently from 1d83920 to b2e5096 Compare June 27, 2025 19:56
@shubham-pampattiwar shubham-pampattiwar marked this pull request as ready for review June 27, 2025 20:50
@kaovilai
Copy link
Collaborator

kaovilai commented Jul 1, 2025

👁️

kaovilai
kaovilai previously approved these changes Jul 2, 2025
Copy link
Collaborator

@kaovilai kaovilai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VGS (VolumeGroupSnapshot) Workflow Diagram

This diagram illustrates the VGS workflow implementation in the PVC CSI plugin (pkg/backup/actions/csi/pvc_action.go).

flowchart TB
    Start([PVC Backup Item Action Execute]) --> ValidateBackup{Validate Backup Phase}
    
    ValidateBackup -->|Invalid Phase| End1[Return without modification]
    ValidateBackup -->|Valid| ValidatePVC{Validate PVC and PV}
    
    ValidatePVC -->|No Storage Class| Error1[Error: No Storage Class]
    ValidatePVC -->|Not CSI Volume| SkipPVC[Mark as Skipped<br/>Add SkippedNoCSIPVAnnotation]
    ValidatePVC -->|Valid CSI Volume| CheckSnapshot{Should Perform Snapshot?}
    
    SkipPVC --> End1
    Error1 --> End1
    
    CheckSnapshot -->|No| End1
    CheckSnapshot -->|Yes| GetVSRef[getVolumeSnapshotReference]
    
    GetVSRef --> CheckVGSLabel{Check VGS Label<br/>backup.Spec.VolumeGroupSnapshotLabelKey}
    
    CheckVGSLabel -->|No Label or Empty| LegacyPath[Legacy: createVolumeSnapshot]
    CheckVGSLabel -->|Has Label with Group Value| VGSPath[VGS Workflow Path]
    
    %% Legacy Path
    LegacyPath --> GetStorageClass[Get Storage Class]
    GetStorageClass --> GetVSClass[Get VolumeSnapshotClass]
    GetVSClass --> CreateVS[Create Individual VolumeSnapshot]
    CreateVS --> VSCreated[VS Created]
    
    %% VGS Path
    VGSPath --> FindExistingVS{Find Existing VS<br/>for Current Backup}
    
    FindExistingVS -->|Found with VGS Name| ReuseVS[Reuse Existing VS]
    FindExistingVS -->|Not Found| CreateVGSFlow[Start VGS Creation Flow]
    
    CreateVGSFlow --> ListGroupedPVCs[List All PVCs<br/>with Same Group Label]
    ListGroupedPVCs --> DetermineDriver{Determine CSI Driver<br/>for Grouped PVCs}
    
    DetermineDriver -->|Multiple Drivers| Error2[Error: Multiple CSI Drivers]
    DetermineDriver -->|Single Driver| DetermineVGSClass[Determine VGS Class<br/>1. PVC Annotation<br/>2. Backup Annotation<br/>3. Default Label]
    
    Error2 --> End1
    
    DetermineVGSClass --> CreateVGS[Create VolumeGroupSnapshot<br/>with Label Selector]
    CreateVGS --> WaitVSStatus[Wait for VS Objects<br/>to have Status and VGS Name]
    
    WaitVSStatus --> UpdateVS[Update VS Objects:<br/>- Remove VGS Owner Ref<br/>- Remove Finalizers<br/>- Add Backup Labels]
    
    UpdateVS --> WaitVGSC[Wait for VGSC Binding<br/>in VGS Status]
    
    WaitVGSC --> PatchVGSC[Patch VGSC<br/>DeletionPolicy to Retain]
    
    PatchVGSC --> DeleteResources[Delete VGS and VGSC]
    
    DeleteResources --> GetVSForPVC[Get VS for Current PVC<br/>Match by:<br/>- PVC Name<br/>- VGS Name<br/>- Backup Labels]
    
    GetVSForPVC --> VSReady[VS Ready for Use]
    ReuseVS --> VSReady
    VSCreated --> VSReady
    
    VSReady --> CheckDataMover{Check SnapshotMoveData}
    
    CheckDataMover -->|True| DataMoverPath[Data Mover Workflow]
    CheckDataMover -->|False| StandardPath[Standard Backup Path]
    
    DataMoverPath --> WaitVSCHandle[Wait for VSC Handle]
    WaitVSCHandle --> CreateDataUpload[Create DataUpload CR]
    CreateDataUpload -->|Success| AddDataUploadAnnotation[Add DataUpload Annotation]
    CreateDataUpload -->|Failure| CleanupVS[Cleanup VS and Log Error]
    
    CleanupVS --> End1
    AddDataUploadAnnotation --> UpdatePVC[Update PVC with Labels/Annotations]
    
    StandardPath --> AddVSToBackup[Add VS to Additional Items]
    AddVSToBackup --> UpdatePVC
    
    UpdatePVC --> ReturnResult[Return Updated PVC<br/>with Additional Items]
    ReturnResult --> End2([End])

    %% Style classes
    classDef errorNode fill:#ff6b6b,stroke:#c92a2a,color:#fff
    classDef successNode fill:#51cf66,stroke:#2f9e44,color:#fff
    classDef decisionNode fill:#ffd43b,stroke:#fab005,color:#000
    classDef processNode fill:#74c0fc,stroke:#339af0,color:#000
    classDef vgsNode fill:#ff8cc8,stroke:#f06595,color:#000
    
    class Error1,Error2,CleanupVS errorNode
    class VSCreated,VSReady,ReturnResult successNode
    class ValidateBackup,ValidatePVC,CheckSnapshot,CheckVGSLabel,FindExistingVS,DetermineDriver,CheckDataMover decisionNode
    class GetVSRef,LegacyPath,VGSPath,CreateVGSFlow,ListGroupedPVCs,DetermineVGSClass,CreateVGS,WaitVSStatus,UpdateVS,WaitVGSC,PatchVGSC,DeleteResources,GetVSForPVC processNode
    class CreateVGS,WaitVSStatus,UpdateVS,WaitVGSC,PatchVGSC,DeleteResources vgsNode
Loading

Key Components and Flow

1. Main Workflow Branches

  • VGS Path: When PVC has VGS label (defined by backup.Spec.VolumeGroupSnapshotLabelKey)
  • Legacy Path: When PVC doesn't have VGS label - creates individual VolumeSnapshot

2. VGS Creation Process

  1. Check for Existing VS: First checks if a VS already exists for this PVC in the current backup
  2. Group PVC Discovery: Lists all PVCs with the same group label
  3. Driver Validation: Ensures all grouped PVCs use the same CSI driver
  4. VGS Class Selection: Determines appropriate VolumeGroupSnapshotClass using hierarchy:
    • PVC-level annotation override
    • Backup-level annotation override
    • Default class with label
  5. VGS Creation: Creates VolumeGroupSnapshot with label selector
  6. VS Status Wait: Waits for all VS objects to report VGS name in status
  7. VS Updates: Removes owner references, finalizers, adds backup labels
  8. VGSC Binding: Waits for VolumeGroupSnapshotContent binding
  9. Retention Policy: Patches VGSC to Retain deletion policy
  10. Cleanup: Deletes VGS and VGSC (VS objects remain)

3. Decision Points and Error Handling

  • Multiple CSI Drivers: Fails if grouped PVCs use different drivers
  • Missing VGS Class: Fails if no appropriate VGS class found
  • Timeout Handling: Uses backup's CSI snapshot timeout for all wait operations
  • VS Not Found: Errors if VS can't be found after VGS creation

4. Resource Relationships

  • VGS (VolumeGroupSnapshot): Groups multiple PVCs for atomic snapshot
  • VGSC (VolumeGroupSnapshotContent): Backing content for VGS
  • VS (VolumeSnapshot): Individual snapshots created by VGS
  • VSC (VolumeSnapshotContent): Backing content for each VS

5. Key Features

  • Idempotency: Reuses existing VS if found for current backup
  • Atomic Operations: All grouped PVCs snapshot together
  • Resource Cleanup: VGS/VGSC deleted after VS creation
  • Retention: VGSC patched to Retain before deletion
  • Label Propagation: Backup labels added to VS objects

@shubham-pampattiwar
Copy link
Collaborator Author

@Lyndon-Li @blackpiglet Updated the PR to address the review comments and feedback. PTAL, Thanks !

@blackpiglet
Copy link
Contributor

blackpiglet commented Jul 15, 2025

@shubham-pampattiwar
Thanks for the modification. The PR looks good to me.
Just a few minor comments.

@Lyndon-Li was on vacation this week.
Could we wait for his response before merging this PR?

Signed-off-by: Shubham Pampattiwar <[email protected]>

Add changelog file

Signed-off-by: Shubham Pampattiwar <[email protected]>

make update

Signed-off-by: Shubham Pampattiwar <[email protected]>

lint fix

Signed-off-by: Shubham Pampattiwar <[email protected]>

add unit tests for getVSForPVC func

Signed-off-by: Shubham Pampattiwar <[email protected]>

Use v1beta1 instead of v1 v1alpha1

Signed-off-by: Shubham Pampattiwar <[email protected]>

go mod tidy

Signed-off-by: Shubham Pampattiwar <[email protected]>

update updateVGSCreatedVS func to use retry on conflict

Signed-off-by: Shubham Pampattiwar <[email protected]>

make update minor fix

Signed-off-by: Shubham Pampattiwar <[email protected]>

fix ut assert

Signed-off-by: Shubham Pampattiwar <[email protected]>

Address PR feedback

Signed-off-by: Shubham Pampattiwar <[email protected]>

minor updates

Signed-off-by: Shubham Pampattiwar <[email protected]>
@shubham-pampattiwar
Copy link
Collaborator Author

@blackpiglet Thanks for the review. I have updated the PR addressing your latest feedback. PTAL !
Yes we can wait for @Lyndon-Li's review.

// Finalizer constants
const (
VolumeSnapshotFinalizerGroupProtection = "snapshot.storage.kubernetes.io/volumesnapshot-in-group-protection"
VolumeSnapshotFinalizerSourceProtection = "snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/kubernetes-csi/external-snapshotter/blob/6b2feaaf7fc3d6d0fe21029d1032ea0eee2081e9/pkg/utils/util.go#L86

Please note that we can refer to the external-snapshotter's definition instead of declaring it here again.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That util package isn’t published in the Go module path, so importing it directly isn’t possible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~/experiments
❯ go build test.go 

~/experiments
❯ go run . 
Hello, snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection

~/experiments
❯ cat test.go 
package main

import (
        "fmt"

        "github.com/kubernetes-csi/external-snapshotter/v8/pkg/utils"
)

func main() {
        fmt.Printf("Hello, %s\n", utils.VolumeSnapshotAsSourceFinalizer)
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dependencies Pull requests that update a dependency file has-changelog has-unit-tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Updates to CSI PVC plugin
4 participants