-
Notifications
You must be signed in to change notification settings - Fork 149
feat: support volume cloning #1278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support volume cloning #1278
Conversation
Hi @umagnus. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
8071cef
to
24d8940
Compare
/retest |
pkg/azurefile/azurefile.go
Outdated
klog.V(2).Infof("copy fileshare %s to %s", srcFileShareName, dstFileShareName) | ||
out, copyErr = exec.Command("azcopy", "copy", srcPath, dstPath, "--recursive", "--check-length=false").CombinedOutput() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a note to docs that it needs azcopy and thus physically copies data? Maybe with a small table about how fast it could be - if I have a volume with 1TB of data (in say 100 MB files), how long will CreateVolume
take?
pkg/azurefile/controllerserver.go
Outdated
if err := d.accountSearchCache.Delete(lockKey); err != nil { | ||
return nil, status.Errorf(codes.Internal, err.Error()) | ||
if req.GetVolumeContentSource() != nil { | ||
if err := d.copyVolume(ctx, req, accountKey, shareOptions, storageEndpointSuffix, secret); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be blind, where is the destination share actually created? And how do you make sure it's deleted when azcopy
permanently fails and you want to return a final error to the external-provisioner? Because the external-provisioner may not call CreateVolume again if user has deleted their PVC, as the volume is not needed any longer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The destination share is created in func CopyFileShare
in azurefile.go
by azcopy
. If azcopy
fails, it will return error in func copyVolume
and func CreateVolume
will return a final error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, but where is the share deleted after failed CopyFileShare
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the new file share would be created by azcopy
, and if azcopy returns failure, it does not makes sense to delete the file share explicitly by the csi driver since the csi driver would retry volume cloning after the failure within a period of time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That means that the driver leaks a volume when user deletes corresponding pvc. I understand the interface between the provisioner and the driver does not allow much options here, perhaps we can enhance the protocol if you have a smart idea.
pkg/azurefileplugin/Dockerfile
Outdated
@@ -21,6 +21,11 @@ COPY ${binary} /azurefileplugin | |||
RUN apk upgrade --available --no-cache && \ | |||
apk add --no-cache ca-certificates cifs-utils util-linux e2fsprogs-extra udev xfsprogs-extra nfs-utils | |||
|
|||
RUN apk add gcompat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment for this package install
pkg/azurefileplugin/Dockerfile
Outdated
@@ -21,6 +21,11 @@ COPY ${binary} /azurefileplugin | |||
RUN apk upgrade --available --no-cache && \ | |||
apk add --no-cache ca-certificates cifs-utils util-linux e2fsprogs-extra udev xfsprogs-extra nfs-utils | |||
|
|||
RUN apk add gcompat | |||
RUN wget -c https://azcopyvnext.azureedge.net/release20230420/azcopy_linux_amd64_10.18.1.tar.gz && tar -zxvf azcopy_linux_amd64_10.18.1.tar.gz && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there an arm64 support for this binary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is not supported because azcopy
has another arm64 preview binary (https://aka.ms/downloadazcopy-v10-linux-arm64).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then copy arm64 binary if it's arm64 image
e883746
to
e132808
Compare
a99867c
to
ba0b890
Compare
b75bb56
to
7e278a7
Compare
/retest |
1 similar comment
/retest |
d850744
to
2e666d4
Compare
6003e6c
to
2454d9e
Compare
d8c53c8
to
6829f5d
Compare
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in L526, it's
mc := metrics.NewMetricContext(azureFileCSIDriverName, "controller_create_volume", d.cloud.ResourceGroup, subsID, d.Name)
let's make it separate, for volume cloning, use controller_create_volume_from_volume
, and for snapshot restore, use controller_create_volume_from_snapshot
in the future.
pkg/azurefile/controllerserver.go
Outdated
// logging the job status if it's volume cloning | ||
if req.GetVolumeContentSource() != nil { | ||
jobState, percent, err := getAzcopyJob(volName) | ||
klog.V(2).Infof("azcopy job status: %s, copy percent: %s%, error: %v", jobState, percent, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy percent: %.1f%
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
percent
is a string parsed by azcopy and it is already %1.f format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use copy percent: %s%%
to fix print nil error
pkg/azurefile/azurefile.go
Outdated
select { | ||
case <-timeTick: | ||
jobState, percent, err := getAzcopyJob(dstFileShareName) | ||
klog.V(2).Infof("azcopy job status: %s, copy percent: %s%, error: %v", jobState, percent, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy percent: %.1f%
pkg/azurefile/azurefile.go
Outdated
dstPath := fmt.Sprintf("https://%s.file.%s/%s%s", accountName, storageEndpointSuffix, dstFileShareName, accountSasToken) | ||
|
||
jobState, percent, err := getAzcopyJob(dstFileShareName) | ||
klog.V(2).Infof("azcopy job status: %s, copy percent: %s%, error: %v", jobState, percent, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy percent: %.1f%
c4bfab7
to
20188a7
Compare
20188a7
to
b71e088
Compare
/retest |
2 similar comments
/retest |
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andyzhangx, umagnus The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cherrypick release-1.29 |
@andyzhangx: new pull request created: #1462 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cherrypick release-1.28 |
@andyzhangx: #1278 failed to apply on top of branch "release-1.28":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
What type of PR is this?
/kind feature
What this PR does / why we need it:
feat: support volume cloning in Azure File CSI driver
Which issue(s) this PR fixes:
Fixes #
Requirements:
Special notes for your reviewer:
The Alpine distro is based on a libc variant called musl-libc.
azcopy
is compiled against glibc, so it will getno such file or directory: unknown
error. To resolve this issue, I add glibc to alpine distro by usingapk add gcompat
.Reference:
csi copy percent logs:
azcopy copy speed
8 connections, cpu and memory used:
POD NAME CPU(cores) MEMORY(bytes)
csi-azurefile-controller-5675f96dc9-rs4zh azurefile 60m 66Mi
cpu: around 50-70, memory: around 60-70
Release note: