Skip to content

Signed URL resumable upload support #640

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 10, 2017

Conversation

evildour
Copy link
Contributor

(It might help to read this section of docs for background)

The main thing needing review is the SignedUrlUploader class (which I'll comment and move to its own file if we decide to stick with it) and the way in which the user must create a the signed URL for the initial POST to kick off the resumable upload. I had a few choices here:

  1. Use the existing Sign methods and have a special http method named "RESUMABLE" that when specified, will set a special header needed for the upload and sign a URL for a POST method request instead. This is the route the gcloud SDK went with (see here). The main drawback here is its not easily discoverable without reading the remarks.
  2. Use the existing Sign methods and let the user specify the POST method and we'll automatically add the special header. This might require breaking changes if we ever want to support other kinds of POST requests in the future (currently no other POST requests are possible with signed URLs though).
  3. Use the existing Sign methods, let the user specify POST, and also require that they specify the special header manually. This seems like extra boilerplate we shouldn't require the users to perform.
  4. Create new methods such as SignResumable perhaps. This would require 4 additional methods though which are mostly identical to the current methods.

So I opted for 1, but I'm open to changing this.

Note that this requires a refactor of ResumableUpload<TRequest>, which I've split into a base ResumableUpload class and derived class. So @chrisdunelm, I've added you in case you want to look at those changes. However, keep in mind the refactor is kind of hacked together for now, but if we agree on this, I'll make the changes to google-api-dotnet-client in a much cleaner way. Once that's merged and published, I'll send out the real PR for resumable upload support.

@evildour evildour added the api: storage Issues related to the Cloud Storage API. label Dec 11, 2016
@evildour
Copy link
Contributor Author

Forgot to mention another issue: right now I have signed URLs returned as strings. But if you use the SignedUrlUploader to obtain a session URI, that's returned as a Uri instance because that's what the base logic in ResumableUpload uses. It seemed a little strange to "upwrap" it to return a string and then wrap it in a Uri again when it is passed back to ResumableUpload. So I just want to get anyone's thoughts on whether it seems ok to leave it like this or if we should be consistent and use all strings or all Uris. I'm leaning to all Uris if we do want to be consistent.

@evildour
Copy link
Contributor Author

(fixed the CI builds)

@jskeet
Copy link
Collaborator

jskeet commented Dec 12, 2016

Just a quick note - I'm not ignoring this; it's on my to-do list. I just have other more urgent things today. Will try to review tomorrow (or later today if it all works out).

@evildour
Copy link
Contributor Author

No rush

@jskeet
Copy link
Collaborator

jskeet commented Dec 13, 2016

Just to check, is the TmpMediaDownloader needed at all? I haven't spotted why it would be, but I could be missing something.

I'm having trouble understanding the ResumableUpload differences at the moment, but I'll have another look at those when I'm more awake.

@evildour
Copy link
Contributor Author

The TmpMediaDownloader was just added to quiet some compiler errors if I recall. As for ResumableUpload, like I said its kind of hacked for now but essentially its split into a base and derived class and there are two virtual members now (overridden in SignedUrlUploader). I don't think it needs a full review yet. That can wait until I implement the refactoring in the base library.

@evildour
Copy link
Contributor Author

Rebased and removed TmpMediaDownloader. It wasn't needed after all.

@jskeet
Copy link
Collaborator

jskeet commented Dec 13, 2016

Great. Will concentrate on just the signing bit then - thanks.

Copy link
Collaborator

@jskeet jskeet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've had a look now, mostly at the SignedUrlUploader part. I think I need a little more context. In particular, I can see two use cases here:

  • My app wants to create a signed upload URL that another app can use to upload content.
  • My app wants to use a signed upload URL provided by another app to upload content.

In each case, the "other app" may or may not be in .NET, using this client library.

With your choices you laid out in the first comment, it's not clear whether this is even feasible, as it sounds like there may not be any standardized way of constructing the signed URL.

The regular resumable uploader goes through a "get a session URI, then use it" process, doesn't it? Is there any reason we need to expose a different process here? I may well be missing something - it feels like we should be able to just introduce a virtual method for "get session URI" and override that (and only that) in SignedUrlUploader. (In an ideal world, the signed URL would contain everything required, and we wouldn't need any extra code for this, possibly after a small change to the ResumableUpload code to allow for extra information.)

It could be worth us having a VC, as I suspect I may be missing several subtleties.

public static async Task<SignedUrlUploader> CreateFromSignedUrlAsync(
string signedUrl,
Stream contentStream,
ConfigurableHttpClient httpClient = null,

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@evildour
Copy link
Contributor Author

(VC might be best if this doesn't clear things up, and sorry for the 2nd wall of text)
Here are my thoughts on things (and it's possible I'm wrong about some or all of this). There are two ways an app with permissions (X) can grant an app without permissions (Y) temporary access to upload a large file:

  1. X creates a signed URL and sends it to Y. Y must perform the initial POST request to get the session URI and then perform PUT requests with that session URI to upload the file (ResumableUpload takes care of the PUTs).
  2. X creates a signed URL and performs the initial POST request to get the session URI, which it then sends to Y. Y then performs PUT requests with that session URI to upload the file. This is what I thought was described in the first paragraph here but as I was just rereading it now, I realized I missed the part about "avoiding...signing URLs" for this scenario. So this is all wrong.

So it looks like 1 is the only case I need to worry about here. However, due to my misunderstanding about 2, I have unintentionally created support for Y's side of things in the scenario described in the help there: X performs initial POST and sends session URI to Y. With this PR's code, Y can use SignedUrlUploader.CreateFromSessionUri. However, now it is clear than SignedUrlUploader is no longer an appropriate name for this class. Also, we may want to expose support for X's side of things in this scenario as well, but that's for a different PR.

As for your other points:

  • Yes, there is a standard way of forming the signed URLs. If a python app X creates a signed URL and sends it to a .NET app Y, it can use SignedUrlUploader.CreateFromSignedUrlAsync with it.
  • This PR does introduce a virtual to "get session URI" and override it: SignedUrlUploader.InitializeUpload. Although it should probably be renamed to GetSessionUriAsync to be much clearer.

@evildour
Copy link
Contributor Author

@jskeet PTAL. After offline discussion with you and Nathan, it looks like these are the scenarios to be supported. They involve authenticated app X communicating with unauthenticated app Y (possibly via Xc, an unauthenticated client of X). S is the signed URL to which a POST request can be made to initiate a resumable upload and U is the session URI for the upload session.

  1. Scenario 1
    a. X creates S using UrlSigner.Sign and sends it to Y.
    b. Y receives S and uses ResumableUpload.CreateFromSignedUrl to get a ResumableUpload instance. It then uses UploadAsync on that instance which does the following: POST to S and get U then perform PUT requests to U to upload the file.
  2. Scenario 2
    a. X initiates resumable upload by performing POST (not using signed URL) and sends the resulting U to Y. (This is the only part not handled by this PR)
    b. Y receives U and uses ResumableUpload.CreateFromSessionUri to get a ResumableUpload instance. It then uses UploadAsync on that instance which performs PUT requests to U to upload the file.
  3. Scenario 3
    a. X creates S using UrlSigner.Sign (same as 1a) and sends it to Xc.
    b. Xc receives S and uses ResumableUpload.GetSessionUriAsync to POST to S and get U, which it then sends to Y.
    c. Y receives U and uses ResumableUpload.CreateFromSessionUri to get a ResumableUpload instance. It then uses UploadAsync on that instance which performs PUT requests to U to upload the file (same as 2b).

Also, I've moved everything pertinent to the review to the top of ResumableUpload. Open questions that still remain: should we use the RESUMABLE HTTP method when creating the signed URL and should we use all strings or all Uri instances?

@jskeet
Copy link
Collaborator

jskeet commented Dec 20, 2016

@evildour: I'm on vacation now until early January - I'll see what I can do anyway, but it may be in bits and pieces.

@evildour
Copy link
Contributor Author

Not a problem. In that case, I may start adding comments to the new APIs at some point.

Copy link
Collaborator

@jskeet jskeet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies if I'm being dense again - I'm hoping we can simplify this further.


private ResumableUploadSessionOptions Options { get; }

private string SignedUrl { get; set; }

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@jskeet
Copy link
Collaborator

jskeet commented Jan 4, 2017

Oh, and your other questions:

  • It sounds like we need to use RESUMABLE when creating the signed URL, so that it'll be compatible with Python. Or at least, this way presumably we can create the same URL as in Python, which is probably a good thing.
  • I'm not overly worried about string vs Uri - it may well become more obvious when we've got the final shape of the API.

@evildour
Copy link
Contributor Author

evildour commented Jan 6, 2017

For the RESUMABLE method, that never goes into the signed URL. The URL is created as if you had used the POST method. "RESUMABLE" is just a sentinel value meaning 'please create a signed URL with the POST method and the x-goog-resumable header value of "start" (and all other info specified by the caller)'. I don't think there would be any conflict if we went with something else. The only issue would probably be potential user confusion.

And I agree about string vs. Uri - let's wait until everything else is clearer.

@jskeet
Copy link
Collaborator

jskeet commented Jan 6, 2017

I'm now away from work for a week - will see if I can find 15 minutes to try to get my head back into this. We'll get there :)

@evildour
Copy link
Contributor Author

evildour commented Jan 6, 2017

Not a problem. I'm a bit swamped with some other stuff anyway, so I probably won't have much time for this either until later next week or the week after.

@jskeet
Copy link
Collaborator

jskeet commented Jan 17, 2017

Okay, I think I now understand what's going on, and it's looking good. The tricky bit is understanding how much the existing ResumableUpload is changing due to this. Obviously we've gone back and forth via quite huge changes - is it now reasonably simple? I can see the addition of ResumableUploadOptions, and is the rest about how the session URI is managed? I suspect so.

I think this is now the form we want... I'm sufficiently convinced that it would worth creating a PR on google-api-dotnet-client to make that diff clear, if that's feasible. Happy to have a VC to iron out any lingering doubts though.

@evildour
Copy link
Contributor Author

Yes, as far as what's changing in google-api-dotnet-client, its the addition of ResumableUploadOptions and the non-generic ResumableUpload and the rest is about initiating the session and managing the session URI.

I'll try to put together a PR for google-api-dotnet-client later today or tomorrow so we can get this in. I'll also leave this PR open for now just in case that process reveals something we haven't thought of.

Thanks!

@jskeet
Copy link
Collaborator

jskeet commented Feb 10, 2017

Now that there's been a new storage release, presumably this can be modified so it's releasable?

@evildour
Copy link
Contributor Author

Yes. About to push an update. Although I got a little confused with the visibility of MediaApiErrorHandling, forgetting that it was internal to Google.Apis. So I had to hack things a bit in the static SignedUrlResumableUpload.InitiateSessionAsync by creating a temporary instance to initiate the session. I think it should be fine, but let me know if you feel different.

@evildour evildour changed the title NOT FOR SUBMISSION Signed URL resumable upload support Signed URL resumable upload support Feb 10, 2017
@evildour
Copy link
Contributor Author

This is ready for submission now. PTAL.

Copy link
Collaborator

@jskeet jskeet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments, but generally I think we're there. Hooray :)

name,
duration,
UrlSigner.ResumableHttpMethod,
requestHeaders: new Dictionary<string, IEnumerable<string>> {

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

return uploader.InitiateSessionAsync(cancellationToken);
}

private static T GetResult<T>(Func<CancellationToken, Task<T>> operation)

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

{
requestMethod = HttpMethod.Get;
}
else if (requestMethod.Equals(ResumableHttpMethod))

This comment was marked as spam.

This comment was marked as spam.

ResumableUploadOptions options = null,
CancellationToken cancellationToken = default(CancellationToken))
{
var uploader = new SignedUrlResumableUpload(signedUrl, new MemoryStream(), options);

This comment was marked as spam.

This comment was marked as spam.

@evildour
Copy link
Contributor Author

Thanks Jon! Will merge when green, unless you have additional comments.

@evildour evildour merged commit bf6c299 into googleapis:master Feb 10, 2017
@evildour evildour deleted the resumable-uploads branch February 10, 2017 18:06
@jskeet
Copy link
Collaborator

jskeet commented Feb 10, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the Cloud Storage API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants