Skip to content

[repository/downloader] Add support for multiple header values #16260

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

Yannic
Copy link
Contributor

@Yannic Yannic commented Sep 13, 2022

Headers can generally be specified multiple times, not just once.

This refactoring is prerequisite for adding support for getting credentials from the credential helper.

Note that this does not change the Starlark API or the qualifier for the gRPC remote downloader (both of which need to go through the process for incompatible changes - which should hopefully be pretty straight forward to do as this feature doesn't seem to be used that much AFAICT).

Progress on #15856

Headers can generally be specified multiple times, not just once.

This refactoring is prerequisite for adding support for getting
credentials from the credential helper.

Note that this does not change the Starlark API or the qualifier
for the gRPC remote downloader (both of which need to go through the
process for incompatible changes - which should hopefully be pretty
straight forward to do as this feature doesn't seem to be used
that much AFAICT).

Progress on bazelbuild#15856
@Yannic Yannic requested a review from a team as a code owner September 13, 2022 10:16
@Yannic
Copy link
Contributor Author

Yannic commented Sep 13, 2022

@tjgq PTAL

@sgowroji sgowroji added team-Remote-Exec Issues and PRs for the Execution (Remote) team awaiting-review PR is awaiting review from an assigned reviewer labels Sep 13, 2022
Comment on lines +207 to +211
// TODO(yannic): Introduce incompatible flag for including all headers, not just the first.
String value = Iterables.getFirst(subEntry.getValue(), null);
if (value != null) {
subObject.addProperty(subEntry.getKey(), value);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious to me why this is needed. In StarlarkBaseExternalContext#getAuthHeaders, the map is populated from a Starlark dict, which by construction cannot contain multiple values for the same key. Am I missing some other code path where the map could be populated with multiple values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The credential helper returns a map<string, list<string>> for headers, so we need this change for adding the headers from the helper. We could implement it in a way that only passes the first (or last) value for the header to the remote server, but that doesn't seem like a good idea IMO.

We can allow setting multiple headers from Starlark by allowing Dict<String, Sequence<String>> in addition to Dict<String, String>, but that would be a follow-up.

Copy link
Contributor

@tjgq tjgq Sep 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that this change is necessary. What I don't understand is why GrpcRemoteDownloader#addHeadersJson must only call addProperty on the first header value for a given key, and ignore the rest.

  • If you're not using a credential helper, headers should come solely from the Starlark dict (unless I'm missing something), therefore there won't be more than one value per key.
  • If you are using a credential helper, adding multiple values per key doesn't have to be considered a breaking change, since credential helpers are a new feature (again, unless I'm missing something).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSON is passed as Qualifier in the gRPC download request. The server may or may not look at the JSON (it's not a well-specified part of the assets API), so switching to multiple values here has potential to break things.

We could make the JSON dependent on where a header comes from. But we probably want to merge the headers from Starlark and the credential helper (at least that's my understanding on how the 2 features work together?), and sending a different qualifier based on where credentials come from seems very confusing. That said, with my security hat on, I'm not sure the headers should even be part of the qualifier (esp. in plaintext). The digest of the headers should be sufficient, if sending anything at all.

Copy link
Contributor

@tjgq tjgq Sep 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSON is passed as Qualifier in the gRPC download request. The server may or may not look at the JSON (it's not a well-specified part of the assets API), so switching to multiple values here has potential to break things.

But the breakage could only possibly happen if you're already setting --experimental_credential_helper, right? I would argue that this falls under "APIs and behavior guarded by an --experimental_* flag can change at any time" (https://bazel.build/release/backward-compatibility) and an --incompatible flag might not be necessary.

We could make the JSON dependent on where a header comes from. But we probably want to merge the headers from Starlark and the credential helper (at least that's my understanding on how the 2 features work together?), and sending a different qualifier based on where credentials come from seems very confusing.

I agree that the qualifier should not depend on the provenance of the headers. Although something to ponder is what would happen if the Authentication header is set to the same value from both the Starlark and credential helper sides. Would that cause the header to be sent twice, and if so, would that still work?

That said, with my security hat on, I'm not sure the headers should even be part of the qualifier (esp. in plaintext). The digest of the headers should be sufficient, if sending anything at all.

It might be worth raising this issue with the Remote Execution API working group (but we don't have to block on it).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I think it will be confusing no matter what we do. What's the expected behavior if you set both --experimental_credential_helper and the Starlark attribute, but don't set the incompatible flag? Forwarding only one of the headers is guaranteed to be a surprise to someone, even if we make a deterministic choice.

That being said, I won't be opposed to adding the incompatible flag if you so insist. The PR otherwise LGTM (but I'd like @Wyverald to chip in as well, in case there's some subtlety around repository fetching that I don't fully understand).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the incompatible flag. Note that I plan to also send a map<string, list<string>> if the incompatible flag is set even if there's no helper involved (in which case we are consistent and unsurprising eventually).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To confirm, we don't need an incompatible flag for the repo fetching case, right? The Starlark API can just accept both types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that would work (although, haven't tried).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I assume you're talking about repository_ctx.download(auth={...}) here)

@tjgq
Copy link
Contributor

tjgq commented Sep 13, 2022

@Wyverald Could you please also take a look at this PR, in particular the bits that affect repository fetching?

Copy link
Member

@Wyverald Wyverald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for the repo fetching part.

aiuto pushed a commit to aiuto/bazel that referenced this pull request Oct 12, 2022
Headers can generally be specified multiple times, not just once.

This refactoring is prerequisite for adding support for getting credentials from the credential helper.

Note that this does not change the Starlark API or the qualifier for the gRPC remote downloader (both of which need to go through the process for incompatible changes - which should hopefully be pretty straight forward to do as this feature doesn't seem to be used that much AFAICT).

Progress on bazelbuild#15856

Closes bazelbuild#16260.

PiperOrigin-RevId: 474770302
Change-Id: I74620e36481414a41108991bc61321d104a6d39a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-review PR is awaiting review from an assigned reviewer team-Remote-Exec Issues and PRs for the Execution (Remote) team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants