hyper-util: Expose attempted SocketAddr in failed request #3888
I'm working in a scenario where physical load balancers do not exist and are not an option. As a result, I need to implement a client-side load balancing strategy. The necessary hooks and extensions exist to control how DNS is queried and how candidate endpoints are provided for selection, but I also need to know what the candidate endpoint was when a connection fails. I require this information so that I can mark down and weigh away from that endpoint in the candidate endpoint selection logic. I don't think all of the candidate endpoints need to be included in the error. The final candidate that propagates the error should be sufficient. In my case, I ensure the candidate list is always a sequence of exactly one endpoint (which makes correlation simple).
Interesting. I'll just include here that currently, the … Unless you're specifically interested in the period between a …
How do you do this? Through DNS? Some custom connector? Would that not mean you can already correlate the failed address in that case?
Unfortunately, I don't have something OSS that I can share, but let me add some more context that will hopefully paint the picture. Consider the following snippets and pseudocode:

```rust
use std::net::SocketAddr;
use std::sync::{Arc, Mutex};
use std::time::Instant;

struct Endpoint {
    address: SocketAddr,
    weight: usize,
    last_computed: Instant,
}

impl Endpoint {
    fn fail(&mut self) {
        self.weight = 0;
        self.last_computed = Instant::now();
    }

    fn weigh(&mut self) { /* re-weigh endpoint toward eventual recovery */ }
}

#[derive(Clone)]
struct State(Arc<Mutex<Vec<Endpoint>>>);
```

The custom DNS resolver component is responsible for periodically querying DNS and generating endpoints. There's nothing that special happening. It's good ol' DNS resolution, but mapped with a weight. Each time this happens, the weights of existing endpoints are copied over so that the mere act of refreshing doesn't clear any weights. Old endpoints are removed. New endpoints come in at full health. The result is updated in the shared `State`. To ensure that only a single candidate is handed out at a time, the resolved result is always a sequence of exactly one endpoint.
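A minimal sketch of that refresh-and-handout step, continuing the snippet above (the `FULL_WEIGHT` constant and the `refresh`/`pick` method names are made up for illustration):

```rust
const FULL_WEIGHT: usize = 100; // hypothetical "full health" weight

impl State {
    /// Merge a fresh DNS result into the shared list: existing weights carry
    /// over, endpoints that disappeared are dropped, new ones start healthy.
    fn refresh(&self, resolved: Vec<SocketAddr>) {
        let mut endpoints = self.0.lock().unwrap();
        let merged: Vec<Endpoint> = resolved
            .into_iter()
            .map(|address| {
                let existing = endpoints.iter().find(|e| e.address == address);
                Endpoint {
                    address,
                    weight: existing.map(|e| e.weight).unwrap_or(FULL_WEIGHT),
                    last_computed: existing
                        .map(|e| e.last_computed)
                        .unwrap_or_else(Instant::now),
                }
            })
            .collect();
        *endpoints = merged;
    }

    /// Hand out a single candidate: the currently best-weighted endpoint.
    fn pick(&self) -> Option<SocketAddr> {
        let endpoints = self.0.lock().unwrap();
        endpoints.iter().max_by_key(|e| e.weight).map(|e| e.address)
    }
}
```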
The other side, the side that's missing, is middleware that can inspect what's happening in a response. A single client could be making multiple requests concurrently. In order to mark an endpoint as failed, the endpoint in the response needs to be correlated. This is provided in `HttpInfo`. I can elaborate more if that isn't clear enough.
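As a rough illustration of that correlation step, assuming the remote address can be obtained as an `HttpInfo` once a connection succeeds (how it is surfaced on failure is exactly what this issue is about):

```rust
use hyper_util::client::legacy::connect::HttpInfo;

/// Given the connection info reported for a completed request (however it was
/// captured), mark the matching endpoint as failed so selection weighs away
/// from it. `State` and `Endpoint::fail` are from the snippet above.
fn mark_failed(state: &State, info: &HttpInfo) {
    let remote = info.remote_addr();
    let mut endpoints = state.0.lock().unwrap();
    if let Some(endpoint) = endpoints.iter_mut().find(|e| e.address == remote) {
        endpoint.fail();
    }
}
```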
You might want to consider performing the DNS resolution above the Hyper layer, so you can map IPs per connection explicitly.
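For example, a rough sketch of that shape (the `request_for` helper is a made-up name; only the URI/Host split is the point): resolve and pick an address yourself, then point the request's URI at that address while keeping the logical hostname in the Host header.

```rust
use bytes::Bytes;
use http::{header::HOST, Request};
use http_body_util::Empty;
use std::net::SocketAddr;

/// Build a request whose URI targets the chosen IP directly, while the Host
/// header still carries the logical hostname. `addr` comes from the caller's
/// own resolver/selection logic (not shown).
fn request_for(host: &str, addr: SocketAddr, path: &str) -> http::Result<Request<Empty<Bytes>>> {
    Request::builder()
        .uri(format!("http://{}{}", addr, path))
        .header(HOST, host)
        .body(Empty::new())
}
```

With this shape, the failing address is already known at the call site, so a connect error can be correlated without any help from hyper-util.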
Yea, all that sounds fairly standard load balancing. We did similar in the linkerd2-proxy. We combined … That's not to say we can't figure out improvements in hyper-util. We certainly can do that too.
The DNS resolution is happening at a much higher level and not directly in the path of …
OK, let me try discussing this from a different angle. You said your code is proprietary, no worries. Can you sketch out some code of how, after you've done your load balancing and such, you would hope to call hyper? Do you supply a custom resolver? Is it just the default, and you've edited the DNS results somewhere else? What does a solution look like here, as a user call?
That feels like a reasonable ask. Let me put something together that I can share. Yes, I am using a custom resolver. The resolved addresses are mapped into an …
There's more than one possible solution here. Rather than try to return the failed endpoint, which could arguably be the first or the last candidate in the list, the resolver could add support for handling rejections. This would be backward compatible with a default implementation. The mark down behavior would be split between the resolver and the middleware. The middleware would only handle successful responses. I imagine it would look something like this:

```rust
pub trait Resolve {
    type Addrs: Iterator<Item = SocketAddr>;
    type Error: Into<Box<dyn std::error::Error + Send + Sync>>;
    type Future: Future<Output = Result<Self::Addrs, Self::Error>>;

    fn poll_ready(&mut self, cx: &mut task::Context<'_>) -> Poll<Result<(), Self::Error>>;

    fn resolve(&mut self, name: Name) -> Self::Future;

    /// Notifies the resolver that the specified candidate was rejected.
    fn reject(&mut self, candidate: &SocketAddr) {
        // does nothing by default
    }
}
```
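To make the split concrete, here is a sketch of a resolver over the shared `State` from earlier implementing that proposal (`Resolve`, `Name`, and `reject` refer to the trait sketched just above, not an existing hyper-util API; `WeightedResolver` is a made-up name):

```rust
use std::net::SocketAddr;
use std::task::{self, Poll};

struct WeightedResolver {
    state: State, // shared State from earlier in the thread
}

impl Resolve for WeightedResolver {
    type Addrs = std::vec::IntoIter<SocketAddr>;
    type Error = std::io::Error;
    type Future = std::future::Ready<Result<Self::Addrs, Self::Error>>;

    fn poll_ready(&mut self, _cx: &mut task::Context<'_>) -> Poll<Result<(), Self::Error>> {
        Poll::Ready(Ok(()))
    }

    fn resolve(&mut self, _name: Name) -> Self::Future {
        // Hand out exactly one candidate: the currently best-weighted endpoint.
        let endpoints = self.state.0.lock().unwrap();
        let best: Vec<SocketAddr> = endpoints
            .iter()
            .max_by_key(|e| e.weight)
            .map(|e| e.address)
            .into_iter()
            .collect();
        std::future::ready(Ok(best.into_iter()))
    }

    fn reject(&mut self, candidate: &SocketAddr) {
        // The connector reports which address failed; mark it down so the
        // next resolve() weighs away from it.
        let mut endpoints = self.state.0.lock().unwrap();
        if let Some(endpoint) = endpoints.iter_mut().find(|e| e.address == *candidate) {
            endpoint.fail();
        }
    }
}
```

Because `reject` has a default body, existing resolvers would keep compiling, which is the backward-compatibility point above.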
Hm. So, I could update it so the … However, this is probably pretty fragile. And it doesn't account for any other middleware you may have, or any errors that might occur in between the connection being established and the request being sent. With all that I've read so far, I'd recommend making your request scope aware of the endpoint that will be resolved. Either in a task-local, in the request extensions, or just as part of the middleware.

```rust
let target_addr = ([127, 0, 0, 1], 8080).into();
endpoints.resolve_to(target_addr);

match client.request(some_req).await {
    Ok(_) => (),
    Err(err) => {
        endpoints.mark(target_addr);
        // etc
    }
}
```
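A sketch of the request-extensions variant of that suggestion (`ChosenEndpoint` and `send_tracked` are made-up names; `send` stands in for the eventual `client.request(...)` call, and `State` is the shared state from earlier):

```rust
use std::future::Future;
use std::net::SocketAddr;

/// Marker extension recording which endpoint the balancer chose for this
/// request, so any wrapping layer can read it back and correlate a failure.
#[derive(Clone, Copy, Debug)]
struct ChosenEndpoint(SocketAddr);

async fn send_tracked<B, T, E, F, Fut>(
    state: &State,
    mut req: http::Request<B>,
    addr: SocketAddr, // the endpoint the balancer picked for this request
    send: F,
) -> Result<T, E>
where
    F: FnOnce(http::Request<B>) -> Fut,
    Fut: Future<Output = Result<T, E>>,
{
    req.extensions_mut().insert(ChosenEndpoint(addr));
    let result = send(req).await;
    if result.is_err() {
        // The chosen endpoint is still in scope here, so a failed attempt can
        // be attributed without anything extra from hyper-util.
        let mut endpoints = state.0.lock().unwrap();
        if let Some(endpoint) = endpoints.iter_mut().find(|e| e.address == addr) {
            endpoint.fail();
        }
    }
    result
}
```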
I can get behind not having it on the error. I certainly agree there's no clear expectation for the caller as to whether the first or last candidate is returned. You didn't comment on it, but did you not like the idea of signaling to the resolver that its candidate was rejected (or some other similar word)? IMHO this has a much clearer chain of responsibility that wouldn't affect any middleware. This behavior also doesn't necessarily have to be on the resolver. This could be some other type of service that is notified when an attempted address fails. The default implementation simply does nothing.
No, I'd rather not add such a method to `Resolve`.
Is your feature request related to a problem? Please describe.
For the purposes of client-side load balancing, it's important to be able to track which IP addresses are failing. Currently, it's possible to capture the `HttpInfo` with `capture_request`, but only when the HTTP connection actually succeeds. If it fails, the `conn_info` extras are not written into the extensions: https://github.com/hyperium/hyper-util/blob/7e742487c0680697857fd8a3eb403e0432c0d894/src/client/legacy/client.rs#L343-L346
Describe the solution you'd like
Unconditionally write HttpInfo into the request extensions. This probably requires stashing it in the ConnectorError, at least privately.
Additional context
cc @commonsensesoftware