Skip to content

Add "Metadata" field to Backends #304

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

jamesmunns
Copy link
Contributor

@jamesmunns jamesmunns commented Jun 24, 2024

This is part of an attempt to address #276 (comment), by adding a metadata field that can be any type.

I don't think this is ready to merge, but I'm open to some feedback on the approach, so far.

Open questions/comments:

  • I don't love what this has done to the signatures of a lot of the types
  • I need to take this for a test in river, to see if the API allows me to do what I want:
    • Associate things like TLS_SNI information, obtained through discovery
  • It's possible this interface might allow us to remove some existing "glue" parts, like HttpHealthCheck->peer_template, in favor of actually having the full information needed per Backend.
  • This is almost certainly a breaking change, which maybe could happen in 0.3 soon? CC Will we release the 0.3 sometime soon?  #298.

I'll let you know how it goes!

This is intended to make it easier to have custom metadata fields, as
HashSet only requires Hash and Eq, while BTreeSet also requires Ord.
The aim for this is to allow HttpPeer to implement Eq, which will allow
it to be used as a Metadata field for Backends. This allows users of
pingora-load-balancing to directly store an HttpPeer when selecting
a backend, for use with the HttpProxy trait implementations.
// currently this is only provided to allow HttpPeer to be Eq.
let a: *const dyn Tracing = self.0.deref();
let b: *const dyn Tracing = other.0.deref();
core::ptr::addr_eq(a, b)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is questionable and could use a review

fn eq(&self, other: &Self) -> bool {
// Use destructuring to make sure that we update this for new fields
let CertKey { certificates, key } = self;
certificates.eq(&other.certificates) && key.public_eq(&other.key)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also questionable and could use a review

@jamesmunns
Copy link
Contributor Author

Hey @drcaramelsyrup et. al, I'd be interesting in discussing the viability of this.

I believe I've plumbed through the "metadata" (or "bonus data") appropriately, and existing tests pass. However, this ended up being somewhat invasive, so it'd be good to have design feedback here if you all think this is reasonable.

If you'd like to see the "end goal" of what this enables, check out this PR in river: memorysafety/river#46

1.72 tests are failing because I'm using some newer stdlib features, but I can figure out how to work back the MSRV update if this PR overall seems reasonable to you.

@eaufavor
Copy link
Contributor

Is the meta data part of the unique identifier of the upstream or is it sometime just part of how to connection to the upstream. For example, IP and port identify (define) an unique upstream, while, say, connection timeout is not. The latter is much easier to add.

@jamesmunns
Copy link
Contributor Author

At the moment in the pingora impl: it's treated as if it was part of the identity (e.g., Hash includes the detail of the data, not just the SocketAddr).

At the moment in the river integration: It's the HttpPeer that is created from the config file.

At the moment, that only includes TLS SNI (when relevant), but I do plan to attach additional data like health check routines and such.

I could have a separate HashMap<SocketAddr, Metadata> in River itself, and do the lookup, but I'm not sure if I'll run into challenges later when I want to populate some of that information from Service Discovery, or ensure Backends and my separate HashMap don't get out of sync.

@CodyPubNub
Copy link

This PR implements a feature I find very useful and helps a use case I'm building ontop of Pingora. I have my own "metadata", which is the "age" of a backend, which I utilize in the select_with to ensure that for particular requests, I don't select too new of a backend. Currently this is implemented with a separate HashMap<SocketAddr, Metadata>, of my own, but synchronization of this to the load balancer backends is pretty cumbersome.

I imagine others will find this feature to be extremely useful for similar use cases.

Thanks for the consideration!

@jamesmunns jamesmunns changed the title WIP: Add "Metadata" field to Backends Add "Metadata" field to Backends Jun 27, 2024
@eaufavor
Copy link
Contributor

One possible design to carry opaque meta data is something like this. The upside is that it requires little to none change in the APIs. The downside is that it is totally opaque so that you can't have, say, two backends with the same socket address but different SNIs.

Maybe that is a good enough solution to start with.

@jamesmunns
Copy link
Contributor Author

@eaufavor I think a type erased approach is possible, generally taking Box<dyn Any> and forcing a downcast_ref, as your approach mentions.

To mind, this has two downsides:

  • This would incur more runtime overhead (extra pointer following, downcast checking), probably not hugely significant
  • There's a chance to get things wrong - you could have code "fall out of sync" where the metadata changes in one place, but not others, leading to runtime panics

I don't love the idea of having type erased content there, but I would prefer that to having "no metadata at all".

@eaufavor
Copy link
Contributor

let me take a stab at the idea I proposed

@jamesmunns
Copy link
Contributor Author

@eaufavor provided me with a patch implementing a dynamic (vs generic) version of this API, which I applied in this commit on the memorysafety fork of pingora.

I've integrated it into River with memorysafety/river#56, and it appears to be suitable upon basic testing.

Closing this PR in favor of the changes proposed in the linked patch (which I assume will land in this repo shortly, and hopefully be released soon after), thanks @eaufavor!

@jamesmunns jamesmunns closed this Jul 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants