A76 update: Clarify behaviors discovered during implementation #489

Open · atollena wants to merge 2 commits into master from A76-clarifications

Conversation

atollena (Contributor) commented Apr 3, 2025

I discovered a few behaviors that I'm not sure are desired (although I don't think they are harmful either) and that may cause confusion for implementors, so they are definitely worth noting:

  1. If multiple concurrent RPCs use the same picker, and the picker has no endpoint in CONNECTING state, then each of them may trigger a connection attempt, resulting in multiple endpoints leaving IDLE. This can keep happening until the picker is updated with `picker_has_endpoint_in_connecting_state` set to true. One thing we could do to avoid this, at least in Go, is to atomically set `picker_has_endpoint_in_connecting_state` to true when we trigger a connection attempt (a rough sketch of this follows after the list).

  2. If we queued the pick because there was no endpoint in READY state and there were endpoints in IDLE or CONNECTING state, then when an endpoint transitions to READY, we may end up triggering another connection attempt on the next pick for a given RPC.

  3. The behavior is not specified when multiple endpoints have the same hash key. Looking at Envoy's code [1], I think there is no deliberate choice of endpoint in that case, so we could leave it unspecified, but I think it's worth clarifying.

[1] https://github.com/envoyproxy/envoy/blob/9faa0a3c19018bd23985d1e0aaa94cd5546c9c35/source/extensions/load_balancing_policies/ring_hash/ring_hash_lb.cc#L141-L238
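
For item 1, here is a minimal Go sketch of the atomic approach, assuming a picker type with an `atomic.Bool` field standing in for `picker_has_endpoint_in_connecting_state`; the names and shape are illustrative, not the actual grpc-go implementation:

```go
package ringhash

import "sync/atomic"

// picker is a hypothetical stand-in for the ring_hash picker; only the state
// relevant to this discussion is shown.
type picker struct {
	// Atomic version of picker_has_endpoint_in_connecting_state, so that
	// concurrent picks sharing this picker can coordinate without waiting
	// for a new picker to be produced.
	hasEndpointInConnectingState atomic.Bool
}

// maybeTriggerConnection starts at most one connection attempt across all
// concurrent picks that share this picker: CompareAndSwap succeeds for only
// one caller, and the others simply queue their pick.
func (p *picker) maybeTriggerConnection(triggerConnect func()) {
	if p.hasEndpointInConnectingState.CompareAndSwap(false, true) {
		triggerConnect()
	}
}
```

With this, even if many RPCs pick concurrently while all endpoints are idle, only the pick that wins the compare-and-swap triggers a connection attempt before the next picker update.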

atollena force-pushed the A76-clarifications branch from d54b4c5 to 37defbb on April 3, 2025, 08:30
atollena (Contributor, Author) commented Apr 3, 2025

cc @markdroth — these were findings from the implementation: grpc/grpc-go#8159

Let me know if those behaviors seem acceptable (I think they are) or if you'd like us to revisit something. I've added more info in the PR description.

atollena changed the title from "[A76] Clarify behaviors discovered during implementation" to "A76: Clarify behaviors discovered during implementation" on Apr 3, 2025
atollena changed the title from "A76: Clarify behaviors discovered during implementation" to "A76 update: Clarify behaviors discovered during implementation" on Apr 3, 2025
markdroth (Member) left a comment:

Thanks for bringing this up!

````diff
@@ -157,11 +157,19 @@ if !using_random_hash {
   return ring[first_index].picker->Pick(...);
 ```
 
-This behavior ensures that a single RPC does not cause more than one endpoint to
+This behavior ensures that a single RPC does not cause more than two endpoints to
````
Member:
Here we say that it's bounded to 2, but I think the problem described in the next paragraph means that it's actually not bounded at all. It's conceivably possible to have a case where all endpoints are in IDLE state and we get a bunch of picks on the current picker in parallel, where each pick could trigger a connection attempt on a different IDLE endpoint before the picker gets updated.

If this is indeed completely unbounded, then I think it would make sense for us to use an atomic for `picker_has_endpoint_in_connecting_state` and do a CAS loop to set it, so that we can guarantee that we'll trigger only one connection attempt at a time.

@ejona86, @dfawley, thoughts?

Member:
I've added a test in C-core that proves that the number of simultaneous connection attempts is indeed unbounded, and proves that using an atomic fixes this problem:

grpc/grpc#39150

Member:
> bunch of picks on the current picker in parallel

That means you have multiple RPCs. The text here says that a single RPC can't cause a bunch of connections. I thought we've always accepted that multiple RPCs cause more aggressive connection creation, and can even consider that good.

Member:
Oh, that's a good point. We are actually okay with starting a separate connection attempt for each RPC based on its hash -- we do the same thing in the xDS case, where the random hash for a given RPC is stable across multiple pick attempts for that RPC. (In the non-xDS case, we will actually inhibit those multiple connection attempts once the picker gets updated due to `picker_has_endpoint_in_connecting_state`, but that's probably fine.)

Given that, I think we can stick with the current implementation without using an atomic. However, let's change the paragraph below to clarify that we can still start multiple simultaneous connection attempts for different RPCs, just not for a single RPC.

atollena (Contributor, Author):
Sounds good, I've updated the paragraph. How do you all feel about the second question from the PR description:

> If we queued the pick because there was no endpoint in READY state and there were endpoints in IDLE or CONNECTING state, then if a queued pick is triggered when all endpoints in CONNECTING have transitioned to either READY or TRANSIENT_FAILURE, we may end up triggering a second connection attempt on the next pick for an RPC that already triggered one.

I ran into this when asserting that a single RPC only wakes up a single endpoint, when there are $n$ idle endpoints:

  1. The picker randomly chooses an endpoint in IDLE state and triggers a first connection attempt to that backend. The pick is queued.
  2. The picker is updated when the endpoint transitions to CONNECTING state; no new connection is triggered because `picker_has_endpoint_in_connecting_state` is true. There is still no endpoint in READY state, so the pick is queued again.
  3. The picker is updated when the endpoint transitions to READY state. Since no other endpoint is in CONNECTING, if the randomly chosen endpoint is not the one in READY state (probability $\frac{n-1}{n}$), a second connection attempt is triggered.

Just want to make sure you're OK with this.

```diff
@@ -178,6 +186,10 @@ endpoint attribute to the value of [LbEndpoint.Metadata][LbEndpoint.Metadata]
 `envoy.lb` `hash_key` field, as described in [Envoy's documentation for the ring
 hash load balancer][envoy-ringhash].
 
+If multiple endpoints have the same `hash_key`, then which endpoint is chosen is
```
Member:
I will note that this problem is not strictly new in this gRFC, because it's conceivably possible for there to be hash collisions even in the original design. But I agree that the fact that we're allowing hash keys to be specified manually makes it more likely, so I'm fine with noting this here.
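
For illustration, here is a rough Go sketch of why the choice between endpoints sharing a `hash_key` is effectively arbitrary: their ring entries hash to identical positions, so which one ends up first depends on construction and sort order rather than any deliberate policy. The hash function (FNV-1a), entry format, and replica count are assumptions for illustration, not the actual gRPC or Envoy implementation.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// ringEntry pairs a ring position with the endpoint that owns it.
type ringEntry struct {
	hash     uint64
	endpoint string
}

// buildRing is a simplified sketch: each endpoint contributes `replicas`
// ring entries, hashed from its hash key plus a replica index. FNV-1a
// stands in for whatever hash the real implementation uses.
func buildRing(hashKeys map[string]string, replicas int) []ringEntry {
	var ring []ringEntry
	for endpoint, key := range hashKeys {
		for i := 0; i < replicas; i++ {
			h := fnv.New64a()
			fmt.Fprintf(h, "%s_%d", key, i)
			ring = append(ring, ringEntry{hash: h.Sum64(), endpoint: endpoint})
		}
	}
	// Sort by hash; entries with identical hashes end up adjacent, and which
	// one comes first depends on sort behavior, not on any deliberate choice.
	sort.Slice(ring, func(a, b int) bool { return ring[a].hash < ring[b].hash })
	return ring
}

func main() {
	// Two endpoints configured with the same hash_key: every ring entry of
	// "10.0.0.1:80" collides exactly with the corresponding entry of
	// "10.0.0.2:80", so picks land on whichever happens to sort first.
	ring := buildRing(map[string]string{
		"10.0.0.1:80": "shared-key",
		"10.0.0.2:80": "shared-key",
	}, 3)
	for _, e := range ring {
		fmt.Printf("%016x -> %s\n", e.hash, e.endpoint)
	}
}
```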
