A76 update: Clarify behaviors discovered during implementation #489

Open · atollena wants to merge 2 commits into master from A76-clarifications

Conversation

atollena (Contributor) commented Apr 3, 2025

I discovered a few behaviors that I'm not sure are desired (although I don't think they are harmful either) and that may cause confusion for implementors, so they are definitely worth noting:

  1. If multiple concurrent RPCs use the same picker, and the picker has no endpoint in CONNECTING state, then each of them may trigger a connection attempt, resulting in multiple endpoints leaving IDLE. This can keep happening until the picker is updated with `picker_has_endpoint_in_connecting_state` set to true. One thing we could do to avoid this, at least in Go, is to atomically set `picker_has_endpoint_in_connecting_state` to true when we trigger a connection attempt (a rough sketch of this follows after the list).

  2. If we queued the pick because there was no endpoint in READY state and there were endpoints in IDLE or CONNECTING state, then when an endpoint transitions to READY, we may end up triggering another connection attempt on the next pick for a given RPC.

  3. The behavior is not specified when multiple endpoints have the same hash key. Looking at Envoy's code [1], I think there is no deliberate choice of endpoint in that case, so we could leave it unspecified, but I think it's worth clarifying.

[1] https://github.com/envoyproxy/envoy/blob/9faa0a3c19018bd23985d1e0aaa94cd5546c9c35/source/extensions/load_balancing_policies/ring_hash/ring_hash_lb.cc#L141-L238
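
For item 1, here is a minimal Go sketch of the atomic approach, assuming a picker type with an `atomic.Bool` field standing in for `picker_has_endpoint_in_connecting_state`; the names and shape are illustrative, not the actual grpc-go implementation:

```go
package ringhash

import "sync/atomic"

// picker is a hypothetical stand-in for the ring_hash picker; only the state
// relevant to this discussion is shown.
type picker struct {
	// Atomic version of picker_has_endpoint_in_connecting_state, so that
	// concurrent picks sharing this picker can coordinate without waiting
	// for a new picker to be produced.
	hasEndpointInConnectingState atomic.Bool
}

// maybeTriggerConnection starts at most one connection attempt across all
// concurrent picks that share this picker: CompareAndSwap succeeds for only
// one caller, and the others simply queue their pick.
func (p *picker) maybeTriggerConnection(triggerConnect func()) {
	if p.hasEndpointInConnectingState.CompareAndSwap(false, true) {
		triggerConnect()
	}
}
```

With this, even if many RPCs pick concurrently while all endpoints are idle, only the pick that wins the compare-and-swap triggers a connection attempt before the next picker update.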

atollena force-pushed the A76-clarifications branch from d54b4c5 to 37defbb on April 3, 2025, 08:30
atollena (Contributor, Author) commented Apr 3, 2025

cc @markdroth — these were findings from the implementation: grpc/grpc-go#8159

Let me know if those behaviors seem acceptable (I think they are) or if you'd like us to revisit something. I've added more info in the PR description.

atollena changed the title from "[A76] Clarify behaviors discovered during implementation" to "A76: Clarify behaviors discovered during implementation" on Apr 3, 2025
atollena changed the title from "A76: Clarify behaviors discovered during implementation" to "A76 update: Clarify behaviors discovered during implementation" on Apr 3, 2025
markdroth (Member) left a comment:

Thanks for bringing this up!

````diff
@@ -157,11 +157,19 @@ if !using_random_hash {
   return ring[first_index].picker->Pick(...);
 ```
 
-This behavior ensures that a single RPC does not cause more than one endpoint to
+This behavior ensures that a single RPC does not cause more than two endpoints to
````
Member:
Here we say that it's bounded to 2, but I think the problem described in the next paragraph means that it's actually not bounded at all. It's conceivably possible to have a case where all endpoints are in IDLE state and we get a bunch of picks on the current picker in parallel, where each pick could trigger a connection attempt on a different IDLE endpoint before the picker gets updated.

If this is indeed completely unbounded, then I think it would make sense for us to use an atomic for `picker_has_endpoint_in_connecting_state` and do a CAS loop to set it, so that we can guarantee that we'll trigger only one connection attempt at a time.

@ejona86, @dfawley, thoughts?

Member:
I've added a test in C-core that proves that the number of simultaneous connection attempts is indeed unbounded, and proves that using an atomic fixes this problem:

grpc/grpc#39150

Member:
> bunch of picks on the current picker in parallel

That means you have multiple RPCs. The text here says that a single RPC can't cause a bunch of connections. I thought we've always accepted that multiple RPCs cause more aggressive connection creation, and can even consider that good.

Member:
Oh, that's a good point. We are actually okay with starting a separate connection attempt for each RPC based on its hash -- we do the same thing in the xDS case, where the random hash for a given RPC is stable across multiple pick attempts for that RPC. (In the non-xDS case, we will actually inhibit those multiple connection attempts once the picker gets updated due to `picker_has_endpoint_in_connecting_state`, but that's probably fine.)

Given that, I think we can stick with the current implementation without using an atomic. However, let's change the paragraph below to clarify that we can still start multiple simultaneous connection attempts for different RPCs, just not for a single RPC.

atollena (Contributor, Author):
Sounds good, I've updated the paragraph. How do you all feel about the second question from the PR description:

> If we queued the pick because there was no endpoint in READY state and there were endpoints in IDLE or CONNECTING state, then if a queued pick is triggered when all endpoints in CONNECTING have transitioned to either READY or TRANSIENT_FAILURE, we may end up triggering a second connection attempt on the next pick for an RPC that already triggered one.

I ran into this when asserting that a single RPC only wakes up a single endpoint, when there are $n$ idle endpoints:

  1. The picker randomly chooses an endpoint in IDLE state and triggers a first connection attempt to that backend. The pick is queued.
  2. The picker is updated when the endpoint transitions to CONNECTING state; no new connection is triggered because `picker_has_endpoint_in_connecting_state` is true. There is still no endpoint in READY state, so the pick is queued again.
  3. The picker is updated when the endpoint transitions to READY state. Since no other endpoint is in CONNECTING, if the randomly chosen endpoint is not the one in READY state (probability $\frac{n-1}{n}$), a second connection attempt is triggered.

Just want to make sure you're OK with this.

```diff
@@ -178,6 +186,10 @@ endpoint attribute to the value of [LbEndpoint.Metadata][LbEndpoint.Metadata]
 `envoy.lb` `hash_key` field, as described in [Envoy's documentation for the ring
 hash load balancer][envoy-ringhash].
 
+If multiple endpoints have the same `hash_key`, then which endpoint is chosen is
```
Member:
I will note that this problem is not strictly new in this gRFC, because it's conceivably possible for there to be hash collisions even in the original design. But I agree that the fact that we're allowing hash keys to be specified manually makes it more likely, so I'm fine with noting this here.
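
For illustration, here is a rough Go sketch of why the choice between endpoints sharing a `hash_key` is effectively arbitrary: their ring entries hash to identical positions, so which one ends up first depends on construction and sort order rather than any deliberate policy. The hash function (FNV-1a), entry format, and replica count are assumptions for illustration, not the actual gRPC or Envoy implementation.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// ringEntry pairs a ring position with the endpoint that owns it.
type ringEntry struct {
	hash     uint64
	endpoint string
}

// buildRing is a simplified sketch: each endpoint contributes `replicas`
// ring entries, hashed from its hash key plus a replica index. FNV-1a
// stands in for whatever hash the real implementation uses.
func buildRing(hashKeys map[string]string, replicas int) []ringEntry {
	var ring []ringEntry
	for endpoint, key := range hashKeys {
		for i := 0; i < replicas; i++ {
			h := fnv.New64a()
			fmt.Fprintf(h, "%s_%d", key, i)
			ring = append(ring, ringEntry{hash: h.Sum64(), endpoint: endpoint})
		}
	}
	// Sort by hash; entries with identical hashes end up adjacent, and which
	// one comes first depends on sort behavior, not on any deliberate choice.
	sort.Slice(ring, func(a, b int) bool { return ring[a].hash < ring[b].hash })
	return ring
}

func main() {
	// Two endpoints configured with the same hash_key: every ring entry of
	// "10.0.0.1:80" collides exactly with the corresponding entry of
	// "10.0.0.2:80", so picks land on whichever happens to sort first.
	ring := buildRing(map[string]string{
		"10.0.0.1:80": "shared-key",
		"10.0.0.2:80": "shared-key",
	}, 3)
	for _, e := range ring {
		fmt.Printf("%016x -> %s\n", e.hash, e.endpoint)
	}
}
```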
