A76 update: Clarify behaviors discovered during implementation #489
Conversation
I discovered a few behaviors that I'm not sure are desired (although I don't think they are harmful either), and that may cause confusion for implementors, so they are definitely worth noting:

1. If multiple concurrent RPCs use the same picker, and the picker has no endpoint in connecting state, then both may trigger a connection attempt, resulting in multiple endpoints leaving `IDLE`. This keeps happening until a new picker is produced with an updated `picker_has_endpoint_in_connecting_state`. One thing we could do to avoid this, at least in Go, is to atomically set `picker_has_endpoint_in_connecting_state` to true when we trigger a connection attempt. (A minimal sketch of this race appears right after this list.)
2. If we queued the pick because there was no endpoint in `READY` state and there were endpoints in `IDLE` or `CONNECTING` state, then when the endpoint transitions to `READY`, we may end up triggering a connection attempt again on the next pick for a given RPC.
3. The behavior is not specified when multiple endpoints have the same hash key. Looking at Envoy's code [1], I think there is no deliberate choice of endpoint in that case, so we could leave it unspecified, but I think it's worth clarifying.
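Purely for illustration, here is a minimal Go sketch of the race in item 1. The type and field names (`picker`, `hasEndpointInConnectingState`, `triggerConnection`) are hypothetical stand-ins, not the actual grpc-go implementation: the point is only that the flag is a per-picker snapshot, so two concurrent picks can both see it as false.

```go
package main

import (
	"fmt"
	"sync"
)

// endpoint is a hypothetical stand-in for a real subchannel/endpoint.
type endpoint struct{ connectTriggered bool }

func (e *endpoint) triggerConnection() { e.connectTriggered = true }

// picker holds a snapshot taken when the picker was built. The flag is NOT
// updated when a pick triggers a connection attempt, which is the race.
type picker struct {
	hasEndpointInConnectingState bool
	idleEndpoints                []*endpoint
}

// pick mimics the random-hash path: if the snapshot says no endpoint is
// CONNECTING, wake up the IDLE endpoint this RPC hashed to.
func (p *picker) pick(i int) {
	if !p.hasEndpointInConnectingState {
		p.idleEndpoints[i].triggerConnection() // several concurrent RPCs can reach this line
	}
	// The RPC would then be queued until a new picker is produced.
}

func main() {
	p := &picker{idleEndpoints: []*endpoint{{}, {}}}
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func(i int) { defer wg.Done(); p.pick(i) }(i)
	}
	wg.Wait()
	// Typically prints "true true": two concurrent RPCs made two endpoints
	// leave IDLE before the picker could be rebuilt with the flag set.
	fmt.Println(p.idleEndpoints[0].connectTriggered, p.idleEndpoints[1].connectTriggered)
}
```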
cc @markdroth these were findings from the implementation in grpc/grpc-go#8159. Let me know if those behaviors seem acceptable (I think they are) or if you'd like us to revisit something. I've added more info in the PR description.
Thanks for bringing this up!
````diff
@@ -157,11 +157,19 @@ if !using_random_hash {
   return ring[first_index].picker->Pick(...);
   ```

-This behavior ensures that a single RPC does not cause more than one endpoint to
+This behavior ensures that a single RPC does not cause more than two endpoint to
````
Here we say that it's bounded to 2, but I think the problem described in the next paragraph means that it's actually not bounded at all. It's conceivably possible to have a case where all endpoints are in IDLE state and we get a bunch of picks on the current picker in parallel, where each pick could trigger a connection attempt on a different IDLE endpoint before the picker gets updated.
If this is indeed completely unbounded, then I think it would make sense for us to use an atomic for `picker_has_endpoint_in_connecting_state` and do a CAS loop to set it, so that we can guarantee that we'll trigger only one connection attempt at a time.
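Assuming a picker shaped roughly like the gRFC pseudocode (all names below are hypothetical stand-ins, not actual grpc-go or C-core code), a minimal Go sketch of that idea could look like this; for a plain boolean flag a single `CompareAndSwap` already suffices, so no explicit loop is needed:

```go
package ringhash

import "sync/atomic"

// connectivityState and endpoint are hypothetical stand-ins for the real types.
type connectivityState int

const (
	idle connectivityState = iota
	connecting
	ready
	transientFailure
)

type endpoint struct{}

// triggerConnectionAttempt would hand the request off to the LB policy.
func (e *endpoint) triggerConnectionAttempt() {}

type ringEntry struct {
	state    connectivityState
	endpoint *endpoint
}

type picker struct {
	ring []ringEntry
	// Initialized to true at picker-construction time if any endpoint is
	// CONNECTING; flipped atomically by the first pick that triggers a
	// connection attempt on this picker.
	hasEndpointInConnectingState atomic.Bool
}

// maybeTriggerConnection starts at most one connection attempt per picker:
// only the pick that wins the CompareAndSwap wakes up the IDLE endpoint.
func (p *picker) maybeTriggerConnection(idx int) {
	if p.ring[idx].state != idle {
		return
	}
	if p.hasEndpointInConnectingState.CompareAndSwap(false, true) {
		p.ring[idx].endpoint.triggerConnectionAttempt()
	}
}
```

As the thread concludes further down, this stricter guarantee turned out to be unnecessary, since multiple simultaneous attempts for different RPCs are acceptable.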
I've added a test in C-core that proves that the number of simultaneous connection attempts is indeed unbounded, and proves that using an atomic fixes this problem:
> bunch of picks on the current picker in parallel
That means you have multiple RPCs. The text here says that a single RPC can't cause a bunch of connections. I thought we had always accepted that multiple RPCs cause more aggressive connection creation, and we can even consider that a good thing.
Oh, that's a good point. We are actually okay with starting a separate connection attempt for each RPC based on its hash -- we do the same thing in the xDS case, where the random hash for a given RPC is stable across multiple pick attempts for that RPC. (In the non-xDS case, we will actually inhibit those multiple connection attempts once the picker gets updated due to `picker_has_endpoint_in_connecting_state`, but that's probably fine.)
Given that, I think we can stick with the current implementation without using an atomic. However, let's change the paragraph below to clarify that we can still start multiple simultaneous connection attempts for different RPCs, just not for a single RPC.
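As a rough illustration of the stability property mentioned above, here is a Go sketch of generating the random hash once per RPC and reusing it across pick attempts. All names are hypothetical, and exactly when a random hash is used (xDS vs. non-xDS) is defined by the gRFC and the discussion, not by this sketch; it only shows the "generate once, reuse" part.

```go
package ringhash

import "math/rand"

// rpcPickState is a hypothetical per-RPC value carried across pick attempts
// (for example, attached to the pick information for that RPC).
type rpcPickState struct {
	randomHash uint64
	hashSet    bool
}

// hashForPick uses the request hash when one is available, and otherwise
// generates a random hash once per RPC and reuses it on every subsequent
// pick attempt, so re-picks for the same RPC keep hashing to the same
// point on the ring.
func hashForPick(requestHash uint64, hasRequestHash bool, s *rpcPickState) uint64 {
	if hasRequestHash {
		return requestHash
	}
	if !s.hashSet {
		s.randomHash = rand.Uint64()
		s.hashSet = true
	}
	return s.randomHash
}
```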
Sounds good, I've updated the paragraph. How do you all feel about the second question from the PR description:
> If we queued the pick because there was no endpoint in `READY` state and there were endpoints in `IDLE` or `CONNECTING` state, then if a queued pick is triggered when all endpoints in `CONNECTING` have transitioned to either `READY` or `TRANSIENT_FAILURE`, we may end up triggering a second connection attempt again on the next pick for an RPC that already triggered one.
I ran into this when asserting that a single RPC only wakes up a single endpoint, when there are multiple endpoints in `IDLE` state:

- The picker randomly chooses an endpoint in `IDLE` state and triggers a first connection attempt to that backend. The pick is queued.
- The picker is updated when the endpoint transitions to `CONNECTING` state; no new connection is triggered because `picker_has_endpoint_in_connecting_state` is true. There is no endpoint in `READY` state. The pick is queued again.
- The picker is updated when the endpoint transitions to `READY` state. Since there is no other endpoint in `CONNECTING`, if the random endpoint is not the one in `READY` state (probability $\frac{n-1}{n}$; see the worked numbers below), then a second connection attempt is triggered.
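Taking the $\frac{n-1}{n}$ figure above at face value (i.e., assuming the re-pick chooses uniformly among the $n$ endpoints), the chance of a second connection attempt grows quickly with the number of endpoints:

$$
P(\text{second connection attempt}) = \frac{n-1}{n}, \qquad n=2 \Rightarrow \tfrac{1}{2}, \quad n=10 \Rightarrow 0.9, \quad n=100 \Rightarrow 0.99.
$$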
Just want to make sure you're OK with this.
```diff
@@ -178,6 +186,10 @@ endpoint attribute to the value of [LbEndpoint.Metadata][LbEndpoint.Metadata]
 `envoy.lb` `hash_key` field, as described in [Envoy's documentation for the ring
 hash load balancer][envoy-ringhash].

+If multiple endpoints have the same `hash_key`, then which endpoint is chosen is
```
I will note that this problem is not strictly new in this gRFC, because it's conceivably possible for there to be hash collisions even in the original design. But I agree that the fact that we're allowing hash keys to be specified manually makes it more likely, so I'm fine with noting this here.
[1] https://github.com/envoyproxy/envoy/blob/9faa0a3c19018bd23985d1e0aaa94cd5546c9c35/source/extensions/load_balancing_policies/ring_hash/ring_hash_lb.cc#L141-L238