Fix ACS client not shutting down cleanly if there is a pending response from the server during shutdown. #61
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fix ACS client not shutting down cleanly if there is a pending response from the server during shutdown.
In AcsAgentClient, we have a restart_client_thread_ that tries to re-create the reactor when the RPC is being terminated. To know when the RPC is being terminated, we rely on the
AcsAgentClientReactor::OnReadDone(bool ok)
, in which the booleanok
would indicate whether RPC is being closed. However, this does not differentiate whether the RPC is proactively cancelled by client or server. Ideally, we do not want to restart the client when the client itself wants to cancel the RPC.Instead of passing a simple boolean ok to AcsAgentClient, we define a new enum
RpcStatus
, to tell AcsAgentClient whether the RPC is ok, canceled by client, or canceled by server. If it is canceled by client, then we should not trigger the RestartClient() function.This change will allow us to destruct the reactor in AcsAgentClient::Shutdown(), instead of calling just reactor_->Cancel(). Currently, if we directly destruct reactor_ in Shutdown(), we will have a deadlock. when the reactor_ is being reset, the reactor_mtx_ is being held by Shutdown(), and in the meantime, ReactorReadCallback() will try to acquire the same mutex. With this change, in ReactorReadCallback(), when we know the RPC is canceled by client, we will not try to acquire the reactor_mtx_ lock and instead directly return. Then, we are able to cleanly destruct the reactor object in Shutdown() function.
The consequence of not destructing reactor_ in Shutdown() is that when the client is getting destructed after Shutdown() is called, read_callback_ will be destructed earlier than reactor_. If the RPC is not fully terminated in reactor_, reactor_ may try to invoke read_callback_, in this race condition, we could have access destructed field AcsAgentClient.