Deadlock in grpc due to recursive grpc call #2515
This may be fixed at head; this code has traditionally been deadlock-prone and is currently being rewritten. cc: @zhangkun83
Well, the deadlock is one issue, but for my purposes it's just a symptom. My core need is the ability to avoid sending a log to the logging service when LoggingHandler.publish is called in a grpc worker thread. Do you have any suggestions for that?
Symptom-wise, the reason for the reversed lock order is that the channel calls the Java logger while holding the channel lock, and the logger unexpectedly starts an RPC. IIUC, even if there weren't a deadlock, it would be an infinite recursion instead. Off the top of my head, you probably want to remove the RPC-based log handlers from the grpc and netty loggers.
Right, the infinite recursion is what I want to prevent. Simply removing the RPC-based log handlers from the grpc and netty loggers doesn't solve it for me, though.
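(For reference, that removal would look roughly like the sketch below; the logger names and the use of com.google.cloud.logging.LoggingHandler as the RPC-based handler are assumptions for illustration, not details stated in the thread.)

import java.util.logging.Handler;
import java.util.logging.Logger;
import com.google.cloud.logging.LoggingHandler;

final class DetachRpcLogHandlers {
  // Detach the RPC-based handler from the gRPC/Netty loggers so their FINE
  // records never trigger a call to the Logging service.
  // The logger names ("io.grpc", "io.netty") are assumptions.
  static void detach() {
    for (String name : new String[] {"io.grpc", "io.netty"}) {
      Logger logger = Logger.getLogger(name);
      for (Handler h : logger.getHandlers()) {
        if (h instanceof LoggingHandler) {
          logger.removeHandler(h);
        }
      }
    }
  }
}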
What we need is per-(gRPC-)channel log configuration, so that the channel used by the logger would not use RPC-based logging. That doesn't seem possible with the Java logging API. We could create a custom logging API for gRPC that can be configured per channel, but that doesn't seem feasible for netty.
One alternative is for gRPC to set a ThreadLocal that google-cloud-java code could read in order to determine if it is in the context of a gRPC worker thread.
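(As a rough sketch of that alternative: the GrpcWorkerThreadGuard class and its enter/exit hooks below are hypothetical, not existing gRPC API.)

// Hypothetical guard: gRPC would set this flag around its worker threads,
// and google-cloud-java's LoggingHandler would check it before publishing.
final class GrpcWorkerThreadGuard {
  private static final ThreadLocal<Boolean> IN_GRPC_WORKER =
      ThreadLocal.withInitial(() -> Boolean.FALSE);

  // Would be called by gRPC when a worker thread starts processing a task.
  static void enter() {
    IN_GRPC_WORKER.set(Boolean.TRUE);
  }

  // Would be called by gRPC when the worker thread finishes the task.
  static void exit() {
    IN_GRPC_WORKER.set(Boolean.FALSE);
  }

  // google-cloud-java's publish() would bail out if this returns true.
  static boolean isInGrpcWorker() {
    return IN_GRPC_WORKER.get();
  }
}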
I don't fully understand this. Do you mean you want grpc and netty to also log to the RPC-based handler?
How would it work? I was thinking the other way: the RPC-based logger could set the ThreadLocal context, like this:

static final Context.Key<Boolean> IN_CLOUD_LOGGER_KEY = Context.key("inCloudLogger"); // io.grpc.Context key; name is illustrative

@Override
public void publish(LogRecord record) {
  Boolean inCloudLogger = IN_CLOUD_LOGGER_KEY.get();
  if (inCloudLogger == null) {
    // Mark the Context so that any publish() triggered by the RPC below
    // takes the non-RPC branch instead of recursing.
    Context newCtx = Context.current().withValue(IN_CLOUD_LOGGER_KEY, true);
    Context origCtx = newCtx.attach();
    try {
      // Do the RPC-based logging
      ...
    } finally {
      newCtx.detach(origCtx);
    }
  } else {
    // Do the non-RPC-based logging
  }
}
We want to publish logs from grpc calls, but we wouldn't want the flush() call to run in the grpc worker thread (because it makes the call to the logging service).
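(One way to keep that RPC off the worker thread would be to hand the flush to a dedicated executor; a minimal sketch, where AsyncFlusher and the flush Runnable are illustrative rather than part of either library.)

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: perform the Logging RPC on a dedicated thread instead of the gRPC
// worker thread that invoked publish().
final class AsyncFlusher {
  private static final ExecutorService FLUSH_EXECUTOR =
      Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "cloud-logging-flush");
        t.setDaemon(true);
        return t;
      });

  // The Runnable is a stand-in for whatever actually calls the Logging service.
  static void flushAsync(Runnable flushToLoggingService) {
    FLUSH_EXECUTOR.submit(flushToLoggingService);
  }
}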
I dug into the docs for Context and Context.Key and I think I understand it. It looks like this feature will work for me, and I can already put it to use.
Closing this issue, as it's not a bug in gRPC per se and a solution has been accepted.
Please answer these questions before submitting your issue.
What version of gRPC are you using?
1.0.1
What JVM are you using (java -version)?
openjdk version "1.8.0_102"
OpenJDK Runtime Environment (build 1.8.0_102)
OpenJDK 64-Bit Server VM (build 25.102-b01, mixed mode)
What did you do?
If possible, provide a recipe for reproducing the error.
What did you expect to see?
Not what I saw below...
What did you see instead?
Deadlock.
Background: I have been trying various strategies to resolve googleapis/google-cloud-java#1386 , where using the Logging service at level FINE results in grpc logging to the Logging service in a recursive way. I tried using a ThreadLocal to prevent this, but this doesn't work with grpc because the actual call is executed on a worker thread. Essentially I think I need some way to bail out of the LoggingHandler.publish call if I can detect that this is in the scope of a grpc worker thread sending a request to the Logging service.
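(A condensed sketch of why the plain ThreadLocal attempt falls short, with startLoggingRpc() standing in for the real call into the Logging client.)

// Why a plain ThreadLocal guard fails here: the flag is set on the calling
// thread, but gRPC executes the actual call (and its own FINE logging) on a
// worker thread, which only sees the ThreadLocal's default value.
final class ThreadLocalGuardAttempt {
  static final ThreadLocal<Boolean> IN_LOGGER = ThreadLocal.withInitial(() -> false);

  void publishViaRpc() {
    IN_LOGGER.set(true);     // visible only on this thread
    try {
      startLoggingRpc();     // gRPC hands the work to a worker thread, where
                             // IN_LOGGER.get() is still false, so publish()
                             // recurses there anyway
    } finally {
      IN_LOGGER.set(false);
    }
  }

  void startLoggingRpc() {
    // hypothetical stand-in for the async call into the Logging client
  }
}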