-
Notifications
You must be signed in to change notification settings - Fork 746
Potential Deadlock #3545
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Hi @tunds - I think you're experiencing deadlocks in the websocket. We've released several versions recently to try mitigate the bug; 1.15.3, 1.16.1, and 1.18.0. Since you're on 1.17.0, I recommend updating to 1.18.0 and monitoring it. We've seen improved stability of the websocket since that release. For now I'm going to close this issue since I believe we have the issue resolved. If however, after updating to 1.18.0, you still notice the issue please comment back here and we can re-open and investigate. |
Do you have any feedback for the maintainers? Please tell us by taking a one-minute survey. Your responses will help us understand Apollo iOS usage and allow us to serve you better. |
Hi @calvincestari, Unfortunately we're going to have to reopen this issue. We've updated to 1.18.0 and we're still seeing the issue that I have posted above. |
@tunds - that's fine. You'll need to try put together something that can reproduce the issue though. This has been an incredibly difficult issue to reliably reproduce and the sample app we had no longer fails our test cases. If you have suggested changes for the websocket code we'll gladly review it. |
Hi @calvincestari, Yh I can imagine this isn't an easy one to fix at all since it's not really obvious. One thing I will point out is it seems to occur quite often when we're using our company VPN and a proxying tool i.e. Proxyman. Sometimes this VPN can disconnect and timeout causing the websocket to auto retry connecting in the background, whilst using the proxying tool. So I wonder if the best way to try to reproduce this is by causing the system to change IP very frequently in a short amount of time which would cause this to reconnect code to fire rapidly possibly? It's more the steps for this scenario that need to be taken here. |
@tunds, we have another release coming out later today that includes a change for websocket deadlocks. Not sure if it'll fix your issue but it is in the same code and would be worth a try. |
@calvincestari Awesome news!!!, Will update you. Happy for you to close this issue and then I can leave a comment for you to reopen if we find the issue again. Unless you want to keep it open to monitor this? |
We can leave this one open for now. |
I have seen the same issue on 1.21.0 again unfortunately:( |
We're going to be re-writing the entire web socket layer for 2.0 in the coming months. We've spent a lot of time chasing down race conditions in the existing web socket layer in the past few months, but in order to completely remove these race conditions and dead locks, a rewrite is really necessary. If you'd like to take a crack at a PR to address this, we're happy to take a look, but we likely won't be investing much more time into tracking these down and instead will focus our efforts on the rewrite. |
No worries, appreciate the effort by the team to look into this. Looking forward to 2.0.0 and the great things you bring 🫡 |
Summary
I've noticed that we have some scenarios when using the Apollo GraphQL subscriptions can deadlock when attempting to reconnect due to network disruptions. We have noticed this can happen quite often when we are using our internal VPN (GlobalProtect) if there is a disconnect and the subscriptions attempt to reconnect the application will freeze.
When inspecting what is going on the main thread this function seems to be potentially causing the app to hang.
This all starts from the
attemptReconnectionIfDesired
function.Version
1.17.0
Steps to reproduce the behavior
Logs
Anything else?
The text was updated successfully, but these errors were encountered: