-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Revert "Let forward requests run until timeout (#2679) + style fixes. #2771
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This reverts commit 92fa67d.
Signed-off-by: Bartlomiej Plotka <[email protected]>
Let's NOT merge until we test this out. |
We should have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is what we want, with this change we'll again see high error rates even when 2 writes succeed and the other is still in flight. In other words this is okayish for testing the theory (though we might confuse the source of errors), but this is not code we should merge
I don't think error rates are the problem here - we can easily change the instrumentation to properly report those in separate PR. It's better than not being able to handle the load (: |
Yes of course we can change the instrumentation, and if we choose to go this route then I think we need to do that before merging the PR since it will otherwise conflate real errors with expected cancellations |
How mergin this in this state is worse than not being able to use master at all? (: |
but yea agree, something to fix ASAP, let's deploy this first and check |
Looks like after all it's isolation fix not this, closing. |
Let's revert the commit in the path of the problematic handler we saw on production.
I think the PR #2679 made sense, we want to replicate 3x, just 2x strictly and 3rd one best effort.
While it makes sense logically, there are some reasons why the receive could be saturated because of this change e.g
Essentially with this change indeed we have "leaking" rate-limiting (if there is any;p), because we claim request end, despite things are still processing.
Anyway, let's build an image from this branch & deploy it and let's see if we can repro. There are solid reasons why this might attribute for potential saturation