-
Notifications
You must be signed in to change notification settings - Fork 521
[apiserver][feat] Add read and write timeouts to HTTP server #3350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[apiserver][feat] Add read and write timeouts to HTTP server #3350
Conversation
Signed-off-by: You-Cheng Lin <[email protected]>
@dentiny PTAL |
Lines 148 to 159 in 8944703
Haha just saw you added the timeout I'll just close this pr then. |
Hi @owenowenisme , my PR change is only made to make linter happy thus no-op change, while you're adding timeout and delivering true value. :) |
server := &http.Server{ | ||
Addr: *httpPortFlag, | ||
Handler: topMux, | ||
ReadTimeout: 5 * time.Second, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have an ideal recommendation here, usually there're two approaches to set timeout:
- Set timeout to a large value so it behaves the same as no timeout for most cases, but does help cut down tail latency and trigger retry (i.e. half a minute) (could decrease timeout later after collection stats and user feedbacks);
- (more scientific) have apiserver and kuberay operator deployed, to check what's the rough duration for operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CC @kevin85421 / @MortalHappiness / @rueian might have much more production experience with kuberay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can parse the value from an env variable. And the default 0 is fine, I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the default 0 is fine,
@rueian, would you mind explaining why defaulting to 0 is acceptable? To make sure we are on the same page,
https://pkg.go.dev/net/http#Server
// ReadTimeout is the maximum duration for reading the entire
// request, including the body. A zero or negative value means
// there will be no timeout.
I would prefer to default to a positive timeout to prevent the server from being blocked by a single request for an extended period. In addition, I prefer not to expose the environment variable until we determine that it is necessary.
@dentiny or @owenowenisme, would you mind helping me understand the meaning of ReadTimeout
and WriteTimeout
here? I have read https://pkg.go.dev/net/http#Server, but I'd like to know what 'request' and 'response' refer to in the context of the KubeRay API server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that it is the user's responsibility to decide the value they want, so I think it is fine for us to leave it as default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general idea is, we should always apply a limit on resource and operation; otherwise server could get overloaded by slow / malicious clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think 0 is ok for now, because it's current behavior (no timeout); I would like to reduce risk.
For me, getting stuck on a single request is much more expensive than having a request fail. KubeRay has had related issues complained by users regarding the requests between the KubeRay operator and the Ray dashboard. Hence, I still prefer to set a default timeout.
Set a long enough timeout, as I mentioned #3350 (comment)
SG.
Use flag instead of env to allow user configuration, because env (if used improperly) is shared among all sessions, and prune to security issue.
We can expose this configuration only if some users request it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the default value, could @owenowenisme investigate the size of requests/responses in different situations so that we can set a reasonable timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll work on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think we should provide a default timeout, but also give users an option to configure it.
Btw, @owenowenisme would you mind fixing the conflict? |
Why are these changes needed?
Related issue number
Closes #3306
Checks