-
Notifications
You must be signed in to change notification settings - Fork 271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NTH queue handler behaviour broken by PR #940 #990
Comments
Can @GavinBurris42 say if this change was intended or a mistake? |
Could someone from @aws/ec2-guacamole please respond to this. This issue has a significant cost overhead where nodes which should be terminated continue to run for the whole grace period (as grace period is cluster scoped this has to be the max permitted grace period). |
Hello @stevehipwell, apologies for the late response and thank you for noticing the broken NTH behavior. This issue is considered as priority and will take measures to mitigate this ASAP. Thank you. |
@LikithaVemulapalli I've opened PR #999 to solve this issue. It looks like the problem the original PR author was attempting to solve could be handled better by setting the |
Hello @stevehipwell, thanks for opening a PR to resolve this issue, yes our team had a discussion and identified the root cause with the post drain task that was modified as a part of latest changes. Appreciate the effort, will review the PR that you opened. Thank you. |
@LikithaVemulapalli the build has failed so the Helm chart hasn't been released for the NTH release. |
Hello @stevehipwell, yes the build failed adding windows binaries to assets, currently taking a look at that, we are aware of this issue, we are trying to fix it, thank you. |
Describe the bug
PR #940 removed the continue lifecycle action behaviour for ASGs which breaks the expected NTH behaviour; which is to signal to the ASG that the node can be deleted once it's drained.
If the PR author wanted to disable this functionality for their use case they should have provided configuration to do so, and only changed the default behaviour as part of a major version (if even wanted).
Steps to reproduce
Replace an ASG instance with instance refresh, the instance will not be terminated until the lifecycle hook times out.
Expected outcome
I'd expect the instance to be terminated by the ASG once NTH has completed draining it.
Application Logs
The log output when experiencing the issue.
Environment
The text was updated successfully, but these errors were encountered: