-
Notifications
You must be signed in to change notification settings - Fork 175
Chassisd does not wait for the execution to complete for previous admin state change requests. Fixed issue #22138 #607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Chassisd does not wait for the execution to complete for previous admin state change requests. Fixed issue #22138 #607
Conversation
…te for previous admin state change requests.
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
@@ -740,6 +749,9 @@ class SmartSwitchModuleUpdater(ModuleUpdater): | |||
# Persist dpu down time | |||
self.persist_dpu_reboot_time(key) | |||
|
|||
# Clear transition flag in STATE_DB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, but but do we need to wait until the operational state transition has occurred?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gpunathilell @vvolam This is the instance where the operation state changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead, we can just wait for the admin_state change function to return to clear the transition flag right? Assuming the set_admin_state function failed, we can immediately clear the transition flag and do other operations (instead of waiting for timeout)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gpunathilell We can, but we need to maintain a list for all DPUs and it makes the code unnecessarily complicated. The possibility of coming out of timeout is very rare and again it is safe to wait until the timeout happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant in line 240 in chassisd, we can just add it here right?
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
Chassisd does not wait for the execution to complete for previous admin state change requests. Fixed issue #22138
Description
Fixed issue #22138
Chassisd does not wait for the execution to complete for previous admin state change requests. Added support to solve this.
Fixed issue #22138
Motivation and Context
Added a state to indicate the progress of an admin_state transition period (window) of every module in config_db.
When in progress do not allow another admin_state change for that module.
Mark the beginning of state transition in the CLI itself. Added a timeout also for error cases.
Mark the end of state transition in chassisd
How Has This Been Tested?
ssue "config chassis module startup DPU0". This takes around 3 mins. Check the config_db and verify the new variable is set to True.
Issue another "config chassis module startup DPU0" and also ""config chassis module shutdown DPU0".
See both are blocked.
Once the DPU operational state is the admin_sate check if a new config is accepted.
Additional Information (Optional)
Goes with PR: sonic-net/sonic-utilities#3845