[Fastboot] Delay LLDP service for better fastboot performance #10568
Signed-off-by: Shlomi Bitton <[email protected]>
[Timer]
OnUnitActiveSec=0 sec
OnBootSec=1min 30 sec
Unit=lldp.service
lldp.service is a per-namespace container. Will that work for multi-asic? Should we have several timers?
Add has_timer to this file https://github.com/Azure/sonic-buildimage/blob/master/files/build_templates/init_cfg.json.j2#L32
Done
I think this change will impact all kinds of reboot: warm, fast, cold, etc. Do you agree? If yes, is that what we want to do? Can we do this only for the fast-boot and warm-boot cases?
@vaibhavhd I agree. I added a condition for it in syncd.sh, like we do for pmon today.
files/scripts/syncd.sh
if [[ x"$sonic_asic_platform" == x"mellanox" ]]; then
    debug "Starting pmon service..."
    /bin/systemctl start pmon
    debug "Started pmon service"
fi
if [[ x"$BOOT_TYPE" != x"cold" ]]; then
Minor: This is correct, but it may be better to match against an allowlist of BOOT_TYPEs (fast, warm, fastfast).
That way the condition does not evaluate to true for any other boot type that gets added in the future.
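A minimal sketch of the allowlist match suggested here, not the code that was merged. The function name `should_delay_lldp` is hypothetical; only the `BOOT_TYPE` values (fast, warm, fastfast) come from the comment above. Any other value, including cold or a boot type added later, falls through to the default branch and is not delayed.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption, not taken from syncd.sh).
# Returns 0 when the boot type is in the allowlist and LLDP should be
# left to its timer; returns 1 for every other value.
should_delay_lldp() {
    local boot_type="$1"
    case "$boot_type" in
        fast|warm|fastfast)
            return 0
            ;;
        *)
            # cold, empty, or any boot type introduced in the future
            return 1
            ;;
    esac
}

# Example: report the decision for a few boot types.
# "unknown" stands in for a boot type that does not exist today.
for bt in cold fast warm fastfast unknown; do
    if should_delay_lldp "$bt"; then
        echo "$bt: delay LLDP"
    else
        echo "$bt: start LLDP normally"
    fi
done
```

Compared with `!= x"cold"`, the `case` form fails safe: an unrecognized boot type behaves like a cold boot rather than silently picking up the delay.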
Done
@@ -0,0 +1,12 @@
[Unit]
# This delay is for fast-reboot performance
Correction - fast-reboot and warm-reboot performance.
Done
Minor comments pending, LGTM otherwise. Please address Stepan's concerns before merging.
… for multi asic service timer. Change the timer file to a timer template file. Align init_cfg.json.j2 to delay LLDP. Fix review comments.
@vaibhavhd @stepanblyschak I fixed your comments, can you please review and approve?
@vaibhavhd @stepanblyschak kind reminder to review.
- Why I did it
Profiling the system state during init after fast-reboot, while create_switch is executing, shows a few Python scripts running at the same time. This parallel execution consumes CPU time, so create_switch takes longer than it should. Following this finding, and to ensure these services do not interfere in the future, LLDP is delayed by 90 seconds, until the system finishes the init flow after fastboot.
- How I did it
Add a timer for the LLDP service. Copy the timer file to the host bin image.
- How to verify it
Run fast-reboot on an MLNX platform and observe a faster create_switch execution time.
This PR is dependent on PR: #10567
This commit could not be cleanly cherry-picked to 202012. Please submit another PR.
@qiluo-msft I created a separate PR for 202012:
#10568) (#10744) This PR is to backport a fix #10568. This PR is dependent on PR: #10745
Why I did it
Profiling the system state during init after fast-reboot, while create_switch is executing, shows a few Python scripts running at the same time.
This parallel execution consumes CPU time, so create_switch takes longer than it should.
Following this finding, and to ensure these services do not interfere in the future, LLDP is delayed by 90 seconds, until the system finishes the init flow after fastboot.
How I did it
Add a timer for the LLDP service. Copy the timer file to the host bin image.
How to verify it
Run fast-reboot on an MLNX platform and observe a faster create_switch execution time.
This PR is dependent on PR: #10567