Description
If the ssh session in which warm-reboot was started drops, warm-reboot breaks at the syncd stage and the switch becomes inoperable.
This is a regression compared to the 201911 release.
Steps to reproduce the issue
- Verify that warm-reboot is enabled on the DUT:
show platform mlnx issu
ISSU is enabled
- Apply VLAN, IP, IPv6, and BGP configuration to the DUT (to be close to a real production environment):
config interface ip add Loopback0 1.1.1.1/32
sonic-cfggen -j vlan.json --write-to-db <------RANDOM VLAN + IP + IPV6 configuration
config interface ip add Ethernet192 101.1.0.1/24
config interface ip add Ethernet192 2123::1/64
sonic-cfggen -j bgp_ipv4.json --write-to-db <----IPV4 BGP CONFIGURATION
sonic-cfggen -j bgp_ipv6.json --write-to-db <----IPV6 BGP CONFIGURATION
config save -y
config reload -y
- Run "warm-reboot -v" from an ssh session and disconnect the session immediately to simulate an ssh session drop.
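The last step can be scripted so that the drop happens at a repeatable moment. The following is only a sketch: `drop_after` is a hypothetical helper (not part of SONiC), and `admin@dut` and the 5-second delay are placeholders. It starts a command in the background and kills it with SIGKILL after a delay, so that an ssh client started this way has no chance to close the session cleanly.

```shell
# Hypothetical helper: start a command, then kill it abruptly after a delay
# to imitate a client-side session drop.
drop_after() {
    local delay="$1"; shift
    "$@" &
    local pid=$!
    sleep "$delay"
    # SIGKILL cannot be caught, so the client cannot shut the session down gracefully.
    kill -9 "$pid" 2>/dev/null || true
    # Reap the background job; ignore its (signal) exit status.
    wait "$pid" 2>/dev/null || true
    return 0
}

# Example invocation (placeholders, adjust to the real DUT login and timing):
# drop_after 5 ssh -tt admin@dut 'sudo warm-reboot -v'
```

The `-tt` option forces a pty on the DUT side, which is assumed here to match an interactive ssh login.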
Describe the results you received
After the ssh session disconnects, the rcon session shows that warm-reboot stops at the syncd shutdown stage, and the system remains unavailable until it is rebooted.
Describe the results you expected
Warm-reboot should be resilient to ssh disconnects: it should not stop when the ssh session drops for any reason (ssh client disruption, the server hanging up, and so on), and a disconnect should not lead to a system crash.
For a data center this is dangerous, because the system cannot be updated with confidence.
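Until the script itself tolerates a lost terminal, a possible stopgap is to start it detached from the ssh session. This is a sketch only and has not been verified against this bug: `run_detached` is a hypothetical helper, the log path is a placeholder, and it assumes `setsid` and `nohup` are available on the DUT and that `sudo` does not need a terminal for a password prompt.

```shell
# Hypothetical helper: run a command so that it does not depend on the
# terminal of the ssh session that started it.
run_detached() {
    local log_file="$1"; shift
    # nohup makes the command ignore SIGHUP, setsid puts it in a new session
    # without a controlling terminal, and stdin/stdout/stderr are redirected
    # away from the ssh pty so writes cannot fail when the pty disappears.
    nohup setsid "$@" > "$log_file" 2>&1 < /dev/null &
}

# Example invocation (placeholder log path):
# run_detached /tmp/warm-reboot.log sudo warm-reboot -v
```

The verbose output then goes to the log file instead of the ssh terminal, so it can still be inspected from the rcon session.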
Output of show version
SONiC Software Version: SONiC.sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023
Distribution: Debian 10.8
Kernel: 4.19.0-12-2-amd64
Build commit: 5cb07fad
Build date: Fri Mar 19 12:59:10 UTC 2021
Built by: vadymh@r-build-sonic03
Platform: x86_64-mlnx_msn3800-r0
HwSKU: ACS-MSN3800
ASIC: mellanox
ASIC Count: 1
Serial Number: MT1937X00565
Uptime: 13:31:35 up 3 min, 1 user, load average: 1.08, 0.84, 0.37
Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-syncd-mlnx latest 41b21b570cb4 662MB
docker-syncd-mlnx sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 41b21b570cb4 662MB
docker-sflow latest daec01591301 409MB
docker-sflow sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 daec01591301 409MB
docker-snmp latest 8b1fd0019321 439MB
docker-snmp sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 8b1fd0019321 439MB
docker-dhcp-relay latest 6ccc99d7ad71 405MB
docker-dhcp-relay sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 6ccc99d7ad71 405MB
docker-teamd latest d99c06d2ed81 408MB
docker-teamd sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 d99c06d2ed81 408MB
docker-nat latest 422e6bfc371f 411MB
docker-nat sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 422e6bfc371f 411MB
docker-router-advertiser latest d97a9c6d2e9a 398MB
docker-router-advertiser sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 d97a9c6d2e9a 398MB
docker-platform-monitor latest c30aaf3e6de1 689MB
docker-platform-monitor sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 c30aaf3e6de1 689MB
docker-lldp latest ecf251673b0a 438MB
docker-lldp sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 ecf251673b0a 438MB
docker-database latest 3ee3d100394d 398MB
docker-database sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 3ee3d100394d 398MB
docker-sonic-mgmt-framework latest 4d5e159ef26e 617MB
docker-sonic-mgmt-framework sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 4d5e159ef26e 617MB
docker-orchagent latest cd97498e4604 427MB
docker-orchagent sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 cd97498e4604 427MB
docker-sonic-telemetry latest eadc019e1177 487MB
docker-sonic-telemetry sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 eadc019e1177 487MB
docker-fpm-frr latest cecb8f7c5f6d 426MB
docker-fpm-frr sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023 cecb8f7c5f6d 426MB
This is a regression compared to the 201911 release.