[warm-reboot] warm-reboot breaks if ssh session in which it was started drops #7127

Closed
sonic-net/sonic-utilities
#1529
@Hedgehog-Guru

Description

If the SSH session in which warm-reboot was started drops, warm-reboot breaks at the syncd stage and the switch becomes inoperable.

This is a regression compared to the 201911 version.

Steps to reproduce the issue

  1. Verify that warm-reboot is enabled on the DUT:
show platform mlnx issu
ISSU is enabled
  1. Apply VLAN, IP, IPv6, and BGP configuration to the DUT (to be close to a real production environment):
config interface ip add Loopback0 1.1.1.1/32
sonic-cfggen -j vlan.json --write-to-db       # random VLAN + IP + IPv6 configuration
config interface ip add Ethernet192 101.1.0.1/24
config interface ip add Ethernet192 2123::1/64
sonic-cfggen -j bgp_ipv4.json --write-to-db   # IPv4 BGP configuration
sonic-cfggen -j bgp_ipv6.json --write-to-db   # IPv6 BGP configuration

config save -y
config reload -y
  1. Run "warm-reboot -v" from an SSH session, then disconnect the session immediately to simulate an SSH session drop.

Describe the results you received
After the SSH session disconnects, the rcon session shows that warm-reboot stops at the syncd shutdown stage, and the system remains unavailable until it is rebooted.

Describe the results you expected
Warm-reboot should be resistant to SSH disconnects: it should not stop when the SSH session drops (the SSH client can be disrupted for any reason, for example when the server hangs up), and it should not lead to a system crash.
For a data center this is dangerous, because the system cannot be updated with confidence.
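
One way to make a long-running command independent of the SSH session that launched it is to have it ignore SIGHUP and stop using the session's terminal for input and output. The sketch below shows that idea with a hypothetical helper, run_hup_immune (not part of SONiC or sonic-utilities), using only POSIX shell features. It assumes the failure is related to the hang-up signal or to the lost terminal, which this issue does not confirm, so it is a possible workaround to try rather than a fix.

```shell
#!/bin/sh
# Hypothetical helper, not part of SONiC or sonic-utilities.
# Usage: run_hup_immune LOG_FILE COMMAND [ARG...]
# Starts COMMAND in the background so that it
#   - ignores SIGHUP (ignored signals stay ignored across exec), and
#   - reads from /dev/null and writes to LOG_FILE instead of the terminal.
# The PID of the started command is available in $! afterwards.
run_hup_immune() {
    log_file=$1
    shift
    (
        trap '' HUP
        exec "$@" </dev/null >"$log_file" 2>&1
    ) &
}
```

For example, run_hup_immune /tmp/warm-reboot.log warm-reboot -v would start the command from the reproduction steps with its output going to a log file (the path is an arbitrary choice). The standard nohup and setsid utilities offer similar protection.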

Output of show version

SONiC Software Version: SONiC.sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023
Distribution: Debian 10.8
Kernel: 4.19.0-12-2-amd64
Build commit: 5cb07fad
Build date: Fri Mar 19 12:59:10 UTC 2021
Built by: vadymh@r-build-sonic03

Platform: x86_64-mlnx_msn3800-r0
HwSKU: ACS-MSN3800
ASIC: mellanox
ASIC Count: 1
Serial Number: MT1937X00565
Uptime: 13:31:35 up 3 min,  1 user,  load average: 1.08, 0.84, 0.37

Docker images:
REPOSITORY                    TAG                                                   IMAGE ID            SIZE
docker-syncd-mlnx             latest                                                41b21b570cb4        662MB
docker-syncd-mlnx             sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   41b21b570cb4        662MB
docker-sflow                  latest                                                daec01591301        409MB
docker-sflow                  sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   daec01591301        409MB
docker-snmp                   latest                                                8b1fd0019321        439MB
docker-snmp                   sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   8b1fd0019321        439MB
docker-dhcp-relay             latest                                                6ccc99d7ad71        405MB
docker-dhcp-relay             sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   6ccc99d7ad71        405MB
docker-teamd                  latest                                                d99c06d2ed81        408MB
docker-teamd                  sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   d99c06d2ed81        408MB
docker-nat                    latest                                                422e6bfc371f        411MB
docker-nat                    sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   422e6bfc371f        411MB
docker-router-advertiser      latest                                                d97a9c6d2e9a        398MB
docker-router-advertiser      sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   d97a9c6d2e9a        398MB
docker-platform-monitor       latest                                                c30aaf3e6de1        689MB
docker-platform-monitor       sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   c30aaf3e6de1        689MB
docker-lldp                   latest                                                ecf251673b0a        438MB
docker-lldp                   sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   ecf251673b0a        438MB
docker-database               latest                                                3ee3d100394d        398MB
docker-database               sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   3ee3d100394d        398MB
docker-sonic-mgmt-framework   latest                                                4d5e159ef26e        617MB
docker-sonic-mgmt-framework   sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   4d5e159ef26e        617MB
docker-orchagent              latest                                                cd97498e4604        427MB
docker-orchagent              sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   cd97498e4604        427MB
docker-sonic-telemetry        latest                                                eadc019e1177        487MB
docker-sonic-telemetry        sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   eadc019e1177        487MB
docker-fpm-frr                latest                                                cecb8f7c5f6d        426MB
docker-fpm-frr                sonic_build_358_nbrmgrd_fix.0-dirty-20210319.125023   cecb8f7c5f6d        426MB

This is a regression compared to the 201911 release.

sonic_dump_r-qa-sw-eth-2322_20210322_135917.tar.gz

Labels: Triaged (this issue has been triaged)
