Description
If two config load_minigraph runs overlap, the second run might fail to restart the mux container when the container was started by the first run but mux.service is still in the activating state.
The packet sockets opened by linkmgrd are then invalidated (Iface shows -1), and any I/O on them fails with ENXIO:
admin@str2-7050cx3-acs-06:~$ cat /proc/net/packet
sk RefCnt Type Proto Iface R Rmem User Inode
00000000e387b4b8 2 3 0003 -1 0 0 0 3905243
000000004de5cd30 2 3 0003 -1 0 0 0 3905245
0000000025b13c8e 2 3 0003 -1 0 0 0 3906908
00000000eb4e03d3 2 3 0003 -1 0 0 0 3905247
0000000040e9cf12 2 3 0003 -1 0 0 0 3906910
00000000b50da572 2 3 0003 -1 0 0 0 3905250
00000000cdd99133 2 3 0003 -1 0 0 0 3905251
00000000610eddd1 2 3 0003 -1 0 0 0 3906953
00000000bcca7a88 2 3 0003 -1 0 0 0 3905256
000000006cdb1e05 2 3 0003 -1 0 0 0 3905325
000000008cfcecea 2 3 0003 -1 0 0 0 3905326
000000007440716d 2 3 0003 -1 0 0 0 3905811
00000000656aa72e 2 3 0003 -1 0 0 0 3906984
0000000001fd475e 2 3 0003 -1 0 0 0 3905339
0000000013268c02 2 3 0003 -1 0 0 0 3905341
00000000aad1e9f6 2 3 0003 -1 0 0 0 3906986
00000000af21576f 2 3 0003 -1 0 0 0 3906988
00000000634b790f 2 3 0003 -1 0 0 0 3907626
00000000467c795f 2 3 0003 -1 0 0 0 3905360
000000005dd925d8 2 3 0003 -1 0 0 0 3907009
00000000a543c7b6 2 3 0003 -1 0 0 0 3907011
0000000061394386 2 3 0003 -1 0 0 0 3905945
0000000083b64f36 2 3 0003 -1 0 0 0 3907041
00000000a98e1f3f 2 3 0003 -1 0 0 0 3907071
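The check on the output above can be automated. The following is a minimal sketch, assuming the nine-column layout shown in the header line (sk, RefCnt, Type, Proto, Iface, R, Rmem, User, Inode). The names PacketSocket, parse_proc_net_packet and invalidated_sockets are hypothetical helpers introduced for illustration; they are not part of SONiC or linkmgrd.

```python
from typing import List, NamedTuple


class PacketSocket(NamedTuple):
    """One row of /proc/net/packet (only the fields used here)."""
    sk: str
    iface: int
    inode: int


def parse_proc_net_packet(text: str) -> List[PacketSocket]:
    """Parse the content of /proc/net/packet into PacketSocket rows.

    Assumes the first line is the header and that each data row has
    at least nine whitespace-separated columns, as in the output above.
    """
    sockets = []
    for line in text.strip().splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 9:
            continue  # ignore malformed or truncated rows
        sockets.append(
            PacketSocket(sk=fields[0], iface=int(fields[4]), inode=int(fields[8]))
        )
    return sockets


def invalidated_sockets(sockets: List[PacketSocket]) -> List[PacketSocket]:
    """Return the sockets whose Iface column is -1."""
    return [s for s in sockets if s.iface == -1]


if __name__ == "__main__":
    with open("/proc/net/packet") as f:
        bad = invalidated_sockets(parse_proc_net_packet(f.read()))
    for s in bad:
        print(f"invalidated packet socket: sk={s.sk} inode={s.inode}")
```

Run on the affected device, the script prints one line per socket whose Iface is -1. Mapping an inode back to the linkmgrd process is left out of the sketch.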
Steps to reproduce the issue:
This can be reproduced by:
- Make the mux service sleep for 30s during the ExecStartPre phase.
admin@lab-dev:~$ sudo systemctl cat mux.service
# /etc/systemd/system/mux.service
[Unit]
Description=MUX Cable Container
Requires=database.service updategraph.service swss.service
After=swss.service interfaces-config.service
BindsTo=sonic.target
After=sonic.target
[Service]
ExecStartPre=/usr/local/bin/write_standby.py -r
ExecStartPre=/usr/local/bin/mark_dhcp_packet.py
ExecStartPre=/usr/bin/mux.sh start
ExecStartPre=/usr/bin/sleep 30
ExecStart=/usr/bin/mux.sh wait
ExecStop=/usr/bin/mux.sh stop
ExecStopPost=/usr/local/bin/write_standby.py --shutdown mux
- Restart mux.service and ensure that it is stuck in the “activating” state because of the sleep.
admin@lab-dev:~$ sudo systemctl restart mux.service --no-block
admin@lab-dev:~$ systemctl status mux.service
● mux.service - MUX Cable Container
Loaded: loaded (/etc/systemd/system/mux.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/mux.service.d
└─auto_restart.conf
Active: activating (start-pre) since Tue 2024-08-06 03:10:30 UTC; 6s ago
Process: 586122 ExecStartPre=/usr/local/bin/write_standby.py -r (code=exited, status=0/SUCCESS)
Process: 586216 ExecStartPre=/usr/local/bin/mark_dhcp_packet.py (code=exited, status=0/SUCCESS)
Process: 586386 ExecStartPre=/usr/bin/mux.sh start (code=exited, status=0/SUCCESS)
Cntrl PID: 586439 (sleep)
Tasks: 1 (limit: 19126)
Memory: 152.0K
CGroup: /system.slice/mux.service
└─586439 /usr/bin/sleep 30
- Restart sonic.target and verify that the packet sockets of linkmgrd become invalidated (Iface is -1).
admin@str2-7050cx3-acs-06:~$ sudo systemctl restart sonic.target
admin@str2-7050cx3-acs-06:~$ cat /proc/net/packet
sk RefCnt Type Proto Iface R Rmem User Inode
00000000e387b4b8 2 3 0003 -1 0 0 0 3905243
000000004de5cd30 2 3 0003 -1 0 0 0 3905245
0000000025b13c8e 2 3 0003 -1 0 0 0 3906908
00000000eb4e03d3 2 3 0003 -1 0 0 0 3905247
0000000040e9cf12 2 3 0003 -1 0 0 0 3906910
00000000b50da572 2 3 0003 -1 0 0 0 3905250
00000000cdd99133 2 3 0003 -1 0 0 0 3905251
00000000610eddd1 2 3 0003 -1 0 0 0 3906953
00000000bcca7a88 2 3 0003 -1 0 0 0 3905256
000000006cdb1e05 2 3 0003 -1 0 0 0 3905325
000000008cfcecea 2 3 0003 -1 0 0 0 3905326
000000007440716d 2 3 0003 -1 0 0 0 3905811
00000000656aa72e 2 3 0003 -1 0 0 0 3906984
0000000001fd475e 2 3 0003 -1 0 0 0 3905339
0000000013268c02 2 3 0003 -1 0 0 0 3905341
00000000aad1e9f6 2 3 0003 -1 0 0 0 3906986
00000000af21576f 2 3 0003 -1 0 0 0 3906988
00000000634b790f 2 3 0003 -1 0 0 0 3907626
00000000467c795f 2 3 0003 -1 0 0 0 3905360
000000005dd925d8 2 3 0003 -1 0 0 0 3907009
00000000a543c7b6 2 3 0003 -1 0 0 0 3907011
0000000061394386 2 3 0003 -1 0 0 0 3905945
0000000083b64f36 2 3 0003 -1 0 0 0 3907041
00000000a98e1f3f 2 3 0003 -1 0 0 0 3907071
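When scripting the reproduction, the second step (confirming that mux.service is still activating before sonic.target is restarted) can be checked programmatically instead of reading the systemctl status output by eye. The following is a rough sketch, assuming the KEY=VALUE output format of systemctl show with the ActiveState and SubState properties. The function names are hypothetical and are not part of any SONiC tooling.

```python
import subprocess
from typing import Dict


def parse_unit_state(show_output: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines as printed by `systemctl show`."""
    state = {}
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # keep only lines that actually contain '='
            state[key.strip()] = value.strip()
    return state


def is_activating(show_output: str) -> bool:
    """True if the unit reports ActiveState=activating."""
    return parse_unit_state(show_output).get("ActiveState") == "activating"


def query_unit(unit: str = "mux.service") -> str:
    """Return the raw ActiveState/SubState output for a unit."""
    result = subprocess.run(
        ["systemctl", "show", unit, "--property=ActiveState,SubState"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    output = query_unit()
    print(parse_unit_state(output))
    print("still activating:", is_activating(output))
```

With the sleep in place, the unit should report ActiveState=activating and SubState=start-pre for about 30 seconds after the restart, which is the window in which sonic.target needs to be restarted.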
Describe the results you received:
Describe the results you expected:
Output of show version:
(paste your output here)
Output of show techsupport:
(paste your output here or download and attach the file here)