CLI support for SmartSwitch PMON #3271
Changes from 180 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,6 +5,7 @@ | |
import re | ||
import subprocess | ||
import utilities_common.cli as clicommon | ||
from utilities_common.chassis import is_smartswitch, get_all_dpus | ||
|
||
TIMEOUT_SECS = 10 | ||
|
||
|
@@ -27,7 +28,10 @@ def get_config_module_state(db, chassis_module_name): | |
config_db = db.cfgdb | ||
fvs = config_db.get_entry('CHASSIS_MODULE', chassis_module_name) | ||
if not fvs: | ||
return 'up' | ||
if is_smartswitch(): | ||
return 'down' | ||
else: | ||
return 'up' | ||
else: | ||
return fvs['admin_status'] | ||
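
The default-state rule in this hunk can be sketched in isolation. This is a minimal sketch rather than the PR's code: `default_admin_state` and its `smartswitch` flag are hypothetical stand-ins for `get_config_module_state` and `is_smartswitch()`, and a plain dict stands in for the `CHASSIS_MODULE` entry.

```python
def default_admin_state(entry, smartswitch):
    """Return the admin state implied by a CHASSIS_MODULE entry.

    entry: dict of field/value pairs; empty when the module has no entry.
    smartswitch: hypothetical flag standing in for is_smartswitch().
    """
    if not entry:
        # No entry: on a SmartSwitch the module is treated as 'down',
        # on other platforms it is treated as 'up'.
        return 'down' if smartswitch else 'up'
    return entry['admin_status']


print(default_admin_state({}, smartswitch=True))
print(default_admin_state({}, smartswitch=False))
print(default_admin_state({'admin_status': 'up'}, smartswitch=True))
```

The practical consequence is that a missing entry no longer means the same thing on both platform types, so callers cannot treat "no entry" as "up" unconditionally.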
|
||
|
@@ -102,16 +106,21 @@ def fabric_module_set_admin_status(db, chassis_module_name, state): | |
# | ||
@modules.command('shutdown') | ||
@clicommon.pass_db | ||
@click.argument('chassis_module_name', metavar='<module_name>', required=True) | ||
@click.argument('chassis_module_name', | ||
metavar='<module_name>', | ||
required=True, | ||
type=click.Choice(get_all_dpus(), case_sensitive=False) if is_smartswitch() else str | ||
) | ||
def shutdown_chassis_module(db, chassis_module_name): | ||
"""Chassis-module shutdown of module""" | ||
config_db = db.cfgdb | ||
ctx = click.get_current_context() | ||
|
||
if not chassis_module_name.startswith("SUPERVISOR") and \ | ||
not chassis_module_name.startswith("LINE-CARD") and \ | ||
not chassis_module_name.startswith("FABRIC-CARD"): | ||
ctx.fail("'module_name' has to begin with 'SUPERVISOR', 'LINE-CARD' or 'FABRIC-CARD'") | ||
not chassis_module_name.startswith("FABRIC-CARD") and \ | ||
Review comment: We need to perform additional validation to check whether chassis_module_name is actually present (that is, whether it is a valid module name) if the user executes
Reply: @gpunathilell Added
||
not chassis_module_name.startswith("DPU"): | ||
|
||
ctx.fail("'module_name' has to begin with 'SUPERVISOR', 'LINE-CARD', 'FABRIC-CARD', 'DPU'") | ||
|
||
# To avoid duplicate operation | ||
if get_config_module_state(db, chassis_module_name) == 'down': | ||
|
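
The name check in `shutdown_chassis_module` combines a prefix test with, on SmartSwitch, a restriction to the known DPU names. The following stdlib-only sketch shows that combination without `click`. The function `validate_module_name` and its parameters are hypothetical; `known_dpus` stands in for whatever `get_all_dpus()` returns, and the case-insensitive match is an assumption modelled on `case_sensitive=False`.

```python
VALID_PREFIXES = ("SUPERVISOR", "LINE-CARD", "FABRIC-CARD", "DPU")


def validate_module_name(name, smartswitch, known_dpus):
    """Return the accepted module name or raise ValueError.

    smartswitch and known_dpus are hypothetical inputs standing in for
    is_smartswitch() and get_all_dpus().
    """
    if smartswitch:
        # Only names of DPUs that exist on this platform are accepted,
        # compared case-insensitively.
        by_upper = {dpu.upper(): dpu for dpu in known_dpus}
        if name.upper() not in by_upper:
            raise ValueError("unknown DPU module: {}".format(name))
        return by_upper[name.upper()]
    # str.startswith accepts a tuple of candidate prefixes.
    if not name.startswith(VALID_PREFIXES):
        raise ValueError(
            "'module_name' has to begin with 'SUPERVISOR', 'LINE-CARD', "
            "'FABRIC-CARD', 'DPU'")
    return name


print(validate_module_name("dpu1", True, ["DPU0", "DPU1"]))
print(validate_module_name("LINE-CARD0", False, []))
```

A prefix test alone accepts names such as `DPU99` that do not exist, which is why a membership test against the real module list is the stricter check.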
@@ -130,7 +139,11 @@ def shutdown_chassis_module(db, chassis_module_name): | |
# | ||
@modules.command('startup') | ||
@clicommon.pass_db | ||
@click.argument('chassis_module_name', metavar='<module_name>', required=True) | ||
@click.argument('chassis_module_name', | ||
metavar='<module_name>', | ||
required=True, | ||
type=click.Choice(get_all_dpus(), case_sensitive=False) if is_smartswitch() else str | ||
) | ||
def startup_chassis_module(db, chassis_module_name): | ||
"""Chassis-module startup of module""" | ||
config_db = db.cfgdb | ||
|
@@ -142,7 +155,12 @@ def startup_chassis_module(db, chassis_module_name): | |
return | ||
|
||
click.echo("Starting up chassis module {}".format(chassis_module_name)) | ||
config_db.set_entry('CHASSIS_MODULE', chassis_module_name, None) | ||
if is_smartswitch(): | ||
fvs = {'admin_status': 'up'} | ||
config_db.set_entry('CHASSIS_MODULE', chassis_module_name, fvs) | ||
else: | ||
config_db.set_entry('CHASSIS_MODULE', chassis_module_name, None) | ||
|
||
if chassis_module_name.startswith("FABRIC-CARD"): | ||
if not check_config_module_state_with_timeout(ctx, db, chassis_module_name, 'up'): | ||
fabric_module_set_admin_status(db, chassis_module_name, 'up') |
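
The startup hunk writes an explicit `admin_status` of `up` on SmartSwitch instead of clearing the entry. A small in-memory sketch shows why that matters once a missing entry is read as `down`. `FakeConfigDb` and `startup_module` are hypothetical names for illustration only; the fake assumes that passing `None` to `set_entry` removes the entry.

```python
class FakeConfigDb:
    """In-memory stand-in for the config DB; not the real connector."""

    def __init__(self):
        self.tables = {}

    def set_entry(self, table, key, data):
        rows = self.tables.setdefault(table, {})
        if data is None:
            # Assumption: None means "delete this entry".
            rows.pop(key, None)
        else:
            rows[key] = dict(data)

    def get_entry(self, table, key):
        return self.tables.get(table, {}).get(key, {})


def startup_module(config_db, name, smartswitch):
    """Mirror the two branches of the startup hunk."""
    if smartswitch:
        config_db.set_entry('CHASSIS_MODULE', name, {'admin_status': 'up'})
    else:
        config_db.set_entry('CHASSIS_MODULE', name, None)


db = FakeConfigDb()
db.set_entry('CHASSIS_MODULE', 'DPU0', {'admin_status': 'down'})
startup_module(db, 'DPU0', smartswitch=True)
print(db.get_entry('CHASSIS_MODULE', 'DPU0'))
```

On SmartSwitch, deleting the entry would leave the module reading as `down` under the new default, so the explicit write is what actually brings it up.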
Review comment: @rameshraghupathy In all these sample examples, can you specify where the CLI is being executed: on the NPU or on the DPU host?
Reply: @prgeor Added samples in the missing places.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -713,6 +713,32 @@ This command displays the cause of the previous reboot | |
User issued reboot command [User: admin, Time: Mon Mar 25 01:02:03 UTC 2019] | ||
``` | ||
|
||
``` | ||
Note: The CLI extensions shown in this block apply only to smartswitch platforms. On a regular switch the extensions are ignored and the output is the same regardless of the options. | ||
|
||
CLI Extensions Applicable to Smartswitch | ||
- show reboot-cause all | ||
- show reboot-cause history all | ||
- show reboot-cause history DPUx | ||
``` | ||
**show reboot-cause all** | ||
|
||
This command displays the cause of the previous reboot for the Switch and the DPUs for which the midplane interfaces are up. | ||
|
||
- Usage: | ||
``` | ||
show reboot-cause all | ||
``` | ||
|
||
- Example: | ||
Review comment: @rameshraghupathy Please also capture the CLI output specifically when run inside the DPU (even though it's the same as on a fixed chassis).
Reply: @prgeor Added the CLI output when run on the DPU.
||
``` | ||
root@MtFuji:~$ show reboot-cause all | ||
Review comment: @rameshraghupathy Will this list the DPUs that are in admin shut?
Reply: No. This will show the "show reboot-cause" output of the DPUs that are admin UP.
Reply: @rameshraghupathy Is this still true? With the new design change, the NPU can still find the last reboot cause of the DPU irrespective of its ADMIN state.
Reply: @rameshraghupathy Could you address Prince's comment?
||
Device Name Cause Time User | ||
-------- ------------------- ---------- ------ ------ | ||
NPU 2024_07_24_20_43_22 Power Loss N/A N/A | ||
DPU2 2024_07_24_20_43_22 Software causes (Reboot) N/A N/A | ||
DPU1 2024_07_24_20_43_22 Software causes (Reboot) N/A N/A | ||
``` | ||
**show reboot-cause history** | ||
|
||
This command displays the history of the previous reboots up to 10 entry | ||
|
@@ -733,6 +759,42 @@ This command displays the history of the previous reboots up to 10 entry | |
2020_10_09_04_53_58 warm-reboot Fri Oct 9 04:51:47 UTC 2020 admin | ||
``` | ||
|
||
**show reboot-cause history all** | ||
|
||
This command displays the history of the previous reboots, up to 10 entries, for the Switch and the DPUs whose midplane interfaces are up. | ||
|
||
- Usage: | ||
``` | ||
show reboot-cause history all | ||
``` | ||
|
||
- Example: | ||
``` | ||
root@MtFuji:~# show reboot-cause history all | ||
Device Name Cause Time User Comment | ||
-------- ------------------- ----------------------------------------- ------------------------------- ------ ------- | ||
NPU 2024_07_23_23_06_57 Kernel Panic Tue Jul 23 11:02:27 PM UTC 2024 N/A N/A | ||
|
||
NPU 2024_07_23_11_21_32 Power Loss N/A N/A Unknown | ||
``` | ||
|
||
**show reboot-cause history DPU1** | ||
|
||
This command displays the history of the previous reboots of DPU1, up to 10 entries. If DPU1 is powered down, there is no data in the DB and the "show reboot-cause history DPU1" output is blank. | ||
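
The two behaviours described for the per-DPU history (a cap of 10 entries, and blank output when the DB holds no data for the DPU) can be illustrated with a short sketch. The function `reboot_history` and the `records` layout (device name mapped to a newest-first list) are assumptions for illustration and do not reflect the actual DB schema.

```python
MAX_HISTORY = 10  # the description above states up to 10 entries


def reboot_history(records, device, limit=MAX_HISTORY):
    """Return at most `limit` history entries for `device`.

    records: hypothetical dict mapping device name -> list of entries,
    newest first. A device with no data yields an empty list, which
    corresponds to blank CLI output.
    """
    return list(records.get(device, []))[:limit]


records = {"DPU1": ["entry-{}".format(i) for i in range(12)]}
print(len(reboot_history(records, "DPU1")))
print(reboot_history(records, "DPU2"))
```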
|
||
- Usage: | ||
``` | ||
show reboot-cause history DPU1 | ||
|
||
``` | ||
|
||
- Example: | ||
``` | ||
root@MtFuji:~# show reboot-cause history DPU1 | ||
Device Name Cause Time User Comment | ||
-------- ------ ----------------------------------------- ------ ------ --------- | ||
DPU1 DPU1 Software causes (Hardware watchdog reset) N/A N/A N/A | ||
``` | ||
Review comment: Can we add sample outputs for system-health as well?
Reply: @gpunathilell Done
||
|
||
|
||
**show uptime** | ||
|
||
This command displays the current system uptime | ||
|
@@ -11165,6 +11227,36 @@ In addition, displays a list of all current 'Services' and 'Hardware' being moni | |
psu.voltage Ignored Device | ||
``` | ||
|
||
**show system-health dpu <option>** | ||
|
||
This is a smartswitch-specific CLI. It shows the midplane, control plane, and data plane health of the DPU modules in the smartswitch. | ||
|
||
"<option>" takes one of two forms: a DPU module name (for example, DPU0), or "all", which lists all the DPUs in the smartswitch. | ||
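
The way the option selects either one DPU or every DPU can be sketched as a small resolver. `resolve_dpu_option` and `all_dpus` are hypothetical names; how the real command treats unknown names or letter case is not stated in the source, so the error path and exact-match comparison here are assumptions.

```python
def resolve_dpu_option(option, all_dpus):
    """Map the CLI option to the list of DPU names to report on.

    all_dpus: hypothetical list of DPU names present on the platform.
    """
    if option == "all":
        # "all" expands to every DPU in the smartswitch.
        return list(all_dpus)
    if option in all_dpus:
        return [option]
    # Assumption: an unknown name is rejected rather than ignored.
    raise ValueError("unknown DPU: {}".format(option))


print(resolve_dpu_option("DPU0", ["DPU0", "DPU1"]))
print(resolve_dpu_option("all", ["DPU0", "DPU1"]))
```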
|
||
- Usage: | ||
``` | ||
show system-health dpu DPU0 | ||
``` | ||
|
||
- Example: | ||
``` | ||
root@MtFuji-dut:/home/cisco# show system-health dpu DPU0 | ||
Name Oper-Status State-Detail State-Value Time Reason | ||
------ ------------- ----------------------- ------------- ------------------------------- ------------------------------------------------------------------------------------ | ||
DPU0 Online dpu_midplane_link_state up Mon Dec 23 05:12:17 PM UTC 2024 | ||
dpu_control_plane_state up Mon Dec 23 05:12:17 PM UTC 2024 All containers are up and running, host-ethlink-status: Uplink1/1 is UP | ||
dpu_data_plane_state up Mon Dec 23 05:12:17 PM UTC 2024 DPU container named polaris is running, pdsagent running : OK, pciemgrd running : OK | ||
|
||
root@MtFuji-dut:/home/cisco# show system-health dpu all | ||
Name Oper-Status State-Detail State-Value Time Reason | ||
------ ------------- ----------------------- ------------- ------------------------------- ------------------------------------------------------------------------------------ | ||
DPU0 Online dpu_midplane_link_state up Mon Dec 23 05:12:17 PM UTC 2024 | ||
dpu_control_plane_state up Mon Dec 23 05:12:17 PM UTC 2024 All containers are up and running, host-ethlink-status: Uplink1/1 is UP | ||
dpu_data_plane_state up Mon Dec 23 05:12:17 PM UTC 2024 DPU container named polaris is running, pdsagent running : OK, pciemgrd running : OK | ||
DPU1 Online dpu_midplane_link_state up Mon Dec 23 05:12:17 PM UTC 2024 | ||
dpu_control_plane_state up Mon Dec 23 05:12:17 PM UTC 2024 All containers are up and running, host-ethlink-status: Uplink1/1 is UP | ||
dpu_data_plane_state up Mon Dec 23 05:12:17 PM UTC 2024 DPU container named polaris is running, pdsagent running : OK, pciemgrd running : OK | ||
|
||
Go Back To [Beginning of the document](#) or [Beginning of this section](#System-Health) | ||
|
||
## VLAN & FDB | ||
|