The CPU Packet Debug Counters feature provides BRCM KNET driver-level Rx and Tx statistics for each protocol type at the switch and interface levels. SONiC KNET currently provides CPU packet counters based on CPU queues, which indirectly provides per protocol Rx counters since each protocol is assigned a separate CPU queue. This feature enhances CPU packet statistics by adding per protocol counters per interface for the CPU Rx and Tx directions. Packet drop reason details per protocol/queue per interface are also added for more debug visibility.
## 1.1 Requirements
### 1.1.1 Functional Requirements
1. Support switch and per interface Tx and Rx per protocol KNET driver level packet counters
2. Support switch and per interface Tx and Rx per protocol Linux/KNET dropped packet counters
3. Support switch and per interface Tx and Rx KNET error reason detail counters
4. Support switch and per interface clearing of Tx and Rx counters
### 1.1.2 Configuration and Management Requirements
Statistics will be provided via click CLIs and KNET procfs files
### 1.1.3 Scaling and Performance Requirements
There should be minimal impact on CPU Pkt IO performance and latency. CPU Pkt IO will be profiled to measure the impact of the added pkt type processing on performance and latency.
### 1.1.4 Warm Boot Requirements
Not applicable
# 2 Functionality
## 2.1 Target Deployment Use Cases
This debugging enhancement provides more granular debug counters and additional visibility into the CPU Pkt IO network driver (KNET) and Linux kernel pkt path that will help with debugging SONiC systems.
## 2.2 Functional Description
See feature overview
# 3 Design
## 3.1 Overview
Protocol classification logic will be added to the Broadcom KNET driver pkt Tx and Rx callbacks to identify protocols and enable pkt protocol type accounting per physical interface. The statistics output will be available via KNET driver procfs files and click CLIs for convenience.
### 3.1.1 Rx Classification and Packet drop counting
The Broadcom KNET driver filter infrastructure will be used to classify Rx packets (and CPU queues from pkt metadata) and tag the protocol type via logic added in the KNET callback module. This allows KNET packet stats accounting to maintain per protocol and per interface information.
Any drops before KNET filter processing (matching src port) will not have src port information so pkt drops will count towards the global protocol switch counters and will not update the protocol interface counters. After KNET filter processing determines the src port, both global and per interface protocol counters will be updated.
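
The following is a minimal userspace sketch of this accounting decision, assuming hypothetical counter structures and names (the real KNET driver uses its own internal types): the switch-level counter is always updated, while the per-interface counter is updated only once the source port has been resolved by filter processing.

```
#include <stdint.h>
#include <stdio.h>

/* Hypothetical counter layout for illustration only; not the KNET driver's
 * actual structures. */
enum pkt_proto { PROTO_UNKNOWN = 0, PROTO_TTL, PROTO_SFLOW, PROTO_MAX };

struct proto_counters {
    uint64_t rx_pkts[PROTO_MAX];   /* per protocol Rx packets */
    uint64_t rx_drops[PROTO_MAX];  /* per protocol Rx drops   */
};

#define MAX_PORTS        64
#define SRC_PORT_UNKNOWN (-1)

static struct proto_counters switch_stats;           /* switch-level counters  */
static struct proto_counters port_stats[MAX_PORTS];  /* per-interface counters */

/* Account an Rx packet or drop. Before filter processing resolves the source
 * port, only the switch-level counter is updated; afterwards both are. */
static void rx_account(enum pkt_proto proto, int src_port, int dropped)
{
    if (dropped) switch_stats.rx_drops[proto]++; else switch_stats.rx_pkts[proto]++;

    if (src_port >= 0 && src_port < MAX_PORTS) {
        if (dropped) port_stats[src_port].rx_drops[proto]++;
        else         port_stats[src_port].rx_pkts[proto]++;
    }
}

int main(void)
{
    rx_account(PROTO_TTL, SRC_PORT_UNKNOWN, 1);  /* early drop: switch counter only   */
    rx_account(PROTO_SFLOW, 5, 0);               /* post-filter: switch + interface 5 */
    printf("switch TTL Rx drops: %llu\n",
           (unsigned long long)switch_stats.rx_drops[PROTO_TTL]);
    return 0;
}
```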
### 3.1.2 Tx Classification
The Broadcom KNET driver will be modified to provide a similar KNET filter infrastructure as the Rx path to allow classification of pkt types as they are received by the KNET driver from the Linux kernel. The pkt skbuff will have destination port information from the Linux kernel so per interface protocol counters and any drop counters will be updated.
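
As a rough illustration, the sketch below uses stand-in types in place of the kernel's sk_buff/net_device (the port field and all names are assumptions) to show why the Tx path can always update both counter levels: the destination netdev, and hence the port, is already known when the packet enters the KNET driver.

```
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for struct net_device / struct sk_buff; purely illustrative. */
struct netif { int port; const char *name; };
struct pkt   { struct netif *dev; const uint8_t *data; size_t len; };

enum pkt_proto { PROTO_UNKNOWN = 0, PROTO_LLDP, PROTO_MAX };

static uint64_t switch_tx[PROTO_MAX];      /* switch-level Tx counters  */
static uint64_t port_tx[64][PROTO_MAX];    /* per-interface Tx counters */

/* Tiny example classifier: LLDP is EtherType 0x88cc, everything else unknown. */
static enum pkt_proto classify_tx(const uint8_t *d, size_t len)
{
    return (len >= 14 && d[12] == 0x88 && d[13] == 0xcc) ? PROTO_LLDP
                                                         : PROTO_UNKNOWN;
}

/* Tx hook: the destination port comes from the skb's netdev, so both the
 * switch-level and per-interface counters are updated for every packet. */
static void knet_tx_hook(struct pkt *p)
{
    enum pkt_proto proto = classify_tx(p->data, p->len);
    switch_tx[proto]++;
    port_tx[p->dev->port][proto]++;
}

int main(void)
{
    uint8_t lldp[64] = { [12] = 0x88, [13] = 0xcc };
    struct netif eth0 = { .port = 0, .name = "Ethernet0" };
    struct pkt p = { .dev = &eth0, .data = lldp, .len = sizeof lldp };

    knet_tx_hook(&p);
    printf("%s LLDP Tx: %llu\n", eth0.name,
           (unsigned long long)port_tx[0][PROTO_LLDP]);
    return 0;
}
```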
### 3.1.3 Protocol Classification Criteria
Packets will be parsed and classified according to the criteria in the following table. Any packets that cannot be classified will have counters updated in the "unknown" category.

| Protocol | CPU Queue | Classification Criteria |
|----------|-----------|-------------------------|
| MTU | 4 | Not classified (Rx queue counters only) |
| Sflow | 3 | Rx: Pkt sample metadata, Tx: (L4Port == 6343 for inband sflow datagram to collector) |
| TTL | 0 | TTL == 0 or TTL == 1 |
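
As an illustration of the criteria visible in the table above, the sketch below classifies a raw frame using only the TTL and sflow Tx rows (the remaining protocol rows and the Rx sample-metadata check are omitted); function and enum names are hypothetical, not the driver's.

```
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

enum pkt_proto { PROTO_UNKNOWN = 0, PROTO_TTL, PROTO_SFLOW };

/* Classify an Ethernet/IPv4 frame using the TTL and sflow (Tx) rows above.
 * Anything that does not match falls into the "unknown" category. */
static enum pkt_proto classify_frame(const uint8_t *f, size_t len)
{
    if (len < 14 + 20)
        return PROTO_UNKNOWN;                      /* too short for Ethernet + IPv4 */

    uint16_t ethertype = (uint16_t)((f[12] << 8) | f[13]);
    if (ethertype != 0x0800)
        return PROTO_UNKNOWN;                      /* only IPv4 parsed in this sketch */

    const uint8_t *ip = f + 14;
    size_t ihl  = (size_t)(ip[0] & 0x0f) * 4;      /* IPv4 header length in bytes */
    uint8_t ttl = ip[8];

    if (ttl == 0 || ttl == 1)
        return PROTO_TTL;                          /* TTL == 0 or TTL == 1 */

    if (ip[9] == 17 && len >= 14 + ihl + 8) {      /* UDP */
        uint16_t dport = (uint16_t)((ip[ihl + 2] << 8) | ip[ihl + 3]);
        if (dport == 6343)
            return PROTO_SFLOW;                    /* inband sflow datagram to collector */
    }
    return PROTO_UNKNOWN;
}

int main(void)
{
    uint8_t frame[34] = {0};
    frame[12] = 0x08; frame[13] = 0x00;            /* EtherType IPv4   */
    frame[14] = 0x45;                              /* version 4, IHL 5 */
    frame[22] = 1;                                 /* TTL = 1          */
    printf("class = %d (expect %d)\n", classify_frame(frame, sizeof frame), PROTO_TTL);
    return 0;
}
```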
### 3.1.4 Packet Drop Error types
Errors in KNET (and in Rx kernel path) which result in packet drops are counted per protocol/queue on each interface. Some common errors are listed below.

| Error | Description |
|-------|-------------|
| LINK_DOWN | Kernel network interface was in down state |
| NO_SKB | Packet SKB alloc failed |
| NO_BUFFER | No DMA buffer resource available (congested PktIO) |
| KERNEL_DROP | Rx packet drop in Linux kernel network stack |
| NO_FILTER_MATCH | Rx packet did not match any KNET filters (no handlers) |
| UNKN_NETIF | Unknown src network interface for Rx packet |
| HW_RESET | Packet dropped due to HW reset cleanup in progress |
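
The sketch below shows one plausible way (hypothetical names and layout, not the actual driver code) to keep per-reason drop counters keyed by interface and protocol, with a switch-level total kept alongside for drops where the interface is not yet known.

```
#include <stdint.h>
#include <stdio.h>

/* Drop reasons mirroring the table above; counter layout is an assumption. */
enum drop_reason {
    DROP_LINK_DOWN = 0, DROP_NO_SKB, DROP_NO_BUFFER, DROP_KERNEL_DROP,
    DROP_NO_FILTER_MATCH, DROP_UNKN_NETIF, DROP_HW_RESET, DROP_REASON_MAX
};

static const char *drop_reason_str[DROP_REASON_MAX] = {
    "LINK_DOWN", "NO_SKB", "NO_BUFFER", "KERNEL_DROP",
    "NO_FILTER_MATCH", "UNKN_NETIF", "HW_RESET",
};

#define MAX_PORTS  64
#define MAX_PROTOS 16

/* drops indexed by [port][protocol][reason]; switch-level totals alongside. */
static uint64_t port_drops[MAX_PORTS][MAX_PROTOS][DROP_REASON_MAX];
static uint64_t switch_drops[MAX_PROTOS][DROP_REASON_MAX];

static void count_drop(int port, int proto, enum drop_reason why)
{
    switch_drops[proto][why]++;
    if (port >= 0 && port < MAX_PORTS)          /* port may be unknown on early Rx drops */
        port_drops[port][proto][why]++;
}

int main(void)
{
    count_drop(0, 1, DROP_LINK_DOWN);
    count_drop(-1, 1, DROP_NO_FILTER_MATCH);    /* unknown port: switch counter only */

    for (int r = 0; r < DROP_REASON_MAX; r++)
        if (switch_drops[1][r])
            printf("%-16s %llu\n", drop_reason_str[r],
                   (unsigned long long)switch_drops[1][r]);
    return 0;
}
```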
## 3.2 DB Changes
There are no new changes to SONiC DBs.
## 3.3 Switch State Service Design
### 3.3.1 Orchestration Agent
There are no new changes to Orchagent.
## 3.4 SyncD
There are no new changes to SyncD.
## 3.5 Manageability
There are no new changes to manageability infrastructure.
## 3.6 CLI
KNET packet stats at switch and interface levels can be shown through click commands or by dumping the KNET pkt_stats procfs file. Rx and Tx error counter details are available for protocol and Rx queue counters at switch and interface levels.
### 3.6.1 Protocol Stats
* Show protocol stats (real output will not show entries with zero counts)

The new debug counters introduced at switch and interface levels (with drop details) will be added to techsupport collection with 2 samples each.
# 7 Warm Boot Support
No warmboot support. CPU Pkt IO counters are maintained in the KNET kernel network driver, which is cleared on CPU reset during a warmboot operation.
# 8 Scalability
## 8.1 Profiling and Tuning
The impact of the added pkt type identification processing on CPU packet IO throughput and latency will be profiled. The impact is expected to be minimal.
# 9 Unit Test
## 9.1 Functional Test Cases
### 9.1.1 Protocol Stats
1. Verify Rx and Tx pkt counters on switch/interface levels for each protocol
2. Verify KERNEL_DROP Rx drop counter on switch/interface levels for selected protocols
3. Verify LINK_DOWN Rx and Tx drop counters on switch/interface levels for selected protocols
4. Verify NO_BUFFER Tx drop counters on switch/interface levels for selected protocols
5. Verify clear stats for switch and interface level protocol counters
### 9.1.2 Rx Queue Stats
1. Verify Rx queues and description match COPP config
2. Verify Rx queue pkt counters on switch/interface levels for each protocol
3. Verify KERNEL_DROP Rx queue drop counter on switch/interface levels for selected protocols
4. Verify LINK_DOWN Rx queue drop counters on switch/interface levels for selected protocols
5. Verify clear stats for switch and interface level Rx queue counters
## 9.2 Warm Boot Test Cases
Not applicable
## 9.3 Negative Test Cases
# 10 Internal Design Information
## 10.1 Guidance on simulating triggers for testing packet drops
### 10.1.1 LINK_DOWN pkt drops
* The following command can directly set netdevice link status down while physical link is up. Packets will continue to be punted up to CPU but will drop with LINK_DOWN reason. CPU Tx direction should also show drops
```
DUT# echo "Ethernet0=down" > /proc/bcm/knet/link
DUT# cat /proc/bcm/knet/link
Software link status:
Ethernet0 down
```
### 10.1.2 KERNEL_DROP Rx pkt drops
* Send LACP packets to the DUT without the src port configured on a portchannel. The Linux kernel will drop the packet since the src port has not been registered as a portchannel member in the teamd kernel module (portchannel kernel component).
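
For reference, one possible way to generate such packets from a directly connected test host is a raw-socket sender like the hedged sketch below: it emits a mostly zeroed Slow Protocols (EtherType 0x8809) frame with the LACP subtype to the LACP multicast MAC, which is enough to be trapped to the DUT CPU and then dropped in the kernel when the ingress port is not a portchannel member. The interface name is an argument; root privileges are required, and the LACPDU body is intentionally left empty since only the Rx drop path is being exercised.

```
/* Build: gcc -o lacp_tx lacp_tx.c   Run (as root, on the test host): ./lacp_tx eth0 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <linux/if_ether.h>
#include <netpacket/packet.h>
#include <sys/socket.h>

int main(int argc, char **argv)
{
    if (argc != 2) { fprintf(stderr, "usage: %s <iface>\n", argv[0]); return 1; }

    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_ll addr = {0};
    addr.sll_family  = AF_PACKET;
    addr.sll_ifindex = (int)if_nametoindex(argv[1]);
    addr.sll_halen   = ETH_ALEN;
    memcpy(addr.sll_addr, "\x01\x80\xc2\x00\x00\x02", ETH_ALEN);  /* LACP multicast MAC */

    unsigned char frame[124] = {0};
    memcpy(frame, "\x01\x80\xc2\x00\x00\x02", 6);   /* dst: 01:80:c2:00:00:02         */
    /* src MAC left zeroed for simplicity                                              */
    frame[12] = 0x88; frame[13] = 0x09;             /* EtherType: Slow Protocols       */
    frame[14] = 0x01;                                /* subtype: LACP                   */
    frame[15] = 0x01;                                /* LACP version 1; TLVs left zero  */

    if (sendto(fd, frame, sizeof frame, 0, (struct sockaddr *)&addr, sizeof addr) < 0)
        perror("sendto");

    close(fd);
    return 0;
}
```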
### 10.1.3 NO_BUFFER Tx pkt drops
* Decrease the KNET Tx buffers to a low value (8 or 4) to increase the chance of congestion, then send a flood ping (e.g., `ping -f <target>`) to congest the CPU Tx path and induce out-of-DMA-buffer Tx drops. This operation should be used for testing only, with all interfaces down. Changing the max number of Tx buffers may cause a crash if packets are being processed in KNET.
```
echo "max_tx_dcbs=4" > /proc/bcm/knet/dma
```
* Then check the Tx high watermark stats after sending CPU Tx burst.
```
DUT# cat /proc/bcm/knet/dstats | grep "Tx used DCBs hi wm"
```