Description
During scale testing, not all MACs are synced to the kernel. This can be easily reproduced by learning 10K MACs on a port and doing a shutdown/no shutdown on it. Could this be due to fdbsyncd not checking whether the port is up in the kernel before programming the MAC?
The second scenario in which the problem occurs is when a MAC ages out in the kernel but not in the switch (for example, if the FDB aging time is increased on the switch). This results in fdbsyncd processing stale MAC notifications and reprogramming the MACs based on their status in state_db. In this scenario, exactly 8K MACs are always observed to be reprogrammed (could this be the size of the netlink buffer queue?).
Both scenarios result in many MACs not being synced to remote VTEPs in EVPN.
Steps to reproduce the issue:
- Learn 10K MACs on a port
- Shutdown the interface
- Startup the interface.
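After running the steps above, the mismatch can be quantified by comparing the FDB entries in state_db with those in the kernel bridge FDB. The following is a minimal sketch, not part of fdbsyncd. It assumes state_db keys of the form `FDB_TABLE|Vlan<id>:<mac>` and kernel lines of the form `<mac> dev <port> vlan <id> ...` as printed by `bridge fdb show`; the function names are hypothetical and the formats should be checked against the actual image.

```python
import re

# Assumed state_db key layout: FDB_TABLE|Vlan<id>:<mac>
STATE_KEY_RE = re.compile(
    r"^FDB_TABLE\|Vlan(\d+):((?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2})$"
)
MAC_RE = re.compile(r"^(?:[0-9a-f]{2}:){5}[0-9a-f]{2}$")


def parse_state_db_keys(keys):
    """Return a set of (vlan_id, mac) tuples from state_db FDB_TABLE keys."""
    entries = set()
    for key in keys:
        match = STATE_KEY_RE.match(key.strip())
        if match:
            entries.add((int(match.group(1)), match.group(2).lower()))
    return entries


def parse_bridge_fdb(text):
    """Return a set of (vlan_id, mac) tuples from `bridge fdb show` style output.

    Lines without a `vlan <id>` token (for example multicast or self entries)
    are skipped.
    """
    entries = set()
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or not MAC_RE.match(tokens[0].lower()):
            continue
        if "vlan" not in tokens:
            continue
        idx = tokens.index("vlan")
        if idx + 1 < len(tokens) and tokens[idx + 1].isdigit():
            entries.add((int(tokens[idx + 1]), tokens[0].lower()))
    return entries


def missing_in_kernel(state_db_keys, bridge_output):
    """List (vlan_id, mac) entries present in state_db but absent in the kernel."""
    return sorted(parse_state_db_keys(state_db_keys) - parse_bridge_fdb(bridge_output))
```

The inputs could be collected with `sonic-db-cli STATE_DB keys 'FDB_TABLE*'` and `bridge fdb show`. A non-empty result from `missing_in_kernel` after the interface comes back up would correspond to the MACs that were not synced.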
Describe the results you received:
Describe the results you expected:
Output of show version:
SONiC Software Version: SONiC.202205.42-ea51d9514_Internal
Distribution: Debian 11.5
Kernel: 5.10.0-12-2-amd64
Build commit: ea51d9514
Build date: Fri Oct 7 05:45:56 UTC 2022
Built by: sw-r2d2-bot@r-build-sonic-ci03-243
Platform: x86_64-mlnx_msn2700-r0
HwSKU: Mellanox-SN2700
ASIC: mellanox
ASIC Count: 1
Serial Number: MT1829X20804
Model Number: MSN2700-CB2F
Hardware Revision: A2
Uptime: 07:07:08 up 11:30, 3 users, load average: 2.17, 2.46, 2.58
Date: Wed 26 Oct 2022 07:07:08
Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-syncd-mlnx 202205.42-ea51d9514_Internal 97dd12cebe1e 859MB
docker-syncd-mlnx latest 97dd12cebe1e 859MB
docker-orchagent 202205.42-ea51d9514_Internal 347f0cdc723f 478MB
docker-orchagent latest 347f0cdc723f 478MB
docker-fpm-frr 202205.42-ea51d9514_Internal ddadceae2d69 488MB
docker-fpm-frr latest ddadceae2d69 488MB
docker-teamd 202205.42-ea51d9514_Internal 28f79f968d3c 459MB
docker-teamd latest 28f79f968d3c 459MB
docker-platform-monitor 202205.42-ea51d9514_Internal 629c9ea03cf2 861MB
docker-platform-monitor latest 629c9ea03cf2 861MB
docker-macsec latest a7ea8b95281f 461MB
docker-snmp 202205.42-ea51d9514_Internal 0e96a62d07ee 488MB
docker-snmp latest 0e96a62d07ee 488MB
docker-dhcp-relay latest 8cef09a39edf 452MB
docker-lldp 202205.42-ea51d9514_Internal 337146c6b971 485MB
docker-lldp latest 337146c6b971 485MB
docker-mux 202205.42-ea51d9514_Internal 464339799d55 492MB
docker-mux latest 464339799d55 492MB
docker-sonic-telemetry 202205.42-ea51d9514_Internal 7fc604d28c7c 523MB
docker-sonic-telemetry latest 7fc604d28c7c 523MB
docker-database 202205.42-ea51d9514_Internal 98a7bdcfd7e8 443MB
docker-database latest 98a7bdcfd7e8 443MB
docker-router-advertiser 202205.42-ea51d9514_Internal f05c810acb38 443MB
docker-router-advertiser latest f05c810acb38 443MB
docker-nat 202205.42-ea51d9514_Internal 272fda2cdf1a 430MB
docker-nat latest 272fda2cdf1a 430MB
docker-sflow 202205.42-ea51d9514_Internal 5723c8d63918 428MB
docker-sflow latest 5723c8d63918 428MB
docker-sonic-mgmt-framework 202205.42-ea51d9514_Internal 0fd3a3d91b98 557MB
docker-sonic-mgmt-framework latest 0fd3a3d91b98 557MB
urm.nvidia.com/sw-nbu-sws-sonic-docker/sonic-wjh 1.3.1-202205 4e8b9199b984 643MB
Output of show techsupport:
Issue 1 - sonic_dump_qa-eth-vt05-2-2700a1_20221026_070014
Issue 2 - sonic_dump_qa-eth-vt03-2-3700v_20221023_204313
Additional information you deem important (e.g. issue happens only occasionally):
sonic_dump_qa-eth-vt05-2-2700a1_20221026_070014.tar.gz
sonic_dump_qa-eth-vt03-2-3700v_20221023_204313.tar.gz