Fix failing DPB LAG tests #1919

Merged
merged 1 commit into from
Sep 23, 2021

dgsudharsan
Collaborator

What I did
Fixed the failing DPB LAG tests.

Why I did it
The DPB LAG test was not passing. The test case is meant to cover a race condition on port removal, where the port delete arrives at orchagent before the LAG removal. (The breakout command removes the dependencies automatically.) In that case the port delete should wait until the dependencies are removed.
To mimic this, the app DB LAG member table entry should be created first and removed only after the DPB command. The test previously mimicked this with the config DB LAG table, which is not the actual use case. Moreover, using the config_db LAG table introduces a race condition: teamd calls removeLAGMember on link down, so the flow completes, and when the actual LAG member removal from config_db is then issued, teamd crashes. As a result, the subsequent LAG remove call goes unhandled, leaving residual LAG objects in ASIC DB and failing the tests.
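
The ordering the test has to reproduce can be sketched with a toy model. This is only an illustration under assumptions, not orchagent code: `ToyPortsOrch` and all of its methods are hypothetical names, and only `Ethernet0` and `PortChannel0001` are taken from the test itself.

```python
class ToyPortsOrch:
    """Hypothetical stand-in for orchagent's port handling (not the real PortsOrch)."""

    def __init__(self):
        self.ports = set()
        self.lag_members = {}            # port name -> LAG name
        self.pending_port_deletes = []   # deletes deferred on dependencies

    def add_port(self, port):
        self.ports.add(port)

    def add_lag_member(self, lag, port):
        self.lag_members[port] = lag

    def del_port(self, port):
        # A port that is still a LAG member cannot be removed yet, so defer it.
        if port in self.lag_members:
            if port not in self.pending_port_deletes:
                self.pending_port_deletes.append(port)
            return False
        self.ports.discard(port)
        return True

    def del_lag_member(self, port):
        self.lag_members.pop(port, None)
        # A dependency is gone, so retry any deferred port deletes.
        self._retry_pending()

    def _retry_pending(self):
        still_pending = []
        for port in self.pending_port_deletes:
            if port in self.lag_members:
                still_pending.append(port)
            else:
                self.ports.discard(port)
        self.pending_port_deletes = still_pending


def run_scenario():
    orch = ToyPortsOrch()
    orch.add_port("Ethernet0")
    orch.add_lag_member("PortChannel0001", "Ethernet0")

    # Port delete arrives first (as after a breakout command).
    removed_immediately = orch.del_port("Ethernet0")
    present_while_member = "Ethernet0" in orch.ports

    # LAG member removal arrives later and unblocks the port delete.
    orch.del_lag_member("Ethernet0")
    present_after = "Ethernet0" in orch.ports
    return removed_immediately, present_while_member, present_after
```

In this model the port delete is held back while the LAG member entry exists and completes only once that entry is removed, which is the sequence the test drives through app DB.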

How I verified it
Modified the tests to operate on app DB instead of config DB.

Details if related


# 2. Add Ethernet0 to PortChannel0001.
self.dvs_lag.create_port_channel_member(lag, p.get_name())
Collaborator

This one is a common fixture. Is there an issue with using the common function?

Collaborator Author

The issue with using the common function is that it acts on config_db, which involves a flow through teamd, and teamd can remove a port channel member on link down and on some other events.
In the DPB flow this config removal is handled by teamd, which then generates the app_db removal call. But that call arrives after the port remove call, which was the issue.
So in the test case we should delay the app_db removal rather than start again from the config_db removal.
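
A rough sketch of why the config_db path is fragile in the test, following the sequence given in the PR description. Everything here is an assumption for illustration: `ToyTeamd` and its handlers are hypothetical, and the crash is modeled as a simple flag rather than real teamd behavior.

```python
class ToyTeamd:
    """Hypothetical stand-in for teamd; the crash is modeled as a flag."""

    def __init__(self):
        self.alive = True
        self.lags = set()
        self.members = set()       # (lag, port) pairs teamd still tracks
        self.app_db_deletes = []   # removals pushed toward orchagent

    def add_member(self, lag, port):
        self.lags.add(lag)
        self.members.add((lag, port))

    def on_link_down(self, lag, port):
        # teamd drops the member on its own when the link goes down.
        if self.alive and (lag, port) in self.members:
            self.members.remove((lag, port))
            self.app_db_deletes.append(("LAG_MEMBER", lag, port))

    def on_config_member_delete(self, lag, port):
        if not self.alive:
            return
        if (lag, port) not in self.members:
            # Member is already gone: model the crash reported in the PR.
            self.alive = False
            return
        self.members.remove((lag, port))
        self.app_db_deletes.append(("LAG_MEMBER", lag, port))

    def on_config_lag_delete(self, lag):
        if not self.alive:
            return  # the request goes unhandled
        self.lags.discard(lag)
        self.app_db_deletes.append(("LAG", lag))


def run_config_db_flow():
    teamd = ToyTeamd()
    teamd.add_member("PortChannel0001", "Ethernet0")
    teamd.on_link_down("PortChannel0001", "Ethernet0")             # first removal
    teamd.on_config_member_delete("PortChannel0001", "Ethernet0")  # second removal
    teamd.on_config_lag_delete("PortChannel0001")                  # never processed
    return teamd.lags
```

In this model the LAG is still present at the end of the flow, which corresponds to the residual LAG objects that failed the tests. Writing to app_db directly keeps teamd out of the sequence.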

@prsunny
Collaborator

prsunny commented Sep 21, 2021

@zhenggen-xu to review

@prsunny prsunny merged commit a89d1f8 into sonic-net:master Sep 23, 2021
raphaelt-nvidia pushed a commit to raphaelt-nvidia/sonic-swss that referenced this pull request Oct 5, 2021
*Fixed the failed DPB LAG tests
EdenGri pushed a commit to EdenGri/sonic-swss that referenced this pull request Feb 28, 2022
…c-net#1919)

#### What I did
Change filename per review comments from last PR.
Removed print statement from change_applier
Added vlan validator & test code for the same
@dgsudharsan dgsudharsan deleted the dpb_lag_test branch March 9, 2023 02:00