Fix PFC_WD test #479
- added ignore message patterns for PFCWD test to skip port counters discovery
- increased timeout for one of the testcases
- fixed expected message for another testcase

Signed-off-by: Andriy Moroz <[email protected]>
@@ -33,7 +33,7 @@
 set_fact:
 pfc_wd_detect_time: 200
 pfc_wd_restore_time: 200
-pfc_wd_restore_time_large: 30000
+pfc_wd_restore_time_large: 50000
The current test only runs on one port, so 50s is fine. Since we are going to iterate over all the ports, waiting this long will make the test run for hours, as each port will wait around 2 minutes for this alone every time. We will have to make this parameter smaller in the future.
the value of 50s was approved by @marian-pritsak :)
50s is too large once we iterate over all the ports, for the reason above. We could set it to 50s for now. Please be aware that we need to change it to around 3-5s once we enable the test over all the ports @marian-pritsak
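The scaling concern in this thread can be made concrete with a little arithmetic. The sketch below is illustrative only: the port count of 32 and the 5-second per-port wait are hypothetical values that do not come from the test, and the 120-second figure is simply the "around 2 minutes" per-port estimate from the comment above.

```python
def total_wait_seconds(num_ports: int, wait_per_port_s: float) -> float:
    """Total time spent waiting when ports are tested one after another."""
    return num_ports * wait_per_port_s


NUM_PORTS = 32  # hypothetical port count, not taken from the PR

# Roughly 2 minutes of waiting per port, as estimated in the review comment.
slow = total_wait_seconds(NUM_PORTS, 120)

# Hypothetical per-port wait of 5 s, assuming a much smaller restore time.
fast = total_wait_seconds(NUM_PORTS, 5)

print(f"slow: {slow / 60:.1f} min, fast: {fast / 60:.1f} min")
```

Under these assumed numbers, a single waiting step already costs about an hour across all ports, which is why the reviewer asks for a restore time of a few seconds once the test iterates over every port.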
@@ -118,6 +118,7 @@
 - name: Config tests - Check forward action configuration.
 vars:
 command_to_run: "sonic-cfggen -j {{ run_dir }}/pfc_wd_fwd_action.json --write-to-db"
+test_ignore_file: config_test_ignore_messages
Can we use set_fact to set test_ignore_file as a global variable? It will be used across the test, including the wrong-config tests that are expected to produce errors.
We need this only for particular test cases.
Setting a global variable can mask some potential errors.
Can you help me understand which cases in config_test.yml don't need this?
The exceptions I've added to config_test_ignore_messages are basically related to counter discovery errors. When syncd (orchagent?) tries to find out which counters are available, it queries them one by one and SAI reports whether each counter is available. Unfortunately, when a counter is not supported/implemented/etc., SAI also prints an error to the log, and LogAnalyzer treats this as a test failure.
It is possible that SAI implementations on other platforms do not print such messages (or maybe all counters are available/implemented?)
I agree with you that these are unrelated errors. From my understanding, they are unrelated in all cases in the PFCWD tests, and we might need to ignore them whenever they happen.
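The mechanism discussed in this thread (error lines in the log fail the test unless they match an ignore pattern) can be sketched as follows. This is a minimal illustration, not the actual LogAnalyzer code: the function name, the ignore pattern, and the sample log lines are all hypothetical and do not reproduce real SAI or syncd output.

```python
import re

# Hypothetical ignore pattern for "counter not supported" noise.
IGNORE_PATTERNS = [r"counter \S+ is not supported"]

# Hypothetical log lines, invented for this sketch.
SAMPLE_LOG = [
    "ERR syncd: hypothetical: counter FOO is not supported",
    "INFO pfcwd: started",
    "ERR orchagent: hypothetical: failed to apply config",
]


def unexpected_errors(log_lines, ignore_patterns):
    """Return error lines that do not match any ignore pattern."""
    compiled = [re.compile(p) for p in ignore_patterns]
    return [
        line
        for line in log_lines
        if "ERR" in line and not any(rx.search(line) for rx in compiled)
    ]


# With the ignore list, only the genuine failure remains.
print(unexpected_errors(SAMPLE_LOG, IGNORE_PATTERNS))
```

The sketch also shows the trade-off raised earlier in the thread: the broader the set of test cases an ignore list is applied to, the more likely it is to hide an error that a particular test case should have caught.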
@@ -174,6 +175,7 @@
 - name: Clean up config
 vars:
 command_to_run: "pfcwd stop"
+test_ignore_file: config_test_ignore_messages
Same as above
@@ -258,6 +258,7 @@
 - name: Apply drop config to {{ pfc_wd_test_port }}.
 vars:
 command_to_run: "pfcwd start --action drop --restoration-time {{ pfc_wd_restore_time_large }} {{ ports }} {{ pfc_wd_detect_time }}"
+test_ignore_file: ignore_pfc_wd_messages
Consider setting this as a global variable?
Signed-off-by: Andriy Moroz <[email protected]>
@sihuihan88 can you please summarize what is still pending? The recent changes in the tests are not working on other testbeds, and we need to move forward with this.
I will merge it first and update the test later according to the comments provided.
* msft_github/master: (111 commits) add disconnect/connect vm to testbed-cli.sh (sonic-net#566) [dhcp_relay] Increase sleep duration to allow LAG and BGP to come up (sonic-net#565) [pfcwd]: support t0-116 and clean up the code (sonic-net#563) [fanout]: remove vrf management in arista fanout deploy templates (sonic-net#562) [fanout-switch-deploy] Support multiple speeds and port breakout (sonic-net#561) [extract_log] Improve extract_log script (sonic-net#559) Disable upgrade_sonic retry (sonic-net#560) Support for Sonic fanout (sonic-net#555) [Fanout deploy template] enable root user on fanout switches (sonic-net#557) [link state] match exact dut name in the link list (sonic-net#556) [pfcwd]: cache the ansible facts (sonic-net#554) Fix kernel version check (sonic-net#553) [pfcwd]: increase the pause waiting time and ingore snmp errors (sonic-net#551) fix typo (sonic-net#552) [dir_bcast] enable dir_bcast test on t0-116 topology (sonic-net#550) use command to gather host distribution, kernel version facts (sonic-net#549) [minigraph templte] use consistent VLAN subnet (sonic-net#546) [VM config] skip podset 0 tor 0 routing entry (sonic-net#545) Remove job minigraph_facts from boot_onie (sonic-net#548) [fast-reboot] pass VM IP in as ASCII strings (sonic-net#547) [fast-reboot test] fix syslog reading issue (sonic-net#543) add dataacl to minigraph template (sonic-net#544) [service_acl] Make test reliable when testing Arista service ACL solution (sonic-net#542) [pfcwd]: Iterate functional test over all ports (sonic-net#490) [service_acl] Detect expected output message even if it is followed by other text (sonic-net#541) [testbed]: Remove connection local for port_alias module (sonic-net#540) Revert "[minigraph-gen] fix AclInterface entries in minigraph (sonic-net#538)" (sonic-net#539) [minigraph-gen] fix AclInterface entries in minigraph (sonic-net#538) add retries in onie installation (sonic-net#537) Use connection plugin to install sonic image in ONIE. 
(sonic-net#536) [dhcp_relay]: Add --relax flag to ptf command (sonic-net#535) [minigraph_facts] use minigraph on DUT (sonic-net#534) Fix minigraph_facts: mkdir recursively (sonic-net#533) [minigraph_facts] use mingraph on DUT to test (sonic-net#532) generate minigraph based on topology file (sonic-net#531) Need to double-escape when using 'args' syntax (sonic-net#529) Fix improper 'local_action' syntax (sonic-net#528) Unify style of 'wait_for' actions across playbooks (sonic-net#527) [minigraph_facts] retrieving dhcp server list from vlan configuration instead of DhcpResources (sonic-net#526) [snmp_facts] increase get command timeout to fix cpu test failure (sonic-net#525) [everflow_test]: Add copy ptftests folder to use the remote.py file (sonic-net#522) Fix snmp_facts on PSU oid (sonic-net#520) Fix snmp queue test (sonic-net#519) [lag_test]: Remove the unnecessary testbed_type check (sonic-net#518) [ip_decap_test]: Support t0-64 topology (sonic-net#517) [acl_test]: Copy ptftests folder for the remote.py file (sonic-net#516) Add test case for PSU (sonic-net#514) [mtu]: Add t1-64-lag topology support for MTU test (sonic-net#513) [ip_decap]: Add t1-64-lag support in the script for the list of source port (sonic-net#512) [everflow]: Add missing spaces in ptf command (sonic-net#511) [acl_test]: Add ptf_platform_dir: ptftests to use customized platform code to support 64 ports (sonic-net#510) [crm]: Implement test for CRM (sonic-net#473) [everflow]: Add support for t1-64-lag topology (sonic-net#502) Fix sonic_image_version: get from sonic_version.yml, no dependency on grub (sonic-net#508) [topology]: Update t1-64-lag topology template to add AclInterfaces piece (sonic-net#505) Pull syncd-rpc with sonic version tag (sonic-net#507) Remove leading and trailing whitespaces when reading veos file (sonic-net#506) [lag_2] remove hard coded interval_count so it can be set by test (sonic-net#503) ptf_runner: Add one line comment for ptf_platform_dir (sonic-net#501) [testbed]: 
add port speed and fec configuration in sonic fanout (sonic-net#498) [typo]: Replace string t1-lag-64 with t1-64-lag (sonic-net#499) Adding sensor data for S6100 (sonic-net#496) [fib_test]: Add t1-64-lag src_ports in FIB test (sonic-net#497) Fix typo in acl test case name (sonic-net#494) Add one more Mellanox SKU string in everflow_tb_test script (sonic-net#495) Adding sensor data for Z9100 (sonic-net#492) [service_acl] Make test more robust and efficient (sonic-net#489) [dhcp relay test] adding more test scenarios (sonic-net#440) fix sanity check failed to recover (sonic-net#488) Fix PFC_WD test (sonic-net#479) add sonic fanout support (sonic-net#485) Update README.test.md Update README.test.md Fix table caption in testbed.csv and documentation (sonic-net#482) [test case] Add test: restart swss service (sonic-net#483) Add test case port toggle (sonic-net#484) [test infrastructure] allow overriding recover system actioin (sonic-net#480) [SNMP]add new SNMP counters tests to snmp.yml (sonic-net#477) add t0-52 topology (sonic-net#476) add command line option for creategraph.py (sonic-net#475) Add support for additional timestamp format (sonic-net#474) Ignore ansible output in extract_log (sonic-net#472) Add extract_logs action to concatenate logs after log rotate (sonic-net#471) Add ACL ICMP test (sonic-net#2) (sonic-net#465) [pfcwd]:add docker exec to avoid tty error (sonic-net#470) [test_tag]remove pfc_wd test tag from test by tag main yaml (sonic-net#469) [ansible]gather fact by default (sonic-net#468) [sensors]fix Dell S6000 sensors test fail (sonic-net#462) [vlan]improve test to wait a bit longer for config reload (sonic-net#463) one more place for hwsku of new Mellanox 2700 (sonic-net#464) [sensors]sensors test add new hwsku for Mellanox 2700 (sonic-net#461) [PFCWD]: test enhancement (sonic-net#456) [sensors] remove redundant sensor data definitin for sku MSN2100 (sonic-net#460) when call testcase by name, fetch all vms management info from testbed_facts 
(sonic-net#457) [acl test] error in ACL rule json file for destination ip (sonic-net#434) [sensor data] add sensor data for Mellanox MSN2410 and MSN2100 (sonic-net#449) Fix sku-sensors-data for 7050-QX32 (sonic-net#454) [deploy minigraph]add default enable BGP to deploy minigraph step (sonic-net#455) [upgrade]save bgp UP state after they are brough up (sonic-net#453) [reboot test] call sudo reboot to reboot dut (sonic-net#452) ...
Signed-off-by: Andriy Moroz [email protected]
Type of change
Approach
How did you do it?
How did you verify/test it?
ran on testbed
Any platform specific information?
no