Lab5 - add tx error monitor #1

Closed
wants to merge 10 commits into from
EdenGri
Owner

@EdenGri EdenGri commented Mar 10, 2022

What I did

Added a new orch agent, txmonitororch, which monitors TX errors on switch ports. If the number of errors on a port within a configurable time period reaches the configurable threshold, the port is deemed “Not OK”.
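
The threshold check described above can be sketched roughly as follows. This is only an illustration, not the PR's code: `TxErrorMonitor`, `TxStatus`, `poll` and `setThreshold` are hypothetical names, and the sketch assumes that the monitor reads a cumulative error counter once per period and compares the increase against the threshold. It also assumes a strict "greater than" comparison; the description's "reaches" could equally be read as "greater than or equal".

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical types for illustration; the PR's real classes may differ.
enum class TxStatus { Ok, NotOk };

class TxErrorMonitor {
public:
    explicit TxErrorMonitor(std::uint64_t threshold) : m_threshold(threshold) {}

    void setThreshold(std::uint64_t threshold) { m_threshold = threshold; }

    // Assumed to be called once per polling period with the port's
    // cumulative TX error counter (e.g. SAI_PORT_STAT_IF_OUT_ERRORS).
    TxStatus poll(const std::string& port, std::uint64_t txErrors) {
        std::uint64_t& last = m_lastSeen[port];  // 0 the first time a port is seen
        // If the counter went backwards, assume it was reset and count from 0.
        const std::uint64_t delta = txErrors >= last ? txErrors - last : txErrors;
        last = txErrors;
        // Assumption: strictly above the threshold means "Not OK".
        return delta > m_threshold ? TxStatus::NotOk : TxStatus::Ok;
    }

private:
    std::uint64_t m_threshold;
    std::unordered_map<std::string, std::uint64_t> m_lastSeen;
};
```

Keeping the last seen value per port means the status reflects errors in the most recent period only, so a port can return to "OK" once errors stop accumulating.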

Why I did it

To expose the TX status of each port on the switch.

How I verified it

Enter the following CLI commands.

Test 1:

  1. show tx-error-monitor status
  2. choose the oid of one port with the status "ok"
  3. stop the counters:
    counterpoll port disable
  4. set the tx counter of the port above the threshold:
    redis-cli -n 2 HSET COUNTERS:oid: SAI_PORT_STAT_IF_OUT_ERRORS "200"
  5. verify that the status of the port changes to "not ok":
    show tx-error-monitor status

Test 2:

  1. show tx-error-monitor status
  2. choose the oid of one port with the status "ok"
  3. stop the counters:
    counterpoll port disable
  4. set the threshold to 200:
    config tx-config threshold 200
  5. set the tx counter of the port to 200, which does not exceed the new threshold:
    redis-cli -n 2 HSET COUNTERS:oid: SAI_PORT_STAT_IF_OUT_ERRORS "200"
  6. verify that the status of the port is still "ok":
    show tx-error-monitor status

Details if related

@EdenGri EdenGri closed this Aug 23, 2022
EdenGri pushed a commit that referenced this pull request Oct 10, 2022
Currently, ASAN sometimes reports the BufferOrch::m_buffer_type_maps and QosOrch::m_qos_maps as leaked. However, their lifetime is the lifetime of a process so they are not really 'leaked'.
This also adds a simple way to add more suppressions later if required.

Example of ASAN report:

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7f96aa952d30 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xead30)
    #1 0x55ca1da9f789 in __static_initialization_and_destruction_0 /__w/2/s/orchagent/bufferorch.cpp:39
    #2 0x55ca1daa02af in _GLOBAL__sub_I_bufferorch.cpp /__w/2/s/orchagent/bufferorch.cpp:1321
    #3 0x55ca1e2a9cd4  (/usr/bin/orchagent+0xe89cd4)

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7f96aa952d30 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xead30)
    #1 0x55ca1da6d2da in __static_initialization_and_destruction_0 /__w/2/s/orchagent/qosorch.cpp:80
    #2 0x55ca1da6ecf2 in _GLOBAL__sub_I_qosorch.cpp /__w/2/s/orchagent/qosorch.cpp:2000
    #3 0x55ca1e2a9cd4  (/usr/bin/orchagent+0xe89cd4)

- What I did
Added an lsan suppression config with static variable leak suppression

- Why I did it
To suppress ASAN false positives

- How I verified it
Ran a test that produces the static variable leak report and checked that these leaks are suppressed in the report.

Signed-off-by: Yakiv Huryk <[email protected]>
EdenGri pushed a commit that referenced this pull request Oct 10, 2022
…onic-net#2446)

* Add events publish

* Added header file

* signature fix

* syntax

* syntax

* syntax

* syntax

* syntax

* syntax

* Updated fake code

* Remove if and log messages for event_publish

* Remove if and log messages for event_publish (#1)

* Remove event_handle_t from signature and add globally

* Remove extern orchdaemon.cpp

* Revert unneeded changes

Co-authored-by: zbud-msft <[email protected]>
Co-authored-by: Zain Budhwani <[email protected]>
EdenGri pushed a commit that referenced this pull request Dec 7, 2022