-
Notifications
You must be signed in to change notification settings - Fork 0
[portsorch] fix errors when moving port from one lag to another. #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
a8322b0
to
c1a6c09
Compare
In schenario that is executed in sonic-mgmt in test_po_update.py a portchannel member is deleted from one portchannel and added to another portchannel: ``` config portchannel member del PortChannel9999 Ethernet0; config portchannel member add PortChannel0001 Ethernet0; ``` It is possible that requests from teamsynd will arrive in different order: ``` 2021-06-17.14:01:19.405245|LAG_MEMBER_TABLE:PortChannel0001:Ethernet0|SET|status:enabled 2021-06-17.14:01:19.405318|LAG_MEMBER_TABLE:PortChannel999:Ethernet0|DEL ``` This reordering happenes because teamsyncd has single event handler/selectable TeamSync::TeamPortSync::onChange() per team device so when two of them are ready it is swss::Select implementation detail in which order they are going to be returned. Signed-off-by: Stepan Blyschak <[email protected]>
c1a6c09
to
eee6941
Compare
orchagent/portsorch.cpp
Outdated
/* Assert the port is not a LAG */ | ||
assert(!port.m_lag_id); | ||
|
||
if (port.m_lag_member_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
port.m_lag_member_id != SAI_NULL_OBJECT_ID
|
||
if (port.m_lag_member_id) | ||
{ | ||
SWSS_LOG_NOTICE("Port %s is already a LAG member", port.m_alias.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print m_lag_id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually m_lag_id is only set when port is LAG :(
Signed-off-by: stepanb <[email protected]>
**What I did** Fix the Mem Leak by moving the raw pointers in type_maps to use smart pointers **Why I did it** ``` Indirect leak of 83776 byte(s) in 476 object(s) allocated from: #0 0x7f0a2a414647 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99 #1 0x5555590cc923 in __gnu_cxx::new_allocator, std::allocator > const, referenced_object> > >::allocate(unsigned long, void const*) /usr/include/c++/10/ext/new_allocator.h:115 #2 0x5555590cc923 in std::allocator_traits, std::allocator > const, referenced_object> > > >::allocate(std::allocator, std::allocator > const, referenced_object> > >&, unsigned long) /usr/include/c++/10/bits/alloc_traits.h:460 #3 0x5555590cc923 in std::_Rb_tree, std::allocator >, std::pair, std::allocator > const, referenced_object>, std::_Select1st, std::allocator > const, referenced_object> >, std::less, std::allocator > >, std::allocator, std::allocator > const, referenced_object> > >::_M_get_node() /usr/include/c++/10/bits/stl_tree.h:584 #4 0x5555590cc923 in std::_Rb_tree_node, std::allocator > const, referenced_object> >* std::_Rb_tree, std::allocator >, std::pair, std::allocator > const, referenced_object>, std::_Select1st, std::allocator > const, referenced_object> >, std::less, std::allocator > >, std::allocator, std::allocator > const, referenced_object> > >::_M_create_node, std::allocator > const&>, std::tuple<> >(std::piecewise_construct_t const&, std::tuple, std::allocator > const&>&&, std::tuple<>&&) /usr/include/c++/10/bits/stl_tree.h:634 #5 0x5555590cc923 in std::_Rb_tree_iterator, std::allocator > const, referenced_object> > std::_Rb_tree, std::allocator >, std::pair, std::allocator > const, referenced_object>, std::_Select1st, std::allocator > const, referenced_object> >, std::less, std::allocator > >, std::allocator, std::allocator > const, referenced_object> > >::_M_emplace_hint_unique, std::allocator > const&>, std::tuple<> >(std::_Rb_tree_const_iterator, std::allocator > const, referenced_object> >, std::piecewise_construct_t const&, std::tuple, std::allocator > const&>&&, std::tuple<>&&) /usr/include/c++/10/bits/stl_tree.h:2461 #6 0x5555590e8757 in std::map, std::allocator >, referenced_object, std::less, std::allocator > >, std::allocator, std::allocator > const, referenced_object> > >::operator[](std::__cxx11::basic_string, std::allocator > const&) /usr/include/c++/10/bits/stl_map.h:501 #7 0x5555590d48b0 in Orch::setObjectReference(std::map, std::allocator >, std::map, std::allocator >, referenced_object, std::less, std::allocator > >, std::allocator, std::allocator > const, referenced_object> > >*, std::less, std::allocator > >, std::allocator, std::allocator > const, std::map, std::allocator >, referenced_object, std::less, std::allocator > >, std::allocator, std::allocator > const, referenced_object> > >*> > >&, std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator > const&) orchagent/orch.cpp:450 #8 0x5555594ff66b in QosOrch::handleQueueTable(Consumer&, std::tuple, std::allocator >, std::__cxx11::basic_string, std::allocator >, std::vector, std::allocator >, std::__cxx11::basic_string, std::allocator > >, std::allocator, std::allocator >, std::__cxx11::basic_string, std::allocator > > > > >&) orchagent/qosorch.cpp:1763 #9 0x5555594edbd6 in QosOrch::doTask(Consumer&) orchagent/qosorch.cpp:2179 #10 0x5555590c8743 in Consumer::drain() orchagent/orch.cpp:241 #11 0x5555590c8743 in Consumer::drain() orchagent/orch.cpp:238 #12 0x5555590c8743 in Consumer::execute() orchagent/orch.cpp:235 #13 0x555559090dad in OrchDaemon::start() orchagent/orchdaemon.cpp:755 #14 0x555558e9be25 in main orchagent/main.cpp:766 #15 0x7f0a299b6d09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09) ```
**What I did** Fix the Mem Leak by moving the raw pointers in type_maps to use smart pointers **Why I did it** ``` Indirect leak of 83776 byte(s) in 476 object(s) allocated from: #0 0x7f0a2a414647 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99 #1 0x5555590cc923 in __gnu_cxx::new_allocator, std::allocator > const, referenced_object> > >::allocate(unsigned long, void const*) /usr/include/c++/10/ext/new_allocator.h:115 #2 0x5555590cc923 in std::allocator_traits, std::allocator > const, referenced_object> > > >::allocate(std::allocator, std::allocator > const, referenced_object> > >&, unsigned long) /usr/include/c++/10/bits/alloc_traits.h:460 #3 0x5555590cc923 in std::_Rb_tree, std::allocator >, std::pair, std::allocator > const, referenced_object>, std::_Select1st, std::allocator > const, referenced_object> >, std::less, std::allocator > >, std::allocator, std::allocator > const, referenced_object> > >::_M_get_node() /usr/include/c++/10/bits/stl_tree.h:584 #4 0x5555590cc923 in std::_Rb_tree_node, std::allocator > const, referenced_object> >* std::_Rb_tree, std::allocator >, std::pair, std::allocator > const, referenced_object>, std::_Select1st, std::allocator > const, referenced_object> >, std::less, std::allocator > >, std::allocator, std::allocator > const, referenced_object> > >::_M_create_node, std::allocator > const&>, std::tuple<> >(std::piecewise_construct_t const&, std::tuple, std::allocator > const&>&&, std::tuple<>&&) /usr/include/c++/10/bits/stl_tree.h:634 #5 0x5555590cc923 in std::_Rb_tree_iterator, std::allocator > const, referenced_object> > std::_Rb_tree, std::allocator >, std::pair, std::allocator > const, referenced_object>, std::_Select1st, std::allocator > const, referenced_object> >, std::less, std::allocator > >, std::allocator, std::allocator > const, referenced_object> > >::_M_emplace_hint_unique, std::allocator > const&>, std::tuple<> >(std::_Rb_tree_const_iterator, std::allocator > const, referenced_object> >, std::piecewise_construct_t const&, std::tuple, std::allocator > const&>&&, std::tuple<>&&) /usr/include/c++/10/bits/stl_tree.h:2461 #6 0x5555590e8757 in std::map, std::allocator >, referenced_object, std::less, std::allocator > >, std::allocator, std::allocator > const, referenced_object> > >::operator[](std::__cxx11::basic_string, std::allocator > const&) /usr/include/c++/10/bits/stl_map.h:501 #7 0x5555590d48b0 in Orch::setObjectReference(std::map, std::allocator >, std::map, std::allocator >, referenced_object, std::less, std::allocator > >, std::allocator, std::allocator > const, referenced_object> > >*, std::less, std::allocator > >, std::allocator, std::allocator > const, std::map, std::allocator >, referenced_object, std::less, std::allocator > >, std::allocator, std::allocator > const, referenced_object> > >*> > >&, std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator > const&) orchagent/orch.cpp:450 #8 0x5555594ff66b in QosOrch::handleQueueTable(Consumer&, std::tuple, std::allocator >, std::__cxx11::basic_string, std::allocator >, std::vector, std::allocator >, std::__cxx11::basic_string, std::allocator > >, std::allocator, std::allocator >, std::__cxx11::basic_string, std::allocator > > > > >&) orchagent/qosorch.cpp:1763 #9 0x5555594edbd6 in QosOrch::doTask(Consumer&) orchagent/qosorch.cpp:2179 #10 0x5555590c8743 in Consumer::drain() orchagent/orch.cpp:241 #11 0x5555590c8743 in Consumer::drain() orchagent/orch.cpp:238 #12 0x5555590c8743 in Consumer::execute() orchagent/orch.cpp:235 #13 0x555559090dad in OrchDaemon::start() orchagent/orchdaemon.cpp:755 #14 0x555558e9be25 in main orchagent/main.cpp:766 #15 0x7f0a299b6d09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09) ```
In schenario that is executed in sonic-mgmt in test_po_update.py a
portchannel member is deleted from one portchannel and added to another
portchannel:
It is possible that requests from teamsynd will arrive to PortsOrch
different order:
Consider following python simulated flow between teamsyncd and orchangent:
This is a fundamental issue of Producer/ConsumerStateTable, thus orchagent must be aware of this and treat it as normal situation and figure out the right order and not crash or print an errors.
Signed-off-by: Stepan Blyschak [email protected]
What I did
Check if port is already a lag member beforehand.
Added an UT to cover this scenario, this UT verifies that SAI API is not called in this case.
Refactored portsorch_ut.cpp by moving out Orchs creation/deletion into SetUp()/TearDown()
Why I did it
To fix errors in log.
How I verified it
Ran test_po_update.py test.
No VS tests are needed, this is non-functional issue, just prints errors in logs.
Details if related