-
Notifications
You must be signed in to change notification settings - Fork 593
Conversation
@@ -115,7 +115,7 @@ bool StMgrClientMgr::DidAnnounceBackPressure() { | |||
return stream_manager_->DidAnnounceBackPressure(); | |||
} | |||
|
|||
shared_ptr<StMgrClient> StMgrClientMgr::CreateClient(const sp_string& _other_stmgr_id, | |||
StMgrClient* StMgrClientMgr::CreateClient(const sp_string& _other_stmgr_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. This is strange because shared_ptr<> should be safer and I can't tell what can be wrong in the original code. @dnrusakov should have better idea.
@@ -115,7 +115,7 @@ bool StMgrClientMgr::DidAnnounceBackPressure() { | |||
return stream_manager_->DidAnnounceBackPressure(); | |||
} | |||
|
|||
shared_ptr<StMgrClient> StMgrClientMgr::CreateClient(const sp_string& _other_stmgr_id, | |||
StMgrClient* StMgrClientMgr::CreateClient(const sp_string& _other_stmgr_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@huijunw
What you have basically done is allocating some memory but never cleaning it up.
It may lead to memory leaks.
You don't observe that problem any more because now you never call a destructor for StMgrClientMgr
(previously shared_ptr
called the destructor of StMgrClientMgr
as soon as the object is out of scope or the destructor of enclosing object is called).
To fix the original issue you need to run stream manager under Valgrind, then reproduce the crash locally, then take a look at the Valgrind log to see what is the precise stack trace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @nwangtw
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @nwangtw
Thanks.
Yeah I think you are right. Shared ptr is reference based hence this crash is quite strange. The only possible way is a bad cast somewhere that cause the ref count messed up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when a new physical plan is issued from tmaster, the stmgr-client-manager swaps the old stmgr out from array and explicitly calls quit() destructor.
btw, this pr just reverts it to v0.20.0, those lines have not been changed till v0.20.1 for 4 years and we did not see any issue during those 4 years. thus believe it is safe
Co-authored-by: Huijun Wu <[email protected]>
Co-authored-by: Huijun Wu <[email protected]>
fix #3477