SHM Buffer recovery mechanism <1.10.x> [8220] #1159
max_allocations * ((sizeof(BufferNode) + per_allocation_extra_size_) + per_allocation_extra_size_);
uint32_t allocation_extra_size =
    sizeof(SegmentNode) + per_allocation_extra_size_ +
    (max_allocations * sizeof(BufferNode)) + per_allocation_extra_size_ +
I don't know if this should be the following (please explain):
    (max_allocations * sizeof(BufferNode)) + per_allocation_extra_size_ +
    (max_allocations * (sizeof(BufferNode) + per_allocation_extra_size_)) +
The allocation algorithm used by boost consumes some extra memory for each allocation you make. That is because the internal tree used to track every buffer the allocator hands out is also stored in the shared memory segment, so not all of the reserved size is free for your use. During segment construction, a test is performed to estimate how much extra memory is used per allocation (per_allocation_extra_size_). The segment is therefore over-sized so that it can hold:
- The segment node allocation.
- The BufferNode pool allocation.
- At least max_allocations user buffers.
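The sizing rule described above can be sketched as follows. This is only an illustration under assumptions: `SegmentNode` and `BufferNode` are hypothetical stand-ins with made-up sizes, `segment_overhead` is not a function from the PR, and the last term (one per-allocation overhead for each user buffer) is inferred from the explanation because the diff line is cut off after the trailing `+`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the real shared-memory structures.
// Their sizes here are arbitrary and only serve the arithmetic.
struct SegmentNode { std::uint8_t bytes[64]; };
struct BufferNode  { std::uint8_t bytes[16]; };

// Extra bytes to reserve on top of the user payload size.
// per_allocation_extra_size is the allocator bookkeeping cost measured
// for ONE allocation, whatever the size of that allocation.
constexpr std::size_t segment_overhead(
        std::size_t max_allocations,
        std::size_t per_allocation_extra_size)
{
    return
        // 1) The segment node: one allocation.
        sizeof(SegmentNode) + per_allocation_extra_size +
        // 2) The BufferNode pool: max_allocations nodes, but still ONE allocation.
        (max_allocations * sizeof(BufferNode)) + per_allocation_extra_size +
        // 3) User buffers: up to max_allocations separate allocations
        //    (assumed term, inferred from the explanation).
        (max_allocations * per_allocation_extra_size);
}
```

Because the pool is a single allocation, the per-allocation overhead is paid once for all nodes rather than once per node, which is why the overhead is added outside the `max_allocations * sizeof(BufferNode)` product.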
I would update the comments, then
* Refs #8219. Change BufferNode to pre-allocated pool with fixed addresses for the nodes during entire life-cycle. Signed-off-by: AdolfoMartinez <[email protected]>
* Refs #8219. SHM Buffer invalidation implementation. Signed-off-by: AdolfoMartinez <[email protected]>
* Refs #8219. logWarning -> logInfo when segment overflow. Signed-off-by: AdolfoMartinez <[email protected]>
* Refs #8219. Optimization. Signed-off-by: AdolfoMartinez <[email protected]>
* Refs #8219. Style changes. Signed-off-by: AdolfoMartinez <[email protected]>
* Refs #8219. buffer_recover test. Signed-off-by: AdolfoMartinez <[email protected]>
* Refs #8219. 'error:' string removed from log msg. Signed-off-by: AdolfoMartinez <[email protected]>
* RobustInterprocessCondition implementation (#1147)
  * Refs #8212. RobustInterprocessCondition implementation. Signed-off-by: AdolfoMartinez <[email protected]>
  * Refs #8212. Condition tests. Signed-off-by: AdolfoMartinez <[email protected]>
  * Refs #8183. Bad SHM structures alignment in some platforms. Signed-off-by: AdolfoMartinez <[email protected]>
  * Refs #8212. nullptr check. Signed-off-by: AdolfoMartinez <[email protected]>
  * Refs #8212. Re-enable Liveliness tests. Signed-off-by: AdolfoMartinez <[email protected]>
  * Refs #8212. SHM ABI v3. Signed-off-by: AdolfoMartinez <[email protected]>
  * Refs #8212. Fix cmake error & set SHM_DEFAULT_TRANSPORT=OFF. Signed-off-by: AdolfoMartinez <[email protected]>
  * Refs #8212. FIFO strategy in condition notify. Signed-off-by: AdolfoMartinez <[email protected]>
* SHM Buffer recovery mechanishm (#1159)
  * Refs #8219. Change BufferNode to pre-allocated pool with fixed addresses for the nodes during entire life-cycle. Signed-off-by: AdolfoMartinez <[email protected]>
  * Refs #8219. SHM Buffer invalidation implementation. Signed-off-by: AdolfoMartinez <[email protected]>
  * Refs #8219. logWarning -> logInfo when segment overflow. Signed-off-by: AdolfoMartinez <[email protected]>
  * Refs #8219. Optimization. Signed-off-by: AdolfoMartinez <[email protected]>
  * Refs #8219. Style changes. Signed-off-by: AdolfoMartinez <[email protected]>
  * Refs #8219. buffer_recover test. Signed-off-by: AdolfoMartinez <[email protected]>
  * Refs #8219. 'error:' string removed from log msg. Signed-off-by: AdolfoMartinez <[email protected]>
* Setting shared memory on by default. Signed-off-by: Miguel Company <[email protected]>
* Fix build fail in ROS-CI. Signed-off-by: AdolfoMartinez <[email protected]>
* Fix SHM uncaught exceptions. Signed-off-by: AdolfoMartinez <[email protected]>
* Refs #8132 Fix DDS unittests to delete entities before exiting Signed-off-by: Laura Martin <[email protected]>
* Disable PSM unittests Signed-off-by: Laura Martin <[email protected]>
* Fix boost::interprocess::semaphore initialization. Signed-off-by: AdolfoMartinez <[email protected]>
* Fix CXX_STANDARD on boost try_compile

Co-authored-by: AdolfoMartinez <[email protected]>
Co-authored-by: AdolfoMartinez <[email protected]>
Co-authored-by: Laura Martin <[email protected]>
To merge after #1147
This PR solves SHM transport issues that arise when there are multiple subscribers and one of them freezes or is very slow at processing messages. The slow subscriber could end up holding all of the publisher's buffers, causing the publisher's segment to overflow continuously, so that all subscribers stop receiving messages.
Changes the buffer-node / buffer-data allocation structure. Buffer nodes are now allocated at the segment creation stage as a fixed buffer-node pool, independent of the buffer data. This way, a node never changes its address.
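A minimal sketch of the fixed-pool idea, under stated assumptions: the real pool lives inside the shared memory segment, whereas this illustration uses a process-local `std::vector` that is sized once and never resized. `BufferNodePool`, `acquire`, `release`, `data_offset` and `in_use` are hypothetical names, not the PR's API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified node: it only records where its data lives.
struct BufferNode
{
    std::uint32_t data_offset = 0;  // location of the buffer data (assumed field)
    bool in_use = false;            // whether the node currently describes a buffer
};

// Pool created once with max_allocations nodes. The vector is never
// resized afterwards, so every node keeps the same address for the
// whole life of the pool.
class BufferNodePool
{
public:
    explicit BufferNodePool(std::size_t max_allocations)
        : nodes_(max_allocations)
    {
    }

    // Returns a free node bound to data_offset, or nullptr if the pool is exhausted.
    BufferNode* acquire(std::uint32_t data_offset)
    {
        for (BufferNode& node : nodes_)
        {
            if (!node.in_use)
            {
                node.in_use = true;
                node.data_offset = data_offset;
                return &node;
            }
        }
        return nullptr;
    }

    // Gives the node back to the pool; its address stays the same.
    void release(BufferNode* node)
    {
        node->in_use = false;
    }

private:
    std::vector<BufferNode> nodes_;
};
```

Since a released node is reused in place, a reference held by a remote subscriber always points at a live node, which is what makes a validity check on the node itself meaningful.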
The node contains an atomic validity counter so that all subscribers can check whether their reference to the buffer has been invalidated. The node also contains atomic status counters that record how many subscribers are processing the buffer and in how many ports it is enqueued.
Implements mechanisms to invalidate buffers that are already enqueued or being processed by remote subscribers, although in this version only buffers that are not being processed are invalidated.
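The validity check and the "invalidate only if nobody is processing" rule can be sketched as below. This is an assumption-heavy illustration, not the PR's code: packing the validity counter and the processing counter into a single 64-bit atomic is a choice made here so that both can be checked and updated in one compare-and-swap, and all names (`status`, `begin_processing`, `end_processing`, `try_invalidate`) are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical node layout: high 32 bits = validity id,
// low 32 bits = number of subscribers currently processing the buffer.
struct BufferNode
{
    std::atomic<std::uint64_t> status{0};
    std::atomic<std::uint32_t> enqueued_count{0};  // ports holding the buffer (not used below)
};

inline std::uint32_t validity_of(std::uint64_t s)
{
    return static_cast<std::uint32_t>(s >> 32);
}

inline std::uint32_t processing_of(std::uint64_t s)
{
    return static_cast<std::uint32_t>(s & 0xFFFFFFFFu);
}

// Subscriber side: start processing only if the reference is still valid.
inline bool begin_processing(BufferNode& node, std::uint32_t expected_validity)
{
    std::uint64_t s = node.status.load();
    while (validity_of(s) == expected_validity)
    {
        // Same validity id, processing count + 1.
        if (node.status.compare_exchange_weak(s, s + 1))
        {
            return true;
        }
        // On failure s was reloaded; the loop re-checks the validity id.
    }
    return false;  // the buffer was invalidated in the meantime
}

inline void end_processing(BufferNode& node)
{
    node.status.fetch_sub(1);
}

// Publisher side: bump the validity id, but only while no subscriber
// is processing the buffer.
inline bool try_invalidate(BufferNode& node)
{
    std::uint64_t s = node.status.load();
    while (processing_of(s) == 0)
    {
        const std::uint64_t next =
            static_cast<std::uint64_t>(validity_of(s) + 1u) << 32;
        if (node.status.compare_exchange_weak(s, next))
        {
            return true;
        }
    }
    return false;  // someone is processing; leave the buffer alone
}
```

A subscriber would remember the validity id it saw when the buffer was enqueued and pass it to `begin_processing`; if the publisher has recovered the buffer since then, the ids differ and the subscriber simply drops its stale reference.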