Skip to content

iox-1855 Return in WaitSet::wait if data was send before WaitSet::attachState #1856

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

mossmaurice
Copy link
Contributor

@mossmaurice mossmaurice commented Jan 24, 2023

Pre-Review Checklist for the PR Author

  1. Code follows the coding style of CONTRIBUTING.md
  2. Tests follow the best practice for testing
  3. Changelog updated in the unreleased section including API breaking changes
  4. Branch follows the naming format (iox-123-this-is-a-branch)
  5. Commits messages are according to this guideline
  6. Update the PR title
    • Follow the same conventions as for commit messages
    • Link to the relevant issue
  7. Relevant issues are linked
  8. Add sensible notes for the reviewer
  9. All checks have passed (except task-list-completed)
  10. All touched (C/C++) source code files from iceoryx_hoofs are added to ./clang-tidy-diff-scans.txt
  11. Assign PR to reviewer

Notes for Reviewer

Checklist for the PR Reviewer

  • Commits are properly organized and messages are according to the guideline
  • Code according to our coding style and naming conventions
  • Unit tests have been written for new behavior
  • Public API changes are documented via doxygen
  • Copyright owner are updated in the changed files
  • All touched (C/C++) source code files from iceoryx_hoofs have been added to ./clang-tidy-diff-scans.txt
  • PR title describes the changes

Post-review Checklist for the PR Author

  1. All open points are addressed and tracked via issues

References

@codecov
Copy link

codecov bot commented Jan 24, 2023

Codecov Report

Merging #1856 (ab38967) into master (5f4dd12) will decrease coverage by 0.01%.
The diff coverage is 66.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1856      +/-   ##
==========================================
- Coverage   75.42%   75.41%   -0.01%     
==========================================
  Files         377      377              
  Lines       14619    14625       +6     
  Branches     2084     2086       +2     
==========================================
+ Hits        11026    11030       +4     
  Misses       2961     2961              
- Partials      632      634       +2     
Flag Coverage Δ
unittests 75.08% <66.66%> (-0.01%) ⬇️
unittests_timing 15.26% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...sh/include/iceoryx_posh/internal/popo/wait_set.inl 91.47% <66.66%> (-1.22%) ⬇️

@budrus
Copy link
Contributor

budrus commented Jan 25, 2023

@elfenpiff @mossmaurice If for HAS_DATA the waitset shall return if there are any samples in the queue, I fear this change is not the whole story. Because if it would be like that, why would we need this fix? And I always had the assumption that HAS_DATA checks whether there are samples in the queue without the need for an active trigger

@mossmaurice mossmaurice force-pushed the iox-1855-trigger-states-on-attach-in-waitset branch from ec40033 to bea7318 Compare January 25, 2023 09:36
@mossmaurice
Copy link
Contributor Author

mossmaurice commented Jan 25, 2023

@budrus

If for HAS_DATA the waitset shall return if there are any samples in the queue, I fear this change is not the whole story. Because if it would be like that, why would we need this fix? And I always had the assumption that HAS_DATA checks whether there are samples in the queue without the need for an active trigger

Good point, with the changes on this PR, WaitSet::wait would not return on a 2nd call and only after WaitSet::attachState. Currently, the triggers are only returned if someone rang the bell at the m_conditionListener:

WaitSet<Capacity>::waitAndReturnTriggeredTriggers(const WaitFunction& wait) noexcept
{
if (m_conditionListener.wasNotified())
{
this->acquireNotifications(wait);
}
NotificationInfoVector triggers = createVectorWithTriggeredTriggers();
if (!triggers.empty())
{
return triggers;
}
acquireNotifications(wait);
return createVectorWithTriggeredTriggers();

I'll further investigate.

@mossmaurice
Copy link
Contributor Author

mossmaurice commented Jan 25, 2023

@budrus Turns out everything is fine.

In the following scenario:

  1. Publisher is created
  2. Subscriber is created
  3. Publisher sends two samples to queue
  4. WaitSet is created
  5. Publisher is attached to WaitSet
  6. WaitSet::wait immediately returns due to the fix on this PR
  7. Calling WaitSet::wait again will, despite the missing ConditionNotifier::notify call, return again immediately due to the fact that the m_activeNotifications still contains the notification from the last time, see below.

WaitSet<Capacity>::createVectorWithTriggeredTriggers() noexcept
{
NotificationInfoVector triggers;
if (!m_activeNotifications.empty())
{
for (uint64_t i = m_activeNotifications.size() - 1U;; --i)
{
auto index = m_activeNotifications[i];
auto& trigger = m_triggerArray[index];
bool doRemoveNotificationId = !static_cast<bool>(trigger);
if (!doRemoveNotificationId && trigger->isStateConditionSatisfied())
{
cxx::Expects(triggers.push_back(&m_triggerArray[index]->getNotificationInfo()));
doRemoveNotificationId = (trigger->getTriggerType() == TriggerType::EVENT_BASED);
}
if (doRemoveNotificationId)
{
m_activeNotifications.erase(m_activeNotifications.begin() + i);
}
if (i == 0U)
{
break;
}
}
}
return triggers;

I'll write a separate test case for that scenario.

@mossmaurice mossmaurice requested a review from budrus January 25, 2023 10:58
@budrus
Copy link
Contributor

budrus commented Jan 25, 2023

@mossmaurice Thanks, I got it now. The HAS_DATA mechanism relies on a set bit in the activationNotifications which needs a triggered Semaphore to get set. So it is only a kind of startup problem and the fix is valid. The benefit with this fix is that we do not change the complex algorithm that deals with activationNotifications and queue size polling during runtime

budrus
budrus previously approved these changes Jan 25, 2023
…t::wait' for a 2nd time

Signed-off-by: Simon Hoinkis <[email protected]>
@mossmaurice mossmaurice merged commit b97f26a into eclipse-iceoryx:master Jan 25, 2023
@mossmaurice mossmaurice deleted the iox-1855-trigger-states-on-attach-in-waitset branch January 25, 2023 13:42
@elfenpiff
Copy link
Contributor

@elBoberido @mossmaurice @budrus I just saw the PR now and it is fine like the problem is solved BUT it demonstrates that states are an abomination directly from hell. It is the wrong layer to handle this, the waitset should only handle events and the layer above (the user using the waitset) handles states if they like. This would make things much more faster and the design more readable.

The responsibilities were:

  • a waitset listens on a trigger
  • events (also for states) on attachments will fire such a trigger

With this implementation we break those clear responsibilities and the waitset suddenly triggers. But I see no better alternative. Good job @mossmaurice but we have to get rid of those states they do not make sense.

@budrus
Copy link
Contributor

budrus commented Jan 30, 2023

@elfenpiff In the specific rmw_iceoryx case this would mean the rwm_wait() implementation has to take care of this. Which would be possible. A ROS user doesn't know anything about this and an event-only behavior of the waitset led to the deadlock. In DDS you have all these possibilities to configure your conditions, and the entity is called waitset ;-)

So the state functionality is needed here. Now the question is if the iceoryx waitset should provide this or if users or framework layers like rmw have to care of this. For me it is not really a sin that the waitset provides this functionality. Otherwise many people would have to build it on top

@elBoberido
Copy link
Member

@budrus @elfenpiff I see this as layered approach. One layer provides the basics and does not even start a thread like the Listener and the next layer provides what we call now WaitSet and Listener. Depending on where one is in the layer, she can choose the right component.

cc @mossmaurice

@elfenpiff
Copy link
Contributor

@elBoberido You are right and with the upcoming iceoryx rust next gen version I implemented it exactly like this.

  • First layer provides triggers and notifiers
  • Second layer provides a reactor (a sinkhole where multiple events are coming together at one waiting point, I know the DDS/ROS guys call this waitset but the literature and the rest of the world calls it reactor)
  • Third layer can add threads and more advanced functionality. Here one would add the states but in my opinion this is still a design flaw. There is a better approach of handling this which I can present in the near future but would be to much for this issue here.

But to be clear. ROS requires it (sadly) so therefore I would implemented it somewhere between here and the end user layer so that high level users do not have to take care of this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix Solves a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WaitSet::wait does not return if data was send before WaitSet::attachState
5 participants