Skip to content

[CI] Time out from test_force_torque_sensor_broadcaster #1586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Mar 26, 2025

Conversation

Juliaj
Copy link
Contributor

@Juliaj Juliaj commented Mar 12, 2025

This change attempts to debug and reproduce CI failures: #1574 and #1559.

In both issues, it appears that the timeout was caused by the test "InterfaceNames_Configure_Success", which primarily tests the on_configure function. No hypothesis has been established so far.

  • The code in "InterfaceNames_Configure_Success" is straightforward.
  • I’ve reviewed the on_configure function in ForceTorqueSensorBroadcaster and the changes introduced around the time of the failure. Other than a change made in realtime_tools regarding RTPublisher, no other changes were found that could impact on_configure.

Copy link
Member

@saikishor saikishor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've a feeling that it happens mostly or only in Humble. Right now the Rolling compatibility build of Humble is also failed in this PR

https://github.com/ros-controls/ros2_controllers/actions/runs/13800804076/job/38602688612#step:5:1

@Juliaj
Copy link
Contributor Author

Juliaj commented Mar 12, 2025

I've a feeling that it happens mostly or only in Humble. Right now the Rolling compatibility build of Humble is also failed in this PR

Setup a docker container with Humble, no repro ...

Copy link

codecov bot commented Mar 12, 2025

Codecov Report

Attention: Patch coverage is 61.53846% with 5 lines in your changes missing coverage. Please review.

Project coverage is 85.07%. Comparing base (eca84fa) to head (601c73b).
Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
...ster/test/test_force_torque_sensor_broadcaster.cpp 61.53% 4 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1586      +/-   ##
==========================================
- Coverage   85.10%   85.07%   -0.03%     
==========================================
  Files         123      123              
  Lines       11699    11710      +11     
  Branches      995      996       +1     
==========================================
+ Hits         9956     9962       +6     
- Misses       1432     1435       +3     
- Partials      311      313       +2     
Flag Coverage Δ
unittests 85.07% <61.53%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...ster/test/test_force_torque_sensor_broadcaster.cpp 94.66% <61.53%> (-3.18%) ⬇️

... and 4 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Juliaj
Copy link
Contributor Author

Juliaj commented Mar 13, 2025

@christophfroehlich, I only hit the failure once (as Sai pointed out above) and haven't been able to reproduce it in subsequent attempts. Is there anything else I should try to reproduce this issue?

There are failures in the Coverage build and Rolling Binary Build. The test failure in the coverage build appears to be intermittent. The failures in the rolling binary build seem to be related to a dependency issue. Please let me know if these need further investigation.

@saikishor, the controller loading tests reported warnings related to introspection. I'm unable to determine if this is expected behavior.

 1: [ RUN      ] TestLoadAdmittanceController.load_controller
  1: [INFO] [1741819586.401847780] [test_controller_manager]: Using Steady (Monotonic) clock for triggering controller manager cycles.
  1: [INFO] [1741819586.433478203] [test_controller_manager]: Loading hardware 'TestActuatorHardware' 
[...]
  1: [WARN] [1741819586.435269427] [pal_statistics]: Unable to register entity state_interface.joint1/position in ros2_control, as the registry is not found. Try initializing it!
  1: [WARN] [1741819586.435294563] [pal_statistics]: Unable to register entity state_interface.joint1/velocity in ros2_control, as the registry is not found. Try initializing it!

@saikishor
Copy link
Member

saikishor commented Mar 13, 2025

@saikishor, the controller loading tests reported warnings related to introspection. I'm unable to determine if this is expected behavior.

 1: [ RUN      ] TestLoadAdmittanceController.load_controller
  1: [INFO] [1741819586.401847780] [test_controller_manager]: Using Steady (Monotonic) clock for triggering controller manager cycles.
  1: [INFO] [1741819586.433478203] [test_controller_manager]: Loading hardware 'TestActuatorHardware' 
[...]
  1: [WARN] [1741819586.435269427] [pal_statistics]: Unable to register entity state_interface.joint1/position in ros2_control, as the registry is not found. Try initializing it!
  1: [WARN] [1741819586.435294563] [pal_statistics]: Unable to register entity state_interface.joint1/velocity in ros2_control, as the registry is not found. Try initializing it!

Yes, this is normal. I'll reduce the spam, this needs to be handled at the pal_statistics level. I'll take care of it.

I've retriggered the CI, let's see if we can reproduce it.

@christophfroehlich
Copy link
Contributor

We had some similar timeouts in the control_toolbox repo, fixed with ros-controls/control_toolbox#261
I haven't checked yet, but does this apply here maybe too?

@Juliaj
Copy link
Contributor Author

Juliaj commented Mar 26, 2025

We had some similar timeouts in the control_toolbox repo, fixed with ros-controls/control_toolbox#261 I haven't checked yet, but does this apply here maybe too?

@christophfroehlich , thank you for pointing that out. It’s hard to tell whether this is similar to the one you mentioned above. To reduce the chance, I’ve updated the test teardown logic with more cleanup. I also implemented a timeout mechanism to prevent tests from hanging indefinitely. Could you please review these changes and let me know your thoughts?

@saikishor
Copy link
Member

@Juliaj I believe we also have a default timeout of 60 secs for the gtest. At least that's what I thought, isn't it the case @christophfroehlich ?

@christophfroehlich
Copy link
Contributor

Exactly, 60s is the default, if not specified otherwise. Can we split this PR and see if the proper cleanup already helps?

Copy link
Contributor

@christophfroehlich christophfroehlich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's give it a try

Copy link
Contributor

@christophfroehlich christophfroehlich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference: We already added an auto-shutdown in the destructor
https://github.com/ros-controls/ros2_control/blob/95188b92c9bf628e72dc90a671e1f6c00f0b7611/controller_interface/src/controller_interface_base.cpp#L26-L40

At assigning a new unique_ptr with fts_broadcaster_ = std::make_unique<FriendForceTorqueSensorBroadcaster>(); it should call this transition already.

@christophfroehlich christophfroehlich merged commit 996f030 into ros-controls:master Mar 26, 2025
25 checks passed
@Juliaj Juliaj deleted the ci_issue_1574 branch March 26, 2025 20:53
@christophfroehlich christophfroehlich added the backport-humble This label should be used by maintainers only! Label triggers PR backport to ROS2 humble. label Mar 26, 2025
mergify bot pushed a commit that referenced this pull request Mar 26, 2025
(cherry picked from commit 996f030)

# Conflicts:
#	force_torque_sensor_broadcaster/test/test_force_torque_sensor_broadcaster.cpp
@christophfroehlich
Copy link
Contributor

Ah, I saw now that we haven't backported ros-controls/ros2_control#1979. This might be a reason why it fails more often with the humble version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-humble This label should be used by maintainers only! Label triggers PR backport to ROS2 humble.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants