
Parallelize State call in ECM #451


Merged: 35 commits into ign-gazebo3, Jan 15, 2021
Conversation

@JShep1 commented Nov 6, 2020

Ready for review. Below is a plot of 25 randomly sampled PostUpdate profiled points for both before and after this PR.

The average PostUpdate timestep was 33.095 ms before and 22.479 ms after in the case of shapes_population.sdf. This PR is primarily targeted at larger examples and will not significantly affect smaller ones (if there are under 50 entity components, no speedup will occur at all due to the nature of the thread spawning in this PR).

After 2 minutes of running shapes_population.sdf, my RTF before this PR was 0.602%; with the PR I am getting about 0.669%.

[Plot: Multi-Threading Results (shapes_population.sdf)]

I'm going to experiment with the number of threads and the number of components each thread handles to see if I can tune these two parameters for a better RTF. I don't think raising the max thread count above std::thread::hardware_concurrency() (the maximum number of threads the hardware suggests) would speed anything up; it would likely just bottleneck the additional threads and schedule them in whenever other threads are put to sleep. We might see some speedup from tuning the minimum number of entity components each thread handles, but I'm not sure that would make a huge difference either; it would only affect scenes with between componentsPerThread and componentsPerThread * maxThreads entity components.
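
For illustration, here is a minimal sketch of the kind of thread-count calculation described above; componentsPerThread, maxThreads, and ThreadCount are hypothetical names and values, not necessarily the identifiers used in this PR:

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>

// Hypothetical tuning parameter: minimum entity components per thread.
constexpr std::size_t componentsPerThread = 50;

// Pick a thread count so that each thread handles at least
// componentsPerThread entries, capped by the hardware's suggestion.
std::size_t ThreadCount(std::size_t _numEntityComponents)
{
  const std::size_t maxThreads =
      std::max<std::size_t>(1, std::thread::hardware_concurrency());

  // With fewer than componentsPerThread entries this stays at 1, so
  // small worlds run single-threaded and see no change.
  const std::size_t wanted =
      std::max<std::size_t>(1, _numEntityComponents / componentsPerThread);

  return std::min(wanted, maxThreads);
}
```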

John Shepherd added 18 commits October 14, 2020 12:22 (all signed off by John Shepherd <[email protected]>)
@github-actions github-actions bot added the 🏰 citadel Ignition Citadel label Nov 6, 2020
@JShep1 JShep1 changed the base branch from ign-gazebo3 to jshep1/scenebroadcaster_optimizations2 November 10, 2020 00:48
@JShep1 JShep1 changed the base branch from jshep1/scenebroadcaster_optimizations2 to ign-gazebo3 November 10, 2020 00:48
@iche033 (Contributor) commented Nov 12, 2020

Here's what I got comparing performance of the State function before (top image) and after (bottom image) the change:

[Profiler screenshot: threaded_state]

This is from running the shapes_population.sdf world. It does look like it shaved a few ms off the State function. When running with the TPE plugin, the RTF before the change is 1.4x% and after it is 1.8x%.

Be careful about using Remotery for profiling, though. I noticed that since new threads are spawned every iteration, it made Remotery a bit laggy and unresponsive at times.

@JShep1 (Author) commented Nov 12, 2020

Good to see that you're seeing improvements too; I guess Remotery was a bit wonky for me given all the threads spawning. I'll clean up the logic, make it more general, and get the PR in for review soon.

@iche033 (Contributor) commented Nov 12, 2020

By the way, I wonder if it's possible to reuse threads, e.g. store them in a member variable, and have them wait on a condition_variable until there's work to do.

@JShep1 (Author) commented Nov 12, 2020

> By the way, I wonder if it's possible to reuse threads, e.g. store them in a member variable, and have them wait on a condition_variable until there's work to do.

That's also what I was thinking and was going to attempt something like that. It would certainly reduce noise in Remotery as well as cut down any overhead from spawning so many threads every iteration.

@JShep1 (Author) commented Nov 18, 2020

> > By the way, I wonder if it's possible to reuse threads, e.g. store them in a member variable, and have them wait on a condition_variable until there's work to do.
>
> That's also what I was thinking and was going to attempt something like that. It would certainly reduce noise in Remotery as well as cut down any overhead from spawning so many threads every iteration.

For the record, I've spent a good amount of time brainstorming and trial-and-erroring the best solution for this case. The main problem with a thread pool plus condition variables is that the pattern ends up not applying to this situation. I could go down this route, but there would be no speedup; it would behave the same as one thread working through all of the data in series. Since each thread must take a std::unique_lock on a shared mutex to wait on a condition variable, only one thread may be inside the execution zone at a time; when it finishes, it releases the lock so that the next thread can go through and process some data. This would certainly be useful if State(...) had other work to do in the meantime, but since we specifically block, waiting for all of the AddEntityToMessage calls to finish, we would simply be waiting for the threads to do their work one after the other, which is equivalent to a single-threaded approach.
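
To illustrate the serialization concern above, here is a minimal sketch (not the PR's code) of a worker that waits on a condition variable and then does its work while still holding the lock, so the workers end up processing their chunks one at a time:

```cpp
#include <condition_variable>
#include <mutex>
#include <vector>

std::mutex mtx;
std::condition_variable cv;
bool workReady = false;

// Each worker thread in the pool runs this with its own chunk of data.
void Worker(std::vector<int> &_chunk)
{
  std::unique_lock<std::mutex> lock(mtx);

  // Wait until the main thread sets workReady and calls cv.notify_all().
  cv.wait(lock, [] { return workReady; });

  // wait() returns with the lock reacquired. Processing the chunk here,
  // before releasing the lock, means only one worker runs at a time.
  for (int &value : _chunk)
    value *= 2;
}
```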

Our desirables for this optimization are:

  1. A pool of threads all waiting to do essentially the same job; they must be triggered all at once and run concurrently
    i. Note that the arguments to the functor each thread executes (the chunk of the data it is responsible for) may change on every iteration
  2. The process that triggers the thread pool must block until all of the threads are done
  3. The process that triggered the thread pool must then piece together the data that the threads just composed.

There are a lot of approaches that could be taken here, so I'll list the basics of what I want to see and go from there:

  1. Being able to trigger all threads instantaneously
  2. Multiple threads all running their chunk of AddEntityToMessage simultaneously
  3. A recalculation of each thread's chunk of data only if the this->dataPtr->entityComponents vector has changed (under the hood, the std::next operation iterates x times; doing this every iteration is work that can be avoided if the entity component vector has not changed since the last iteration)
  4. Blocking until all threads are done with their AddEntityToMessage work

Based on the above desirables, I don't think we will necessarily be able to have a system that reuses worker threads. Condition variables would be the way to do this, but as stated above, the threads would run in series. A C++ std::thread, by definition, is a thread designated to do some operation(s) and then finish, so I think it may be a necessity in this case that we continue to spawn threads every iteration. I know it spams Remotery, but I'm not sure there's another option, unless there's one I'm not thinking of. I've looked a bit into std::async with std::future, but that is intended more for getting the return value of the function that std::async runs; in our case, a reference parameter is modified and there is no return value.
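
As a rough sketch of the spawn-per-iteration approach described above (illustrative names only; AddChunkToMessage stands in for the per-chunk AddEntityToMessage work and is not the PR's actual code):

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Placeholder for the per-chunk work; in the real code each thread would
// run AddEntityToMessage over its share of the entity components and
// fill a per-thread message that gets merged afterwards.
void AddChunkToMessage(std::size_t _begin, std::size_t _end)
{
  (void)_begin;
  (void)_end;
}

void RunStateThreaded(std::size_t _numEntityComponents, std::size_t _numThreads)
{
  const std::size_t chunkSize =
      (_numEntityComponents + _numThreads - 1) / _numThreads;

  std::vector<std::thread> workers;
  workers.reserve(_numThreads);

  // Spawn a fresh set of threads this iteration; each handles a
  // contiguous chunk of the entity components.
  for (std::size_t i = 0; i < _numThreads; ++i)
  {
    const std::size_t begin = i * chunkSize;
    const std::size_t end = std::min(begin + chunkSize, _numEntityComponents);
    if (begin >= end)
      break;
    workers.emplace_back(AddChunkToMessage, begin, end);
  }

  // Block until every thread has finished its AddEntityToMessage work;
  // the caller then pieces the per-chunk results into the final message.
  for (auto &worker : workers)
    worker.join();
}
```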

I'll implement this approach... which in the end is basically the same approach as the current prototype implementation but with a few more optimizations, and we can go from there.

@iche033 (Contributor) commented Nov 19, 2020

Thanks for looking into it. I recently ran into this bit of code in ign-gazebo that reminded me of WorkerPool in ign-common. Not sure if that will help with the issue you're running into. Either way, going forward with the original approach of spawning threads also sounds good to me.

@codecov bot commented Nov 19, 2020

Codecov Report

Merging #451 (7343281) into ign-gazebo3 (4124dea) will increase coverage by 0.03%.
The diff coverage is 100.00%.


```
@@               Coverage Diff               @@
##           ign-gazebo3     #451      +/-   ##
===============================================
+ Coverage        77.62%   77.66%   +0.03%
===============================================
  Files              208      208
  Lines            11527    11574      +47
===============================================
+ Hits              8948     8989      +41
- Misses            2579     2585       +6
```
| Impacted Files | Coverage Δ |
| --- | --- |
| src/EntityComponentManager.cc | 83.46% <100.00%> (+1.62%) ⬆️ |
| src/SimulationRunner.cc | 93.40% <0.00%> (-1.13%) ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@JShep1 JShep1 added the enhancement New feature or request label Nov 20, 2020
@iche033 (Contributor) left a comment

I tested by adding and deleting objects in shapes.sdf and spawning objects from Fuel, and it seems to work fine for the single-thread case.

Interestingly, I tried shapes_population.sdf and noticed that a sphere is missing (sphere_534):

[Screenshot: missing_sphere]

I also tried generating a shapes_population.sdf world with n set to 100 and I see a couple of models missing. Do you see the same issue?

@JShep1 (Author) commented Dec 17, 2020

> I also tried generating a shapes_population.sdf world with n set to 100 and I see a couple of models missing. Do you see the same issue?

Good catch; it took me a little bit to find the bug, but I resolved it here: 0ba4684

@JShep1 JShep1 requested review from iche033 and chapulina December 17, 2020 04:45
@iche033 (Contributor) left a comment

The number of objects in shapes_population.sdf is now correct for me.

I'll leave it to you and @chapulina to decide on the number of threads to use. Otherwise the PR looks good.

@chapulina chapulina added the performance Runtime performance label Jan 5, 2021
@JShep1 JShep1 requested a review from chapulina January 8, 2021 01:45
chapulina added a commit that referenced this pull request Jan 14, 2021
Signed-off-by: Louise Poubel <[email protected]>
@chapulina chapulina mentioned this pull request Jan 14, 2021
@chapulina (Contributor) left a comment

Thanks for the improvements!

@chapulina (Contributor) commented

The ABI checker is having the same old false positive. All other CI looks good. Merging!

@chapulina chapulina merged commit ddd7fcd into ign-gazebo3 Jan 15, 2021
@chapulina chapulina deleted the jshep1/parallelize_ecm branch January 15, 2021 17:26