MPI part 2 #76
Conversation
Great, really liked the timing graphs!
> # Lightening the load
>
> What if we relax the problem a little? Suppose instead of all processes
> needing all of the results back, only process zero needs the assembled
I was wondering why all the processes needed all the data!
Yes - they never really explained their reasons for that...!
> recv_ptr <- new_data(recv_data) # on rank 0
>
> mpi_time <- -(get_mpi_time())
> mpi_gather_to_zero(send_ptr, recv_ptr) # Gather in recv_ptr
Here, is it the case that no data is actually gathered into the recv_ptr
for the non-rank-0 processes because they aren't valid pointers to the type of data required? Do you always need to pass a pointer to 0 for the non-gathering processes?
It's a funny business really. Yes, for the non-zero nodes the recv_ptr arg is ignored. We need all the processes to reach the MPI_Gather call together, and we have to supply some recv_ptr anyway just to make that line compile. I think we could alternatively say something like this if we wanted:
`mpi_gather_to_zero(send_ptr, (rank == 0) ? recv_ptr : (void *) 0)`
which is both a bit clearer, and slightly messier at the same time...! We are sort of trusting MPI_Gather to not do anything with the invalid thing we send, if we are a non-gathering node.
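For what it's worth, that trust is backed by the standard: the receive arguments to MPI_Gather are significant only at the root, so non-root ranks can legitimately pass a null pointer and allocate nothing. A minimal sketch in plain MPI C (not the blog's R wrappers):

```c
/* Minimal plain-MPI sketch: recvbuf is significant only at the root,
 * so non-root ranks can pass NULL and allocate nothing. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int chunk = 1000;                      /* values produced per rank */
    double *send = malloc(chunk * sizeof(double));
    for (int i = 0; i < chunk; i++) send[i] = rank + i * 1e-3;

    /* Only rank 0 allocates the full receive buffer. */
    double *recv = (rank == 0)
        ? malloc((size_t) size * chunk * sizeof(double))
        : NULL;

    /* recv, recvcount and recvtype are ignored everywhere except rank 0. */
    MPI_Gather(send, chunk, MPI_DOUBLE,
               recv, chunk, MPI_DOUBLE,
               0, MPI_COMM_WORLD);

    free(send);
    free(recv);                                  /* free(NULL) is a no-op */
    MPI_Finalize();
    return 0;
}
```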
content/blog/r-mpi-part-two.md (Outdated)
> Lastly, note the run on my desktop is a bit quicker than the cluster job. But
> remember here I'm making poor use of the cluster nodes, only giving them one
> process (one core) each. HPC cluster nodes don't really offer you benefit in
> speed for single core; it's the extra RAM and extra cores that give you the
Might be nice to mention one or two more realistic examples which would make better use of cluster nodes.
I've tried to rewrite this bit...
Co-authored-by: EmmaLRussell <[email protected]>
It was a lovely read; I'm quite new to this kind of stuff, but your explanations were clear! This kind of data exchange with pointers between R and C++ is pretty crazy too. Looking forward to seeing how much more optimised it can get when you use the cores in the nodes too!
A couple of very belated comments here. I'm still a bit unenlightened, I'm afraid!
content/blog/r-mpi-part-two.md (Outdated)
> # MPI Communication - Worst Case
>
> Now we're ready to write some very naive code, in which a number of
I get lost through here working out what we're trying to achieve, I'm afraid.
I think I've improved this a bit?
content/blog/r-mpi-part-two.md (Outdated)
> of parallelisation will make that faster. Further, we're not really doing
> that much work in the loop, so the sequential part of getting the memory
> is a significant chunk of the total time. That limits how much speed-up
> we could ever achieve.
I believe this also gets into "strong scaling" vs "weak scaling" ideas
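Not something from the post, but to put a rough number on that idea: Amdahl's law says that if a fraction s of the runtime stays sequential (here, allocating the big buffer), then n processes can never give a speedup better than 1 / (s + (1 - s) / n), which is capped at 1 / s however many cores you add. Strong scaling is this fixed-problem-size picture; weak scaling grows the problem with the core count instead. A toy calculation with a made-up 20% sequential fraction:

```c
/* Toy Amdahl's-law calculation (made-up numbers, not measurements from
 * the post): with sequential fraction s, the best possible speedup on
 * n processes is 1 / (s + (1 - s) / n), which can never exceed 1 / s. */
#include <stdio.h>

int main(void) {
    const double s = 0.20;                       /* assume 20% sequential */
    for (int n = 1; n <= 64; n *= 2) {
        double speedup = 1.0 / (s + (1.0 - s) / n);
        printf("n = %2d  max speedup = %.2f\n", n, speedup);
    }
    return 0;                                    /* bound is 1 / 0.2 = 5x */
}
```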
> What if we relax the problem a little? Suppose instead of all processes
> needing all of the results back, only process zero needs the assembled
> bulk. All the other processes could then only allocate memory for the
> data they create, and contribute just that to the MPI call.
Am I right that this is the thought process as we shift from shared memory to message passing as the paradigm? It might be worth linking to the Rust book chapter on this?
I wasn't really going for that intentionally. It was more that I just started with a "worst case" for performance, where every node wants to have all the data from every node - for some reason. (The global sim happens to want to do this with certain bits of the algorithm.)
What I'm really aiming for is the more basic idea that throwing more cores/nodes at a job won't necessarily make it faster, if the comms cost (and also memory usage) scales up with the number of nodes - so the graphs aren't very exciting.
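For what it's worth, my mental model of that worst case in plain MPI C is an MPI_Allgather, where every rank allocates the full receive buffer. That's an assumption about the shape of the code rather than a copy of the post's wrappers, but it shows why both the memory and the comms cost grow with the number of processes:

```c
/* Rough sketch of the "worst case" exchange (every rank ends up holding
 * every rank's chunk), illustrated with MPI_Allgather -- an assumption
 * about the shape of the code, not a copy of the post's wrappers. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int chunk = 1000;
    double *send = malloc(chunk * sizeof(double));
    for (int i = 0; i < chunk; i++) send[i] = rank + i * 1e-3;

    /* Every rank allocates size * chunk doubles: both the memory footprint
     * and the communication volume grow with the number of processes. */
    double *recv = malloc((size_t) size * chunk * sizeof(double));

    double t = -MPI_Wtime();                     /* same timing idiom as the post */
    MPI_Allgather(send, chunk, MPI_DOUBLE,
                  recv, chunk, MPI_DOUBLE,
                  MPI_COMM_WORLD);
    t += MPI_Wtime();
    if (rank == 0) printf("allgather took %f s\n", t);

    free(send);
    free(recv);
    MPI_Finalize();
    return 0;
}
```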