RAxML Threading Blog #67


Draft · wants to merge 8 commits into `source`

Conversation

weshinsley (Contributor)

No description provided.

@weshinsley weshinsley requested review from muppi1993 and a team and removed request for muppi1993 August 25, 2022 12:36
@richfitz (Member) left a comment:

Looks good.

The only thing I might add, which you circle around, is that people's first cut should surely always be to run as many independent jobs as possible, and only start using threading once you have more cores than jobs. So if you're going to run 10,000 RAxML jobs and can only get 100 jobs on the cluster at once, it's impossible to imagine any thread count greater than 1 being the most efficient.
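A quick back-of-the-envelope check of this point, as a minimal Python sketch. The 100-core budget and the 90% two-thread efficiency are illustrative assumptions, not measurements from the post:

```python
import math

# Toy model: n_jobs identical jobs on a fixed budget of n_cores cores.
# A job run with `threads` threads takes t1 / (threads * eff) wall time,
# where eff is the parallel efficiency at that thread count.

def total_wall_time(n_jobs, n_cores, threads, eff, t1=1.0):
    per_job = t1 / (threads * eff)     # wall time of a single job
    concurrent = n_cores // threads    # jobs that fit on the cluster at once
    waves = math.ceil(n_jobs / concurrent)
    return waves * per_job

# 10,000 jobs on 100 cores: single-threaded wins even at 90% efficiency.
print(total_wall_time(10_000, 100, threads=1, eff=1.0))  # 100.0 time units
print(total_wall_time(10_000, 100, threads=2, eff=0.9))  # ~111.1 time units
```

While jobs outnumber cores, every core is already busy doing perfectly parallel work, so any efficiency below 100% per thread can only lose.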


> One of the threads (if you use more than one) acts as an
> administrator, which somewhat explains the lack of gain from 1 to 2 threads;
> after that, from 2 to 6 threads, we're around 90% efficient. Beyond that, the
Member:

you might want a footnote explaining exactly what you mean here by efficient

Contributor Author:

My maths is a bit shoddy here - will rewrite...
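For reference, the standard definitions the footnote could use, as a sketch; the notation $T(n)$ for the wall time on $n$ threads is assumed here, not taken from the post:

$$
S(n) = \frac{T(1)}{T(n)}, \qquad E(n) = \frac{S(n)}{n}
$$

On that definition, "around 90% efficient" at $n$ threads would mean the run gets roughly $0.9\,n$ threads' worth of speedup, i.e. $T(n) \approx T(1)/(0.9\,n)$.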


> The results are a bit confusing here and there; the 10-core is surprisingly
> erratic, and needs some deeper investigation. The overhead of stacking up 4 and 8 core
> jobs is a bit more than we might; perhaps those jobs are using more of the node than
Member:
"than we might" => "than we might want"?

Contributor:

Or "than we might expect"?

Contributor Author:

thanks - "expect" was what I first had in mind - but have gone for "expect or want"

> ---
> author: Wes Hinsley
> date: 2022-08-23
> title: The risk of over-threading
Member:

Mo threads, mo problems?

> {{< figure src="/img/raxml_032.png" alt="RAxML with over 8 cores" >}}
>
> .. and we actually start to make things slower - using 32 cores performs
> comparably to using 8. There just isn't enough parallel work for all the

@weshinsley (Contributor Author), Aug 31, 2022:

I think looking at task manager it's still in a parallel section, but just performing much worse - it could be there's a sequential bit in there, or some kind of barrier sync... I'll mention Amdahl as well.

@weshinsley (Contributor Author), Aug 31, 2022:

(But I think Amdahl would only limit how good you can get with increasing threads - I don't think that would explain a u-turn and the overall time getting slower would it?)
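That intuition seems right: Amdahl's law alone is monotone in the thread count, so a genuine slowdown needs an extra cost term. A minimal sketch (all constants made up for illustration) where a per-thread synchronisation overhead produces exactly this U-turn:

```python
# Amdahl's law: with parallel fraction p, T(n) = t1 * ((1 - p) + p / n).
# This is strictly decreasing in n, so it can flatten the curve but
# never turn it upward. Adding a per-thread coordination cost c * n
# (e.g. a barrier sync each iteration) gives a minimum, after which
# more threads make the overall run slower.

def amdahl(n, t1=100.0, p=0.95):
    return t1 * ((1 - p) + p / n)

def with_overhead(n, t1=100.0, p=0.95, c=0.4):
    return amdahl(n, t1, p) + c * n

for n in (1, 2, 4, 8, 16, 32):
    print(n, round(amdahl(n), 1), round(with_overhead(n), 1))

# with_overhead() bottoms out near n = sqrt(p * t1 / c), about 15 here,
# then rises again: 32 threads ends up roughly as slow as 8.
```

Amdahl still matters in this toy model: it caps the best possible speedup at $1/(1-p)$ (20x here); the overhead term is what turns that cap into a U-turn.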

Contributor Author:

> ... it's impossible to imagine any thread count greater than 1 being the most efficient.

I probably need one more test to be sure of this - which is how 32 cores stack on a single node...

> the graph ended. Now we have more cores, so let's throw them all at the
> problem...
>
> {{< figure src="/img/raxml_032.png" alt="RAxML with over 8 cores" >}}
Member:

on graphs with time, adding "(lower is better)" to the y axis might help the reader

@EmmaLRussell (Contributor) left a comment:

What an adventure! 🙂



@r-ash (Member) left a comment:

Nice one!

@weshinsley weshinsley marked this pull request as draft June 16, 2023 10:22