Add job option to start only a single process, or n #206
…e), instead of x on each node. Partial fix for #206
I implemented an additional job option in the slurm adaptor. We should consider whether to make this an option in all adaptors, or even a setting in the job description.
Agreed. The default option should be to -only- start one process, as this is the default option for most schedulers, and also nicely fits with the ssh adaptor. A boolean job option "run on all slots" can then be used to start a process for all slots.
I'm unsure what to call the new option (I've added it to JobOption; it seems important enough to put in the interface as a first-class citizen):

    /** The number of nodes to run the job on. */
    private int nodeCount = 1;

    /** The number of processes per node. */
    private int processesPerNode = 1;

    /**
     * If true, processesPerNode processes are started on each of the nodeCount nodes.
     * If false, only a single process is started.
     */
    private boolean startAllProcesses = false;

This name implies you could perhaps also start only some of the processes. Perhaps the other way around makes more sense:

    private boolean startSingleProcess = false;

Also, if a user asks for "nodeCount" nodes and "processesPerNode" processes, getting only a single process (unless some extra option is set) is probably not what she expects. So making this "false" by default is, I think, best. The ssh adaptor only supports processesPerNode == 1, and will give an error otherwise. Thoughts?
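To make the proposed semantics concrete, here is a minimal sketch of how the three fields could combine into a process count. The class `ProcessCountSketch` and both method names are hypothetical and exist only for illustration; they are not the adaptor code from this thread.

```java
/** Hypothetical helper for illustration only; not part of the adaptor code discussed here. */
public final class ProcessCountSketch {

    private ProcessCountSketch() {
        // static helpers only
    }

    /**
     * Returns the total number of processes that would be started.
     * With startSingleProcess == false (the proposed default), every requested slot gets a process.
     */
    public static int processesToStart(int nodeCount, int processesPerNode, boolean startSingleProcess) {
        if (nodeCount < 1 || processesPerNode < 1) {
            throw new IllegalArgumentException("nodeCount and processesPerNode must be at least 1");
        }
        return startSingleProcess ? 1 : nodeCount * processesPerNode;
    }

    /** Mirrors the ssh restriction mentioned above: anything other than processesPerNode == 1 is an error. */
    public static void checkSshRestriction(int processesPerNode) {
        if (processesPerNode != 1) {
            throw new IllegalArgumentException(
                    "ssh adaptor only supports processesPerNode == 1, got " + processesPerNode);
        }
    }

    public static void main(String[] args) {
        System.out.println(processesToStart(4, 2, false)); // all slots: 4 * 2
        System.out.println(processesToStart(4, 2, true));  // single process
    }
}
```

With the default of `false`, a request for 4 nodes and 2 processes per node yields 8 processes; setting the flag reduces that to 1 while the same resources are still requested.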
Fixed
When running MPI applications or pilot jobs, it is sometimes necessary to start only one process, even though multiple nodes and/or slots are requested from the scheduler. Since this is the default behaviour of most systems, it should not be hard to implement. However, we may need an additional job option for this.
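As an illustration of why this is cheap to support on Slurm: the body of a batch script runs once, on the first allocated node, while `srun` inside the script launches one task per allocated slot. A script generator could therefore switch between the two behaviours by adding or omitting the `srun` prefix. The sketch below assumes exactly that; the class `SlurmScriptSketch` and its `generate` method are hypothetical and do not show how the slurm adaptor is actually implemented.

```java
/** Hypothetical sketch of batch-script generation; not the actual slurm adaptor code. */
public final class SlurmScriptSketch {

    private SlurmScriptSketch() {
        // static helpers only
    }

    /**
     * Builds a batch script that always requests nodeCount nodes with processesPerNode tasks each,
     * but starts the command either once or once per slot.
     */
    public static String generate(int nodeCount, int processesPerNode,
                                  boolean startSingleProcess, String command) {
        StringBuilder script = new StringBuilder();
        script.append("#!/bin/sh\n");
        script.append("#SBATCH --nodes=").append(nodeCount).append('\n');
        script.append("#SBATCH --ntasks-per-node=").append(processesPerNode).append('\n');

        if (startSingleProcess) {
            // The script body itself runs once, on the first allocated node.
            script.append(command).append('\n');
        } else {
            // srun starts one copy of the command for every allocated task slot.
            script.append("srun ").append(command).append('\n');
        }
        return script.toString();
    }

    public static void main(String[] args) {
        // An MPI launcher or pilot job wants a single process that manages the allocation itself.
        System.out.print(generate(2, 4, true, "mpirun ./app"));
    }
}
```

In both cases the resource request is identical; only the launch line differs, which is what makes a single boolean option sufficient in this sketch.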