
Add job option to start only a single process, or n #206


Closed
nielsdrost opened this issue Mar 7, 2014 · 4 comments

@nielsdrost
Contributor

When running MPI applications or pilot jobs, it is sometimes necessary to start only one process, even though multiple nodes and/or slots are requested from the scheduler. Since this is the default behaviour of most systems, it should not be hard to implement. However, we may need an additional job option for this.

@nielsdrost nielsdrost self-assigned this Mar 7, 2014
nielsdrost added a commit that referenced this issue Mar 13, 2014
…e), instead of x on each node. Partial fix for #206
@nielsdrost
Contributor Author

I implemented an additional job option in the Slurm adaptor. We should consider whether to make this an option in all adaptors, or even a setting in the job description.

@jmaassen
Member

Agreed. The default should be to -only- start one process, as this is the default for most schedulers, and it also fits nicely with the ssh adaptor. A boolean job option "run on all slots" can then be used to start a process on every slot.

@nielsdrost
Contributor Author

I'm unsure what to call the new option (I've added it to JobOption, as it seems important enough to put in the interface as a first-class citizen):

    /** The number of nodes to run the job on. */
    private int nodeCount = 1;

    /** The number of processes per node. */
    private int processesPerNode = 1;

    /** If true, processesPerNode processes are started on each of the nodeCount nodes.
     *  If false, only a single process is started.
     */
    private boolean startAllProcesses = false;

This name implies you could perhaps also start some processes? Perhaps the other way around makes more sense:

    private boolean startSingleProcess = false;

Also, if a user asks for "nodeCount" nodes and "processesPerNode" processes, getting only a single process (unless some extra option is set) is probably not what she expects. So making this "false" by default is, I think, best. The ssh adaptor only supports processesPerNode == 1, and will give an error otherwise.
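
To make the proposed semantics concrete, here is a minimal sketch of how the three fields could interact. The class name `JobSketch` and the methods `processesToStart` and `checkSshSupported` are hypothetical names chosen for illustration, not part of the existing interface, and the error type thrown for ssh is an assumption.

```java
/** Hypothetical sketch of the proposed option; not the actual job description class. */
class JobSketch {
    private final int nodeCount;
    private final int processesPerNode;
    private final boolean startSingleProcess;

    JobSketch(int nodeCount, int processesPerNode, boolean startSingleProcess) {
        if (nodeCount < 1 || processesPerNode < 1) {
            throw new IllegalArgumentException("nodeCount and processesPerNode must be at least 1");
        }
        this.nodeCount = nodeCount;
        this.processesPerNode = processesPerNode;
        this.startSingleProcess = startSingleProcess;
    }

    /**
     * Number of processes an adaptor would launch. The full allocation
     * (nodeCount * processesPerNode slots) is still requested from the
     * scheduler in both cases; only the launch count differs.
     */
    int processesToStart() {
        return startSingleProcess ? 1 : nodeCount * processesPerNode;
    }

    /**
     * Mirrors the ssh restriction mentioned above: only
     * processesPerNode == 1 is supported. The exception type is an assumption.
     */
    void checkSshSupported() {
        if (processesPerNode != 1) {
            throw new UnsupportedOperationException(
                    "ssh adaptor supports only processesPerNode == 1, got " + processesPerNode);
        }
    }
}
```

With `startSingleProcess` left at `false`, `new JobSketch(4, 2, false).processesToStart()` gives 8, matching what the user asked for. Setting it to `true` gives 1 while the same 8 slots stay reserved.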

Thoughts?

@nielsdrost nielsdrost added the API label Aug 11, 2015
@jmaassen
Member

Fixed
