
Cleanup of node, process and thread count. #625


Closed
jmaassen opened this issue Sep 17, 2018 · 7 comments · Fixed by #644

@jmaassen (Member) commented Sep 17, 2018

There is some recurring confusion about the semantics of the node, process and thread count in the JobDescription. See for example xenon-middleware/xenon-cli#63, xenon-middleware/xenon-cli#57 and #206

Currently we have:

private int nodeCount = 1;
private int processesPerNode = 1;
private int threadsPerProcess = -1;
private boolean startSingleProcess = false;

This filters through to xenon-cli, which has command-line options to set these values.

After some discussion we came to the following command-line options for the CLI:

--nodes X  (default 1)
--cores-per-node Y (default 1)

and, for starting the processes, one of the following options:

--start-per-job (default)
--start-per-node
--start-per-core

All options are optional. If no values are set, the default is used. This leads to the following behavior:

  • If you provide no options, you will get 1 node with 1 core and 1 executable being started.
  • If you provide --cores-per-node 2 you will get 1 node, 2 cores, 1 executable started.
  • If you provide --cores-per-node 2 --start-per-core you will get 1 node, 2 cores, 2 executables started.
  • If you provide --nodes 2 you will get 2 nodes, 1 core each, 1 executable started on the first node.
  • If you provide --nodes 2 --cores-per-node 2 you will get 2 nodes, 2 cores each, 1 executable started on the first node.
  • If you provide --nodes 2 --cores-per-node 2 --start-per-node you will get 2 nodes, 2 cores each, 1 executable started on each node (2 in total).
  • If you provide --nodes 2 --cores-per-node 2 --start-per-core you will get 2 nodes, 2 cores each, 1 executable started on each core (4 in total).

This approach is slightly less flexible than the previous one, as it is not possible to directly express starting a job on 4 nodes with 4 processes per node and 4 threads per process (for running a mixed MPI/OpenMP job, for example). However, just starting 4 nodes with 16 cores each will probably give you the same result.

For the JobDescription this would result in processesPerNode being renamed to coresPerNode,
threadsPerProcess disappearing, and startSingleProcess turning into an enum.
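
A rough sketch of what that could look like (the enum name and values are just placeholders, nothing is settled yet):

public enum StartMode { PER_JOB, PER_NODE, PER_CORE }   // placeholder name/values

private int nodeCount = 1;
private int coresPerNode = 1;                           // renamed from processesPerNode
private StartMode startMode = StartMode.PER_JOB;        // replaces startSingleProcess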

Any comments?

@sverhoeven (Member)

The SchedulerAdaptorDescription should include methods to determine which counts can be used

@sverhoeven (Member)

We could add a boolean SchedulerAdaptorDescription.supportsMultiNode()

This would flag the local and ssh adaptors as unable to run jobs across multiple machines, i.e. nodeCount > 1.
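
Roughly something like this (shown as an interface purely for illustration; the exact shape of SchedulerAdaptorDescription may differ):

public interface SchedulerAdaptorDescription {
    /** Can this adaptor run a single job across more than one node? */
    boolean supportsMultiNode();
}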

We could also flag GridEngine as not supporting multi-node, making the whole parallel environment mapping much easier. @arnikz do you need GridEngine multi-node support?

@jmaassen (Member, Author)

Makes sense for local, ssh and at, but the GridEngine case is a bit shaky, as it does support multi-node runs; it just doesn't allow you to ask for nodes....

@jmaassen (Member, Author)

It does seem that SGE supports -l excl=true to allow you to reserve an entire node for yourself.

Not sure if this completely solves the issue though. It would allow correct behavior when you specify "--nodes 10", but I'm not sure what would happen if you say "--nodes 10 --cores-per-node 16" on a cluster with 4 cores per machine....

@jmaassen (Member, Author)

After giving this some further thought, another option would be to go for a concept based on tasks instead, somewhat similar to what SLURM is doing.

The idea is that each task basically represents an "executable" being started somewhere (this can also be a script, of course). A task may need 1 or more cores. In addition, you may wish to start several of these tasks instead of just one. This is straightforward to specify using:

--tasks T (default=1)
--coresPerTask C (default=1)

When you want more than one task (T > 1), these need to be distributed over one or more nodes. You could either fill up each node with as many tasks as will fit (taking coresPerTask into account, as well as other constraints such as memory), or choose to assign fewer tasks per node (and thereby need more nodes). This can simply be specified using:

--tasksPerNode N (default is unset; let scheduler decide). 

With this approach, running sequential (single task, single core) and multi-threaded (single task, multiple cores) jobs is still simple. In addition, it allows schedulers to decide on the best task-to-node assignment (by simply not specifying tasksPerNode), which is useful for SLURM and SGE in some cases. If needed, the values of T, C and N can be used to compute the node count for SLURM and TORQUE, and the slot count for SGE.
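
To illustrate that last computation: deriving the node count from T, C and N could be as simple as the sketch below (the coresPerNode argument is just for illustration, it is not part of the proposal):

// Sketch only: derive the number of nodes to request from the task-based values.
// tasksPerNode <= 0 means "unset": pack as many tasks per node as the cores allow.
static int nodesNeeded(int tasks, int coresPerTask, int tasksPerNode, int coresPerNode) {
    int perNode = tasksPerNode > 0
            ? tasksPerNode
            : Math.max(1, coresPerNode / coresPerTask);
    return (tasks + perNode - 1) / perNode;          // ceiling division
}
// e.g. 8 tasks of 2 cores each on 16-core nodes -> 1 node; with tasksPerNode = 2 -> 4 nodes
// the SGE slot count would simply be tasks * coresPerTask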

When the job is started, you can either start the executable once per job, or once for each task. The first seems to be the default on all schedulers:

--start-per-job (default)
--start-per-task

To start once per task, the adaptors can use the nodefile (TORQUE and SGE) or srun (SLURM). This approach will also make it easy to start MPI jobs, by simply using mpirun. I think this approach is easy to understand and the most flexible. I'll try to implement it to see if I run into any issues.
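
For SLURM that could be as simple as prefixing the command in the generated submit script (just a sketch, the real adaptor code will differ):

// Sketch: with start-per-task on SLURM, srun starts the executable once for
// every task in the allocation; with start-per-job it is started as-is.
String command = executable + " " + String.join(" ", arguments);
if (startPerTask) {
    command = "srun " + command;
}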

In the JobDescription this would translate to:

private int tasks = 1;
private int coresPerTask = 1;
private int tasksPerNode = -1;
private boolean startPerTask = false;

The rest follows from there....
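
For example, a job of 8 two-core tasks, started once per task, would then look something like this (assuming the obvious setters for the fields above):

JobDescription description = new JobDescription();
description.setExecutable("myapp");
description.setTasks(8);              // T
description.setCoresPerTask(2);       // C
// tasksPerNode left unset: let the scheduler decide the task-to-node mapping
description.setStartPerTask(true);    // start the executable once per task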

@jmaassen (Member, Author)

Implemented in a27b63f, which passes all unit and integration tests.

Not entirely sure about the mapping in SGE yet. I need a multi-node, multi-core cluster setup to test this.
