On Apr 25, 2013, at 5:33 PM, Vladimir Yamshchikov <yaximik@gmail.com> wrote:

$NSLOTS is what is requested by -pe openmpi <ARG> in the script; my understanding is that by default it counts threads.

No - it is the number of processing elements (typically cores) that are assigned to your job.

$NSLOTS processes each spinning off -t <ARG> threads is not what is wanted, as each process could spin off more threads than there are physical or logical cores per node, thus degrading performance or even crashing the node. Even when -t <ARG> is kept within permissible boundaries (2, 4, or 6 cores per processor, or 2, 4, 8, or 12 cores per node), it is still not clear how these cores are utilized in multithreaded runs.
My question is then: how to correctly formulate resource scheduling for programs designed to run in multithreaded mode? For those involved in bioinformatics, examples are bwa with the -t <ARG> option or blast+ with the -num_threads <ARG> option.

What you want to do is:

1. request a number of slots = the number of application processes * the number of threads each process will run

2. execute mpirun with the --cpus-per-proc N option, where N = the number of threads each process will run.

This will ensure you have one core for each thread. Note, however, that we don't actually bind a thread to the core - so having more threads than there are cores on a socket can cause a thread to bounce across sockets and (therefore) potentially across NUMA regions.
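
As a rough sketch of the two steps above, assuming a 16-slot request split into 4 processes of 4 threads each (the split, the THREADS/NPROCS/CMD variable names, and the NSLOTS fallback are illustrative assumptions, not something your site requires):

```shell
#!/bin/sh
#$ -pe openmpi 16

# Threads each process will run; 4 is a hypothetical value for illustration.
THREADS=4

# SGE sets NSLOTS inside a job. The fallback of 16 is only an assumption so
# that the sketch can also be tried outside the scheduler.
NSLOTS=${NSLOTS:-16}

# Step 1 in reverse: slots = processes * threads, so processes = slots / threads.
if [ $((NSLOTS % THREADS)) -ne 0 ]; then
    echo "NSLOTS ($NSLOTS) is not a multiple of THREADS ($THREADS)" >&2
    exit 1
fi
NPROCS=$((NSLOTS / THREADS))

# Step 2: give each process THREADS cores via --cpus-per-proc.
CMD="mpirun -np $NPROCS --cpus-per-proc $THREADS my_program -t $THREADS"

# Print the command instead of running it; to launch for real,
# replace the echo with:  $CMD > out_file
echo "$CMD"
```

Keeping THREADS no larger than the number of cores on one socket avoids the cross-socket bouncing mentioned above.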




On Thu, Apr 25, 2013 at 2:09 PM, Ralph Castain <rhc@open-mpi.org> wrote:
Depends on what NSLOTS is and what your program's "-t" option does :-)

Assuming your "-t" tells your program the number of threads to start, then the command you show will execute NSLOTS number of processes, each of which will spin off the number of indicated threads.
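
To make the arithmetic concrete, here is a small sketch using the values from your script (16 slots and -t 16); the variable names are just for illustration:

```shell
#!/bin/sh
# Values taken from the quoted script: "-pe openmpi 16" and "-t 16".
NSLOTS=16    # processes started by "mpirun -np $NSLOTS"
THREADS=16   # threads each process starts because of "-t 16"

# Total threads across the whole job.
TOTAL=$((NSLOTS * THREADS))
echo "processes=$NSLOTS threads_per_process=$THREADS total_threads=$TOTAL"
```

That is 256 threads competing for the 16 cores the scheduler assigned.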


On Apr 25, 2013, at 11:39 AM, Vladimir Yamshchikov <yaximik@gmail.com> wrote:

> Hi all,
>
> The FAQ has excellent entries on how to schedule non-MPI jobs on an SGE cluster, yet only simple jobs are exemplified. But what about jobs that can be run in multithreaded mode, say by specifying the option -t number_of_threads? In other words, consider a command in an example qsub script:
> ..........
> #$ -pe openmpi 16
> ..........
>
> mpirun -np $NSLOTS my_program -t 16 > out_file
>
> Will that launch the program to run in 16 threads (as desired), or will it launch 16 instances of the program, with each instance trying to run in 16 threads (certainly not desired)?
> _______________________________________________
> users mailing list
> users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

