Each instance of mpirun is independent - there is no cross-mpirun coordination. So they will indeed trip over each other as you describe.
In more recent versions, you can restrict the available cores for each mpirun execution by having the external system "bind" OMPI to some subset of the available cores. However, I don't believe Torque provides that capability.
You can also set the default CPU set for a given run: try adding -mca orte_cpu_set 1,2 to the mpirun command line, where 1,2 are the cores you want that execution to use.
I can't guarantee it will work as I'm not sure it has been robustly tested, but it is supposed to do what you described (I added it for some other folks at LANL). Let me know and I'll fix it if required.
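To make the idea concrete, here is a rough sketch of how a test harness might hand each mpirun its own disjoint block of cores. It only builds the command lines and launches nothing. The orte_cpu_set parameter is the one mentioned above; the helper name, the executable names, and the assumption that all jobs share a single node with cores numbered from 0 are mine, purely for illustration.

```python
from typing import List, Tuple


def build_mpirun_commands(total_cores: int,
                          jobs: List[Tuple[int, str]]) -> List[List[str]]:
    """Assign each (ranks, executable) job a disjoint block of core IDs.

    Hypothetical helper: assumes one node with cores numbered
    0..total_cores-1 and one core per rank.
    """
    commands = []
    next_core = 0
    for ranks, executable in jobs:
        if next_core + ranks > total_cores:
            raise ValueError("not enough free cores for " + executable)
        # Cores next_core .. next_core+ranks-1 belong to this job only.
        cpu_set = ",".join(str(c) for c in range(next_core, next_core + ranks))
        commands.append(["mpirun", "-np", str(ranks),
                         "-mca", "orte_cpu_set", cpu_set, executable])
        next_core += ranks
    return commands


if __name__ == "__main__":
    # Placeholder executables: one five-rank and one two-rank test on 8 cores.
    for cmd in build_mpirun_commands(8, [(5, "./test_a"), (2, "./test_b")]):
        print(" ".join(cmd))
```

Each resulting list could be passed to something like subprocess.Popen so the jobs run side by side; whether the binding then behaves as intended is subject to the same caveat as above.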
Alternatively, you can leave the procs unbound as you are doing and they'll run just fine, albeit a little slower.
On Jan 9, 2012, at 8:24 AM, Thompson, Kelly G wrote:
> I am interested in running a handful of mpirun jobs in a single allocation. For example, my allocation is 2 nodes with 8 cores on each node (total of 16 cores). I want to run 2 five-rank jobs and 3 two-rank jobs simultaneously (total of 16 cores) and without oversubscribing any single core. I am currently using --mca mpi_paffinity_alone 0 and that appears to work, but it looks like recent versions (1.4+) of OpenMPI have better controls for processor affinity. Is there a better choice of flags for my situation?
> The bigger picture is that I am running 400-600 small unit tests in a single Torque allocation. My testing framework is aware of total available cores and the cores required per test so that the total simultaneous core count never exceeds my allocation. However, if I use any option other than --mca mpi_paffinity_alone 0, mpirun will place multiple jobs on the same cores and leave many cores with nothing to do. Is there a good description for how mpirun assigns jobs to cores particularly in the situation where there are multiple mpirun jobs running on the same allocation?
> Kelly Thompson