Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How to replace --cpus-per-proc by --map-by
From: Mark Hahn (hahn_at_[hidden])
Date: 2014-05-15 14:16:06


> We're open to suggestion, really - just need some help identifying the best
> way to get this info out there.

well, OpenMPI information is fragmented and sprayed all over.
In some places, there is mention of a wiki to be updated with
an explanation; for other things, a consumer needs to wander around
loosely-related blogs, mail archives, FAQs, usage statements, etc.

For instance, I've been trying to figure out how to do a simple thing,
launch a hybrid job. Assume I have a scheduled, heterogeneous cluster
where mpirun simply receives a normal nodefile like this:

clu357
clu357
clu357
clu354
clu354
clu354

and I want to launch a 2-rank, 3-thread-per-rank job. forget about
frills like hwloc or binding.
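
to make the arithmetic concrete, here is a minimal Python sketch of what
I expect from that nodefile. it is only an illustration: the helper names
(slots_per_host, ranks_per_host) are made up for this example, it assumes
one nodefile line equals one slot, and it says nothing about how mpirun
actually computes its mapping.

```python
from collections import Counter

def slots_per_host(nodefile_text):
    """Count how often each host appears (assumption: one line = one slot)."""
    hosts = [line.strip() for line in nodefile_text.splitlines() if line.strip()]
    return Counter(hosts)

def ranks_per_host(nodefile_text, threads_per_rank):
    """Ranks that fit on each host if every rank needs threads_per_rank slots."""
    return {host: n // threads_per_rank
            for host, n in slots_per_host(nodefile_text).items()}

# The nodefile from the example above.
NODEFILE = """clu357
clu357
clu357
clu354
clu354
clu354
"""

layout = ranks_per_host(NODEFILE, threads_per_rank=3)
print(layout)                # {'clu357': 1, 'clu354': 1}
print(sum(layout.values()))  # 2
```

so six slots with three threads per rank should give one rank on each
host, two ranks in total.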

back when --cpus-per-proc was around, this was obvious and worked
flawlessly. I honestly can't figure out how it works now, though -
for any definition of "now" since:

http://www.open-mpi.org/community/lists/devel/2011/12/10060.php

2011! then there's a dribble more info in 2014 (!) that hints that
"--map-by node:pe=3" might do the trick here:

http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/21193

where did "pe" come from? is it the same as slot, hwthread, core?
why does the documentation make snide comments about how the conventional
understanding of "rank" (~ equivalent to process) might not be true?

most of all, when was the break introduced? at this point, I tell people
that 1.4.3 worked, and that everything after that is broken.

recent releases (I tried 1.7.3, 1.7.5 and 1.8.1) choke on this.
I wonder whether it's having trouble with the fact that a job
gets an arbitrary set of cores via cgroup, and perhaps hwloc
doesn't understand that it can only work within this set...
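
one way to check which cores a job has actually been handed is to look
at the affinity mask from inside the job. a rough sketch, assuming Linux
and assuming the cgroup restriction shows up in the mask; allowed_cpus
is a name invented for this example, and this says nothing about what
hwloc or mpirun do internally:

```python
import os

def allowed_cpus():
    """Return the sorted CPU ids this process is allowed to run on."""
    # os.sched_getaffinity is only available on some platforms (e.g. Linux).
    if hasattr(os, "sched_getaffinity"):
        return sorted(os.sched_getaffinity(0))  # 0 = the calling process
    # Fallback (assumption): no restriction is visible, so list every CPU.
    return list(range(os.cpu_count() or 1))

cpus = allowed_cpus()
print(len(cpus), cpus)
```

if the list printed inside the job is shorter than the node's full core
count, the launcher has to fit its ranks into that subset.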

>>> So please see this URL below(especially the first half part
>>> of it - from 1 to 20 pages):
>>> http://www.slideshare.net/jsquyres/open-mpi-explorations-in-process-affinity-eurompi13-presentation
>>>
>>> Although these slides by Jeff are the explanation for LAMA,
>>> which is another mapping system installed in the openmpi-1.7
>>> series, I guess you can easily understand what is mapping and
>>> binding in general terms.

AFAICT, the LAMA slide deck seemed to be only concerned with
affinity settings, which are irrelevant here.

confused,
Mark Hahn.