Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)
From: Gus Correa (gus_at_[hidden])
Date: 2014-03-27 15:15:59


Hi John

I just sent a PS message ...

On 03/27/2014 02:41 PM, Sasso, John (GE Power & Water, Non-GE) wrote:
> Thank you, Gus! I did go through the mpiexec/mpirun man pages but
> wasn't quite clear that -report-bindings was what I was looking for.
> So what I did is rerun a program w/ --report-bindings but no bindings
> were reported.
>
> Scratching my head, I decided to include --bind-to-core as well.
> Voila, the bindings are reported!

The OMPI runtime environment is great.
It adds a lot of information and flexibility to what MPI alone provides.

I don't know your code, so it is hard to tell if
-bycore and -bind-to-core are good choices, though.

Here we use those two options for pure MPI jobs.
Minimally you need to make sure there is enough memory per core for each
task, otherwise you may need to skip some cores, to leave enough
RAM for each process (say, with -cpus-per-proc).
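
To make the memory check concrete, here is a minimal Python sketch.
The node size, core count, per-task memory and program name are made-up
numbers, and both helper names are hypothetical. Only the mpiexec
options themselves (-np, -bycore, -bind-to-core, -cpus-per-proc,
-report-bindings) are real, and they follow the 1.6-style syntax.

```python
import math

def cpus_per_proc(node_mem_gb, cores_per_node, task_mem_gb):
    """Cores to reserve per MPI task so each task gets enough RAM."""
    if task_mem_gb > node_mem_gb:
        raise ValueError("task does not fit in one node's memory")
    # Each core's share of RAM is node_mem_gb / cores_per_node; round up
    # to the number of cores whose combined share covers one task.
    return max(1, math.ceil(task_mem_gb * cores_per_node / node_mem_gb))

def mpiexec_args(nprocs, stride, program):
    """Build a 1.6-style mpiexec command line (hypothetical helper)."""
    args = ["mpiexec", "-np", str(nprocs), "-bycore", "-bind-to-core"]
    if stride > 1:
        # Skip cores only when one core's share of RAM is too small.
        args += ["-cpus-per-proc", str(stride)]
    return args + ["-report-bindings", program]

# Made-up example: 64 GB nodes with 16 cores, tasks needing 6 GB each.
stride = cpus_per_proc(64, 16, 6)
print(" ".join(mpiexec_args(8, stride, "./my_app")))
```

Whether the resulting layout is what you want should still be checked
against the -report-bindings output.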

If the code is MPI+OpenMP hybrid you may perhaps use -bysocket and
-bind-to-socket, and set
OMP_NUM_THREADS=<the_number_of_cores_in_one_socket>
(assuming there are no nested OpenMP regions, which would complicate
matters)
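
A rough sketch of such a hybrid launch line, again in Python. The
helper name, the core count and the program name are assumptions for
illustration. The -x option is the mpiexec flag that exports an
environment variable to the launched processes. How many ranks land on
each node is not handled here; that depends on the slots given by the
hostfile or the resource manager.

```python
def hybrid_launch(nprocs, cores_per_socket, program):
    """Socket-bound MPI ranks, one OpenMP thread per core of a socket."""
    return [
        "mpiexec", "-np", str(nprocs),
        "-bysocket", "-bind-to-socket",
        # Export OMP_NUM_THREADS to every launched process.
        "-x", f"OMP_NUM_THREADS={cores_per_socket}",
        "-report-bindings", program,
    ]

# Made-up example: 8 ranks on sockets that have 8 cores each.
print(" ".join(hybrid_launch(8, 8, "./my_hybrid_app")))
```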

You can get finer control with the -rankfile option.
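
For illustration, a small Python sketch that writes a rankfile with one
rank per core. The host names and the 2-socket, 2-core layout are
hypothetical, and the cores are assumed to be numbered 0..n-1 within
each socket, which may not match your hardware or OMPI version. The
line format (rank N=host slot=socket:core) is the one described in
"man mpiexec".

```python
def make_rankfile(hosts, sockets_per_host, cores_per_socket):
    """Return rankfile text placing one rank on every core of every host."""
    lines = []
    rank = 0
    for host in hosts:
        for socket in range(sockets_per_host):
            for core in range(cores_per_socket):
                # Format: rank <N>=<host> slot=<socket>:<core>
                lines.append(f"rank {rank}={host} slot={socket}:{core}")
                rank += 1
    return "\n".join(lines) + "\n"

# Hypothetical host names and a made-up 2-socket, 2-core layout.
print(make_rankfile(["node01", "node02"], 2, 2), end="")
```

A nice side effect is that the rankfile itself documents which rank
runs on which host.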

Apparently all or most of this syntax is changing in
the latest OMPI 1.7.X, though.

>
> Awesome, but now here is my concern.
> If we have OpenMPI-based applications launched as batch jobs
> via a batch scheduler like SLURM, PBS, LSF, etc.
> (which decides the placement of the app and dispatches it to the compute
> hosts),
> then will including "--report-bindings --bind-to-core" cause problems?

I don't know all resource managers and schedulers.

I use Torque+Maui here.
OpenMPI is built with Torque support, and will use the nodes and
cpus/cores provided by Torque.
My understanding is that Torque delegates to OpenMPI the process
placement and binding (beyond the list of nodes/cpus available for
the job).
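
Inside a Torque job, the node list handed to the job is in the file
named by $PBS_NODEFILE, one line per allocated core. The Python sketch
below guesses the rank-to-host mapping from that file. It assumes one
MPI process per listed entry and that ranks fill the entries in order
(the by-slot default), so it is only a guess and will be wrong with
options such as -bynode or -cpus-per-proc. The function name is
hypothetical.

```python
import os

def predicted_rank_hosts(nodefile_lines):
    """Guess rank -> host, assuming ranks fill the listed slots in order."""
    hosts = [line.strip() for line in nodefile_lines if line.strip()]
    return dict(enumerate(hosts))

def main():
    path = os.environ.get("PBS_NODEFILE")
    if path is None:
        print("PBS_NODEFILE is not set; run this inside a Torque job")
        return
    with open(path) as fh:
        for rank, host in predicted_rank_hosts(fh).items():
            print(f"rank {rank} -> {host}")

if __name__ == "__main__":
    main()
```

Compare the result with what -report-bindings prints before trusting it.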

My guess is that OpenPBS behaves the same as Torque.

SLURM and SGE/OGE *probably* have pretty much the same behavior.
A cursory reading of the SLURM web page suggested to me that it
does core binding by default, but don't quote me on that.

I don't know what LSF does, but I would guess there is a
way to do the appropriate bindings, either at the resource manager
level, or at the OpenMPI level (or a combination of both).

> Certainly I can test this, but I am concerned there may be a case where
> inclusion of --bind-to-core would cause an unexpected problem I did
> not account for.
>
> --john
>

Well, testing and failing is part of this game!
Would the GE manager buy that? :)

I hope this helps,
Gus Correa

>
> -----Original Message-----
> From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Gus Correa
> Sent: Thursday, March 27, 2014 2:06 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)
>
> Hi John
>
> Take a look at the mpiexec/mpirun options:
>
> -report-bindings (this one should report what you want)
>
> and maybe also:
>
> -bycore, -bysocket, -bind-to-core, -bind-to-socket, ...
>
> and similar, if you want more control on where your MPI processes run.
>
> "man mpiexec" is your friend!
>
> I hope this helps,
> Gus Correa
>
> On 03/27/2014 01:53 PM, Sasso, John (GE Power & Water, Non-GE) wrote:
>> When a piece of software built against OpenMPI fails, I will see an
>> error referring to the rank of the MPI task which incurred the failure.
>> For example:
>>
>> MPI_ABORT was invoked on rank 1236 in communicator MPI_COMM_WORLD
>>
>> with errorcode 1.
>>
>> Unfortunately, I do not have access to the software code, just the
>> installation directory tree for OpenMPI. My question is: Is there a
>> flag that can be passed to mpirun, or an environment variable set,
>> which would reveal the mapping of ranks to the hosts they are on?
>>
>> I do understand that one could have multiple MPI ranks running on the
>> same host, but finding a way to determine which rank ran on what host
>> would go a long way in helping troubleshoot problems which may be
>> central to the host. Thanks!
>>
>> --john
>>
>>
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>