Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)
From: Gus Correa (gus_at_[hidden])
Date: 2014-03-27 14:47:42


PS - The (OMPI 1.6.5) mpiexec default is -bind-to-none,
in which case -report-bindings won't report anything.

So, if you are using the default,
you can apply Joe Landman's suggestion
(or alternatively use the MPI_Get_processor_name function,
in lieu of uname(&uts); cpu_name = uts.nodename; ).
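
If you cannot rebuild the application at all, a small wrapper script may be another way to get the same rank-to-host information. This is only a sketch under assumptions: it relies on Open MPI exporting OMPI_COMM_WORLD_RANK to each process it launches, and the script name (rank_host.sh) and the function name rank_host_line are made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper (rank_host.sh): report "rank -> host", then run the real program.
# Assumption: Open MPI sets OMPI_COMM_WORLD_RANK for every process it launches.

rank_host_line() {
    # Fall back to "unknown" when the script was not started by mpiexec.
    printf 'rank %s ran on host %s\n' "${OMPI_COMM_WORLD_RANK:-unknown}" "$(uname -n)"
}

# Write to stderr so the application's stdout is left untouched.
rank_host_line >&2

# Replace this shell with the real application, if one was given.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

Launched as, say, "mpiexec -np 4 ./rank_host.sh ./my_app" (both file names are placeholders), each rank should print one line before the application starts, so a later "MPI_ABORT was invoked on rank N" can be matched to a node name.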

However, many MPI applications benefit from some type of hardware
binding, and yours may too. As a bonus, once binding is enabled,
-report-bindings will tell you where each rank ran.
mpiexec's -tag-output is also helpful for debugging,
but won't tell you the node name, just the MPI rank.

You can set up many of these options as your preferred defaults
via MCA parameters, and omit them from the mpiexec command line.
The trick is to match each mpiexec option to
the appropriate MCA parameter, as the names are not exactly the same.
"ompi_info --all" may help in that regard.
See this FAQ:
http://www.open-mpi.org/faq/?category=tuning#setting-mca-params
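
As a rough illustration of the environment-variable route described in that FAQ, an MCA parameter can be set by exporting OMPI_MCA_<name> before calling mpiexec. The two parameter names below are my assumption for the equivalents of -report-bindings and -tag-output in the 1.6 series; please confirm them against the "ompi_info --all" output for your installation.

```shell
# Assumed MCA names for -report-bindings and -tag-output;
# verify them with "ompi_info --all" before relying on this.
export OMPI_MCA_orte_report_bindings=1
export OMPI_MCA_orte_tag_output=1

# Show the MCA settings that mpiexec would inherit from this environment.
env | grep '^OMPI_MCA_'
```

The same settings can instead go in $HOME/.openmpi/mca-params.conf, one "name = value" line per parameter and without the OMPI_MCA_ prefix.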

Again, the OMPI FAQ page is your friend! :)
http://www.open-mpi.org/faq/

I hope this helps,
Gus Correa

On 03/27/2014 02:06 PM, Gus Correa wrote:
> Hi John
>
> Take a look at the mpiexec/mpirun options:
>
> -report-bindings (this one should report what you want)
>
> and maybe also:
>
> -bycore, -bysocket, -bind-to-core, -bind-to-socket, ...
>
> and similar, if you want more control over where your MPI processes run.
>
> "man mpiexec" is your friend!
>
> I hope this helps,
> Gus Correa
>
> On 03/27/2014 01:53 PM, Sasso, John (GE Power & Water, Non-GE) wrote:
>> When a piece of software built against OpenMPI fails, I will see an
>> error referring to the rank of the MPI task which incurred the failure.
>> For example:
>>
>> MPI_ABORT was invoked on rank 1236 in communicator MPI_COMM_WORLD
>> with errorcode 1.
>>
>> Unfortunately, I do not have access to the software code, just the
>> installation directory tree for OpenMPI. My question is: Is there a
>> flag that can be passed to mpirun, or an environment variable set, which
>> would reveal the mapping of ranks to the hosts they are on?
>>
>> I do understand that one could have multiple MPI ranks running on the
>> same host, but finding a way to determine which rank ran on which host
>> would go a long way in helping troubleshoot problems which may be
>> tied to a particular host. Thanks!
>>
>> --john
>>
>>
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>