Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)
From: Sasso, John (GE Power & Water, Non-GE) (John1.Sasso_at_[hidden])
Date: 2014-03-27 16:06:23


Yes, I noticed that I could not find --display-map in any of the man pages. Intentional?

-----Original Message-----
From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Gus Correa
Sent: Thursday, March 27, 2014 3:26 PM
To: Open MPI Users
Subject: Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

On 03/27/2014 03:02 PM, Ralph Castain wrote:
> Or use --display-map to see the process to node assignments
>

Aha!
That one was not on my radar.
Maybe because somehow I can't find it in the OMPI 1.6.5 mpiexec man page.
However, it seems to work with that version also, which is great.
(--display-map goes to stdout, whereas -report-bindings goes to stderr,
right?)
Thanks, Ralph!

Gus Correa

> Sent from my iPhone
>
>> On Mar 27, 2014, at 11:47 AM, Gus Correa <gus_at_[hidden]> wrote:
>>
>> PS - The (OMPI 1.6.5) mpiexec default is -bind-to-none, in which case
>> -report-bindings won't report anything.
>>
>> So, if you are using the default,
>> you can apply Joe Landman's suggestion (or alternatively use the
>> MPI_Get_processor_name function, in lieu of uname(&uts); cpu_name =
>> uts.nodename; ).
>>
>> However, many MPI applications benefit from some type of hardware
>> binding, and yours may too. As a bonus, -report-bindings will then
>> tell you where each rank ran.
>> mpiexec's -tag-output is also helpful for debugging, but won't tell
>> you the node name, just the MPI rank.
>>
>> You can set up a lot of these things as your preferred defaults, via
>> mca parameters, and omit them from the mpiexec command line.
>> The trick is to match each mpiexec option to the appropriate mca
>> parameter, as the names are not exactly the same.
>> "ompi_info --all" may help in that regard.
>> See this FAQ:
>> http://www.open-mpi.org/faq/?category=tuning#setting-mca-params
>>
>> Again, the OMPI FAQ page is your friend! :)
>> http://www.open-mpi.org/faq/
>>
>> I hope this helps,
>> Gus Correa
>>
>>> On 03/27/2014 02:06 PM, Gus Correa wrote:
>>> Hi John
>>>
>>> Take a look at the mpiexec/mpirun options:
>>>
>>> -report-bindings (this one should report what you want)
>>>
>>> and maybe also:
>>>
>>> -bycore, -bysocket, -bind-to-core, -bind-to-socket, ...
>>>
>>> and similar, if you want more control over where your MPI processes run.
>>>
>>> "man mpiexec" is your friend!
>>>
>>> I hope this helps,
>>> Gus Correa
>>>
>>>> On 03/27/2014 01:53 PM, Sasso, John (GE Power & Water, Non-GE) wrote:
>>>> When a piece of software built against OpenMPI fails, I will see an
>>>> error referring to the rank of the MPI task which incurred the failure.
>>>> For example:
>>>>
>>>> MPI_ABORT was invoked on rank 1236 in communicator MPI_COMM_WORLD
>>>>
>>>> with errorcode 1.
>>>>
>>>> Unfortunately, I do not have access to the software code, just the
>>>> installation directory tree for OpenMPI. My question is: Is there
>>>> a flag that can be passed to mpirun, or an environment variable
>>>> set, which would reveal the mapping of ranks to the hosts they are on?
>>>>
>>>> I do understand that one could have multiple MPI ranks running on
>>>> the same host, but finding a way to determine which rank ran on
>>>> what host would go a long way in helping troubleshoot problems
>>>> which may be central to the host. Thanks!
>>>>
>>>> --john
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>
