
Subject: Re: [OMPI users] oob mca question
From: Aaron Knister (aaron.knister_at_[hidden])
Date: 2009-11-12 22:09:11


Thanks! I appreciate the response.

On Nov 12, 2009, at 9:54 PM, Ralph Castain wrote:

> That is indeed the expected behavior, and your solution is the
> correct one.
>
> The orted has no way of knowing which interface mpirun can be
> reached on, so it has no choice but to work its way through the
> available ones. Because of the ordering in the way the OS reports
> the interfaces, it is picking up the public one first - so that is
> the first one used.
>
> Telling it the right one to use is the only solution.
>
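For anyone hitting the same delay: the way to tell the OOB layer which interface(s) it may use is the oob_tcp_if_include MCA parameter (there is also an oob_tcp_if_exclude counterpart; as far as I know the two should not be combined in the same run). A minimal sketch, assuming the head node's cluster-facing interface is bond0 and the compute nodes use eth0; the hostfile name and process count are just placeholders:

    mpirun --mca oob_tcp_if_include bond0,eth0 \
           --hostfile ./hosts -np 30 hostname

Each node simply picks whichever of the listed interfaces it actually has, so one list can cover both the head node and the compute nodes.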
> On Nov 12, 2009, at 7:35 PM, Aaron Knister wrote:
>
>> Dear List,
>>
>> I'm having a really weird issue with openmpi - version 1.3.3
>> (version 1.2.8 doesn't seem to exhibit this behavior). Essentially
>> when I start jobs from the cluster front-end node using mpirun,
>> mpirun sits idle for up to a minute and a half (for 30 nodes)
>> before running the command I've given it. Running the same command
>> on any other node in the cluster returns in a fraction of a second.
>> Upon further research it appears it's an issue with the way the
>> orted processes on the compute nodes attempt to talk back to the
>> front-end node. When I launch mpirun from the front-end node, this
>> is the process it spawns on the compute nodes (public IP scrambled
>> for security purposes):
>>
>> orted --daemonize -mca ess env -mca orte_ess_jobid 1816657920 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 3 --hnp-uri 1816657920.0;tcp://130.X.X.X:56866;tcp://172.40.10.1:56866;tcp://172.20.10.1:56866
>>
>> Throwing in some firewall debugging rules indicated that the
>> compute nodes were trying to talk back to mpirun on the front-end
>> node over the front-end node's public IP. Based on this, and on
>> the arguments passed above, it seemed as though the public IP of
>> the front-end node was being tried before any of its private IPs,
>> and the delay I was seeing was orted waiting for the connection to
>> the front-end node's public IP to time out before it tried its
>> cluster-facing IP, at which point the connection succeeded.
>>
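For reference, the kind of firewall debugging rule that makes those attempts visible is a simple LOG rule on the compute nodes. A sketch assuming iptables; 130.X.X.X stands in for the scrubbed public address from the --hnp-uri above, and the port is whatever ephemeral port mpirun is listening on for that particular run (56866 in the trace), so both values are illustrative only:

    # Log outbound TCP connection attempts toward the head node's
    # public address; check the kernel log for the "ompi-oob:" prefix.
    iptables -I OUTPUT -p tcp -d 130.X.X.X --dport 56866 \
             -j LOG --log-prefix "ompi-oob: "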
>> I was able to work around this by specifying "--mca
>> oob_tcp_if_include bond0,eth0" to mpirun (the front-end node has
>> two bonded NICs as its cluster-facing interface). When I provided
>> that argument the delay disappeared. I could easily put that into
>> openmpi-mca-params.conf and be done with the problem, but I would
>> like to know why Open MPI chose to use the node's public IP before
>> its internal IP, and whether this is expected behavior. I suspect
>> that it may not be.
>>
>> -Aaron
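For completeness, making the workaround permanent is a one-line entry in the MCA parameter file, either system-wide in $prefix/etc/openmpi-mca-params.conf or per-user in ~/.openmpi/mca-params.conf; a minimal sketch using the same interface names as above:

    # Restrict the out-of-band (OOB) TCP channel to the cluster-facing
    # interfaces so orted does not try the public address first.
    oob_tcp_if_include = bond0,eth0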