Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] mpirun does not propagate environment from master node to slave nodes
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-06-28 18:10:18


On Jun 28, 2011, at 3:52 PM, yanyg_at_[hidden] wrote:

> Thanks, Ralph!
>
> a) Yes, I know I could use only IB by "--mca btl openib", but just
> want to make sure I am using IB interfaces. I am seeking an option
> to mpirun to print out the actual interconnect protocol, like --prot to
> mpirun in MPICH2.

I'm afraid that option doesn't exist. OMPI will use -only- the interfaces you specify, and will abort if it can't connect processes across at least one of them.
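
As a rough illustration of that point, the sketch below assembles an mpirun command line whose BTL list is the one from the original question with tcp left off, so there is no TCP path to fall back on. It only prints the command for review. The process count and the program name (./my_mpi_app) are hypothetical placeholders, not values from this thread.

```shell
# Sketch only: build (and print, not run) an mpirun command that
# omits the tcp BTL, leaving openib plus the self and sm BTLs.
MPIRUN="mpirun"             # assumes mpirun is on PATH
BTL_LIST="openib,self,sm"   # original list from the question, minus tcp
NP=2                        # hypothetical process count
APP="./my_mpi_app"          # hypothetical application binary

build_ib_only_cmd() {
    # Emit the full command line as a single string.
    printf '%s --mca btl %s -np %s %s\n' "$MPIRUN" "$BTL_LIST" "$NP" "$APP"
}

# Print the command so it can be inspected before being run by hand.
build_ib_only_cmd
```

If the job starts and runs with this list, the traffic between nodes cannot be going over TCP, since that BTL was never offered.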

>
> b) Yes, my default shell is bash, but I run a c-shell script from a bash
> terminal, and mpirun is invoked inside this c-shell script. I am using the
> rsh launcher, exactly as you guessed. I tried different mpirun commands in
> the c-shell script; one of them is
>
> /path/to/bin/mpirun --mca btl openib --app appfile
>
> mpirun and orted are under /path/to/bin, and the necessary libs are
> under /path/to/lib. I tried -x, --prefix, and -path; none of them
> propagated PATH and LD_LIBRARY_PATH as expected. orted is not found
> on the slave nodes, although it should be, since it is on the shared
> NFS partition.

I suspect the code is getting confused by the different shells. I've seen other reports of this, and have observed it myself - suggest you avoid using the c-shell and launch from your default shell. I know that works.
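
A minimal sketch of what launching from the default shell might look like, assuming the /path/to install root quoted above (left elided as in the original), with --prefix pointing the remote nodes at that install and -x exporting LD_LIBRARY_PATH. The process count and program name are hypothetical placeholders, and the snippet only prints the command rather than running it.

```shell
# Sketch only: run these lines from bash (the default login shell),
# not from inside a c-shell script.
PREFIX="/path/to"       # install root as quoted in the thread (elided)
NP=2                    # hypothetical process count
APP="./my_mpi_app"      # hypothetical application binary

build_launch_cmd() {
    # --prefix names the Open MPI install root for the remote nodes;
    # -x exports the named environment variable to the launched processes.
    printf '%s/bin/mpirun --prefix %s -x LD_LIBRARY_PATH --mca btl openib,self,sm -np %s %s\n' \
        "$PREFIX" "$PREFIX" "$NP" "$APP"
}

# Print the command for inspection; paste it into a bash prompt to run it.
build_launch_cmd
```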

>
> Thanks,
> Yiguang
>
>
> On Jun 28, 2011, at 9:05 AM, yanyg_at_[hidden] wrote:
>
>> Hello All,
>>
>> I installed Open MPI 1.4.3 on our new HPC blades, with Infiniband
>> interconnection.
>>
>> My system environments are as:
>>
>> 1) uname -a output:
>> Linux gulftown 2.6.18-194.el5 #1 SMP Tue Mar 16 21:52:39 EDT
>> 2010 x86_64 x86_64 x86_64 GNU/Linux
>>
>> 2) /home is mounted over all nodes, and mpirun is started under
>> /home/...
>>
>> Open MPI and application codes are compiled with intel(R)
>> compilers V11. Infiniband stack is Mellanox OFED 1.5.2.
>>
>> I have two questions about mpirun:
>>
>> a) How can I find out which network interconnect protocol the MPI
>> application is actually using?
>>
>> I specify "--mca btl openib,self,sm,tcp" to mpirun, but I want to
>> make sure it really uses infiniband interconnect.
>
> Why specify tcp if you don't want it used? Just leave that off and it
> will have no choice but to use IB.
>
>
>
>>
>> b) when I run mpirun, I get the following message:
>
>> It seems orted is not found on slave nodes. If I set the PATH and
>> LD_LIBRARY_PATH through --prefix to mpirun, or --path, or -x
>> options to mpirun, to make the orted and related dynamic libs
>> available on slave nodes, it does not work as expected from mpirun
>> manual page. The only working case is that I set PATH and
>> LD_LIBRARY_PATH in ~/.bashrc for mpirun, and this .bashrc is
>> invoked by slave nodes too for login shell. I do not want to set PATH
>> and LD_LIBRARY_PATH in ~/.bashrc, but instead to set options to
>> mpirun directly.
>
> Should work with either prefix or -x options, assuming the right
> syntax with the latter.
>
> I take it your default shell is bash, and that you are using the rsh
> launcher (as opposed to something like torque)? Are you launching
> from your default shell, or did you perhaps change shell?
>
> Can you send the actual mpirun command you typed?