Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem launching onto Bourne shell
From: Hahn Kim (hgk_at_[hidden])
Date: 2008-10-07 12:48:14


Thanks for the feedback.

Regarding 1., we're actually using 1.2.5. We started using Open MPI
last winter and just stuck with it. For now, using the -x flag with
mpirun works. If this really is a bug in 1.2.7, then I think we'll
stick with 1.2.5 for now, then upgrade later when it's fixed.
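
For reference, a minimal sketch of how the -x workaround could be kept
manageable when several variables are involved. The helper name x_flags
is hypothetical (it is not part of Open MPI); it only turns a list of
NAME or NAME=value items into the "-x" arguments described elsewhere in
this thread. It assumes the values contain no whitespace.

```shell
#!/bin/sh
# x_flags (hypothetical helper): print "-x ITEM" for every argument,
# separated by single spaces, so the result can be spliced into an
# mpirun command line. ITEM may be NAME (forward the local value) or
# NAME=value (set an explicit value on the remote side).
x_flags() {
    sep=''
    for item in "$@"; do
        printf '%s-x %s' "$sep" "$item"
        sep=' '
    done
    printf '\n'
}

# Illustrative usage only (./my_app is a placeholder):
#   mpirun -np 1 --host cab0 \
#       $(x_flags LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32) \
#       ./my_app
```

Note that a bare "-x LD_LIBRARY_PATH" forwards the value from the x86
node, which is not the Cell path, so the NAME=value form is the one
that matters in this setup.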

Regarding 2., are you saying I should run the commands you suggest
from the x86 node running bash, so that ssh logs into the Cell node
running Bourne?

When I run "ssh othernode env" from the x86 node, I get the following
vanilla environment:

USER=ha17646
HOME=/home/ha17646
LOGNAME=ha17646
SHELL=/bin/sh
PWD=/home/ha17646

When I run "ssh othernode" from the x86 node, then run "env" on the
Cell, I get the following:

USER=ha17646
LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
HOME=/home/ha17646
MCS_LICENSE_PATH=/opt/MultiCorePlus/mcf.key
LOGNAME=ha17646
TERM=xterm-color
PATH=/usr/local/bin:/usr/bin:/sbin:/bin:/tools/openmpi-1.2.5/bin:/tools/cmake-2.4.7/bin:/tools
SHELL=/bin/sh
PWD=/home/ha17646
TZ=EST5EDT
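
The difference between the two listings can also be extracted
mechanically. The following is a rough sketch, assuming each "env"
output has been saved to a file with one NAME=value pair per line (no
multi-line values). The function name missing_vars and the file names
are made up for illustration.

```shell
#!/bin/sh
# missing_vars (hypothetical helper): print the names of variables that
# appear in the login-shell dump ($2) but not in the non-interactive
# dump ($1). Only names are compared, not values.
missing_vars() {
    noninteractive=$1
    login=$2
    a=$(mktemp) && b=$(mktemp) || return 1
    # Keep only the part before the first "=", sorted for comm.
    cut -d= -f1 "$noninteractive" | LC_ALL=C sort -u > "$a"
    cut -d= -f1 "$login" | LC_ALL=C sort -u > "$b"
    # comm -13 keeps lines that occur only in the second file.
    LC_ALL=C comm -13 "$a" "$b"
    rm -f "$a" "$b"
}

# Illustrative usage only:
#   ssh othernode env > noninteractive.env
#   ssh othernode        # then, on the Cell: env > login.env
#   missing_vars noninteractive.env login.env
```

For the two listings above this would report LD_LIBRARY_PATH,
MCS_LICENSE_PATH, PATH, TERM and TZ, i.e. the variables that are only
set when .profile is read at login.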

Hahn

On Oct 7, 2008, at 12:07 PM, Jeff Squyres wrote:

> Ralph and I just talked about this a bit:
>
> 1. In all released versions of OMPI, we *do* source the .profile file
> on the target node if it exists (because vanilla Bourne shells do not
> source anything on remote nodes -- Bash does, though, per the FAQ).
> However, looking in 1.2.7, it looks like it might not be executing
> that code -- there *may* be a bug in this area. We're checking into
> it.
>
> 2. You might want to check your configuration to see if your .bashrc
> is dumping out early because it's a non-interactive shell. Check the
> output of:
>
> ssh othernode env
> vs.
> ssh othernode
> env
>
> (i.e., a non-interactive running of "env" vs. an interactive login and
> running "env")
>
>
>
> On Oct 7, 2008, at 8:53 AM, Ralph Castain wrote:
>
>> I am unaware of anything in the code that would "source .profile"
>> for you. I believe the FAQ page is in error here.
>>
>> Ralph
>>
>> On Oct 6, 2008, at 7:47 PM, Hahn Kim wrote:
>>
>>> Great, that worked, thanks! However, it still concerns me that the
>>> FAQ page says that mpirun will execute .profile which doesn't seem
>>> to work for me. Are there any configuration issues that could
>>> possibly be preventing mpirun from doing this? It would certainly
>>> be more convenient if I could maintain my environment in a
>>> single .profile file instead of adding what could potentially be a
>>> lot of -x arguments to my mpirun command.
>>>
>>> Hahn
>>>
>>> On Oct 6, 2008, at 5:44 PM, Aurélien Bouteiller wrote:
>>>
>>>> You can forward your local env with mpirun -x LD_LIBRARY_PATH. As an
>>>> alternative you can set specific values with mpirun -x
>>>> LD_LIBRARY_PATH=/some/where:/some/where/else . More information with
>>>> mpirun --help (or man mpirun).
>>>>
>>>> Aurelien
>>>>
>>>>
>>>>
>>>> Le 6 oct. 08 à 16:06, Hahn Kim a écrit :
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm having difficulty launching an Open MPI job onto a machine
>>>>> that is running the Bourne shell.
>>>>>
>>>>> Here's my basic setup. I have two machines, one is an x86-based
>>>>> machine running bash and the other is a Cell-based machine running
>>>>> Bourne shell. I'm running mpirun from the x86 machine, which
>>>>> launches a C++ MPI application onto the Cell machine. I get the
>>>>> following error:
>>>>>
>>>>> error while loading shared libraries: libstdc++.so.6: cannot open
>>>>> shared object file: No such file or directory
>>>>>
>>>>> The basic problem is that LD_LIBRARY_PATH needs to be set to the
>>>>> directory that contains libstdc++.so.6 for the Cell. I set the
>>>>> following line in .profile:
>>>>>
>>>>> export LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
>>>>>
>>>>> which is the path to the PPC libraries for Cell.
>>>>>
>>>>> Now if I log directly into the Cell machine and run the program
>>>>> directly from the command line, I don't get the above error. But
>>>>> mpirun still fails, even after setting LD_LIBRARY_PATH
>>>>> in .profile.
>>>>>
>>>>> As a sanity check, I did the following. I ran the following
>>>>> command from the x86 machine:
>>>>>
>>>>> mpirun -np 1 --host cab0 env
>>>>>
>>>>> which, among other things, shows me the following value:
>>>>>
>>>>> LD_LIBRARY_PATH=/tools/openmpi-1.2.5/lib:
>>>>>
>>>>> If I log into the Cell machine and run env directly from the
>>>>> command line, I get the following value:
>>>>>
>>>>> LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
>>>>>
>>>>> So it appears that .profile gets sourced when I log in but not
>>>>> when mpirun runs.
>>>>>
>>>>> However, according to the OpenMPI FAQ
>>>>> (http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path),
>>>>> mpirun is supposed to directly call .profile since Bourne shell
>>>>> doesn't automatically call it for non-interactive shells.
>>>>>
>>>>> Does anyone have any insight as to why my environment isn't being
>>>>> set properly? Thanks!
>>>>>
>>>>> Hahn
>>>>>
>>>>> --
>>>>> Hahn Kim, hgk_at_[hidden]
>>>>> MIT Lincoln Laboratory
>>>>> 244 Wood St., Lexington, MA 02420
>>>>> Tel: 781-981-0940, Fax: 781-981-5255
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>>
>>>>
>>>> --
>>>> * Dr. Aurélien Bouteiller
>>>> * Sr. Research Associate at Innovative Computing Laboratory
>>>> * University of Tennessee
>>>> * 1122 Volunteer Boulevard, suite 350
>>>> * Knoxville, TN 37996
>>>> * 865 974 6321
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Hahn Kim
>>> MIT Lincoln Laboratory Phone: (781) 981-0940
>>> 244 Wood Street, S2-252 Fax: (781) 981-5255
>>> Lexington, MA 02420 E-mail: hgk_at_[hidden]
>>>
>>>
>>>
>>>
>>
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
>

--
Hahn Kim, hgk_at_[hidden]
MIT Lincoln Laboratory
244 Wood St., Lexington, MA 02420
Tel: 781-981-0940, Fax: 781-981-5255