Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem launching onto Bourne shell
From: Hahn Kim (hgk_at_[hidden])
Date: 2008-10-07 16:19:33


> you probably want to set the LD_LIBRARY_PATH (and PATH, likely, and
> possibly others, such as that LICENSE key, etc.) regardless of
> whether it's an interactive or non-interactive login.

Right, that's exactly what I want to do. I was hoping that mpirun
would run .profile as the FAQ page stated, but the -x fix works for now.
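
For reference, a minimal sketch of the -x workaround when several variables are involved. The helper name build_x_args and the program name ./my_mpi_app are hypothetical; the host name and the two paths are the ones quoted later in this thread. It only prints the mpirun command, so it can be tried on a machine without Open MPI installed.

```shell
# Hypothetical helper: turn NAME or NAME=value arguments into "-x ..." pairs
# so the list of forwarded variables lives in one place.
build_x_args() {
    args=""
    for item in "$@"; do
        args="$args -x $item"
    done
    # Strip the leading space before printing.
    printf '%s\n' "${args# }"
}

# Explicit values for the Cell node (paths taken from the env output in
# this thread), rather than forwarding the x86 node's own values.
X_ARGS=$(build_x_args \
    "LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32" \
    "MCS_LICENSE_PATH=/opt/MultiCorePlus/mcf.key")

# Print the command instead of running it; ./my_mpi_app is a placeholder.
echo "mpirun $X_ARGS -np 1 --host cab0 ./my_mpi_app"
```

Passing NAME=value sets the value explicitly on the remote side, while a bare NAME forwards whatever the launching shell currently has, which is probably not what is wanted when the two nodes need different library paths.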

I just realized that I'm using .bash_profile on the x86 and need to
move its contents into .bashrc and call .bashrc from .bash_profile,
since eventually I will also be launching MPI jobs onto other x86
processors.
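
A sketch of that .bash_profile/.bashrc layout, assuming bash on the x86 nodes. The library path is a placeholder, and the files are written to a scratch directory so no real dotfiles are touched. The point it illustrates is that the exports come before any interactive-only guard, so a non-interactive shell still gets them.

```shell
# Build the demo in a scratch directory instead of the real $HOME.
demo_home=$(mktemp -d)

# .bashrc: environment first, then the interactive-only guard.
cat > "$demo_home/.bashrc" <<'EOF'
# Needed by every shell, interactive or not (placeholder path).
export LD_LIBRARY_PATH=/path/to/x86/libs
# Stop here for non-interactive shells such as "ssh othernode env".
case $- in
    *i*) ;;
    *) return ;;
esac
# Interactive-only settings (prompt, aliases) go below this line.
PS1='\u@\h:\w\$ '
EOF

# .bash_profile: login shells simply delegate to .bashrc.
cat > "$demo_home/.bash_profile" <<'EOF'
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi
EOF

# Simulate a non-interactive shell that reads .bash_profile.
result=$(HOME="$demo_home" sh -c '. "$HOME/.bash_profile"; echo "$LD_LIBRARY_PATH"')
echo "$result"

rm -rf "$demo_home"
```

If the export were placed after the case statement instead, the non-interactive shell would return before reaching it, which matches the "dumping out early" behavior described below.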

Thanks to everyone for their help.

Hahn

On Oct 7, 2008, at 2:16 PM, Jeff Squyres wrote:

> On Oct 7, 2008, at 12:48 PM, Hahn Kim wrote:
>
>> Regarding 1., we're actually using 1.2.5. We started using Open MPI
>> last winter and just stuck with it. For now, using the -x flag with
>> mpirun works. If this really is a bug in 1.2.7, then I think we'll
>> stick with 1.2.5 for now, then upgrade later when it's fixed.
>
> It looks like this behavior has been the same throughout the entire
> 1.2 series.
>
>> Regarding 2., are you saying I should run the commands you suggest
>> from the x86 node running bash, so that ssh logs into the Cell node
>> running Bourne?
>
> I'm saying that if "ssh othernode env" gives different answers than
> "ssh othernode"/"env", then your .bashrc or .profile or whatever is
> dumping out early depending on whether you have an interactive login
> or not. This is the real cause of the error -- you probably want to
> set the LD_LIBRARY_PATH (and PATH, likely, and possibly others, such
> as that LICENSE key, etc.) regardless of whether it's an interactive
> or non-interactive login.
>
>>
>> When I run "ssh othernode env" from the x86 node, I get the
>> following vanilla environment:
>>
>> USER=ha17646
>> HOME=/home/ha17646
>> LOGNAME=ha17646
>> SHELL=/bin/sh
>> PWD=/home/ha17646
>>
>> When I run "ssh othernode" from the x86 node, then run "env" on the
>> Cell, I get the following:
>>
>> USER=ha17646
>> LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
>> HOME=/home/ha17646
>> MCS_LICENSE_PATH=/opt/MultiCorePlus/mcf.key
>> LOGNAME=ha17646
>> TERM=xterm-color
>> PATH=/usr/local/bin:/usr/bin:/sbin:/bin:/tools/openmpi-1.2.5/bin:/tools/cmake-2.4.7/bin:/tools
>> SHELL=/bin/sh
>> PWD=/home/ha17646
>> TZ=EST5EDT
>>
>> Hahn
>>
>> On Oct 7, 2008, at 12:07 PM, Jeff Squyres wrote:
>>
>>> Ralph and I just talked about this a bit:
>>>
>>> 1. In all released versions of OMPI, we *do* source the .profile file
>>> on the target node if it exists (because vanilla Bourne shells do not
>>> source anything on remote nodes -- Bash does, though, per the FAQ).
>>> However, looking in 1.2.7, it looks like it might not be executing
>>> that code -- there *may* be a bug in this area. We're checking into it.
>>>
>>> 2. You might want to check your configuration to see if your .bashrc
>>> is dumping out early because it's a non-interactive shell. Check the
>>> output of:
>>>
>>> ssh othernode env
>>> vs.
>>> ssh othernode
>>> env
>>>
>>> (i.e., a non-interactive running of "env" vs. an interactive login and
>>> running "env")
>>>
>>>
>>>
>>> On Oct 7, 2008, at 8:53 AM, Ralph Castain wrote:
>>>
>>>> I am unaware of anything in the code that would "source .profile"
>>>> for you. I believe the FAQ page is in error here.
>>>>
>>>> Ralph
>>>>
>>>> On Oct 6, 2008, at 7:47 PM, Hahn Kim wrote:
>>>>
>>>>> Great, that worked, thanks! However, it still concerns me that the
>>>>> FAQ page says that mpirun will execute .profile, which doesn't seem
>>>>> to work for me. Are there any configuration issues that could
>>>>> possibly be preventing mpirun from doing this? It would certainly
>>>>> be more convenient if I could maintain my environment in a
>>>>> single .profile file instead of adding what could potentially be a
>>>>> lot of -x arguments to my mpirun command.
>>>>>
>>>>> Hahn
>>>>>
>>>>> On Oct 6, 2008, at 5:44 PM, Aurélien Bouteiller wrote:
>>>>>
>>>>>> You can forward your local env with mpirun -x LD_LIBRARY_PATH. As an
>>>>>> alternative you can set specific values with mpirun -x
>>>>>> LD_LIBRARY_PATH=/some/where:/some/where/else. More information with
>>>>>> mpirun --help (or man mpirun).
>>>>>>
>>>>>> Aurelien
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Oct 6, 2008, at 4:06 PM, Hahn Kim wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm having difficulty launching an Open MPI job onto a machine that
>>>>>>> is running the Bourne shell.
>>>>>>>
>>>>>>> Here's my basic setup. I have two machines: one is an x86-based
>>>>>>> machine running bash and the other is a Cell-based machine running
>>>>>>> Bourne shell. I'm running mpirun from the x86 machine, which
>>>>>>> launches a C++ MPI application onto the Cell machine. I get the
>>>>>>> following error:
>>>>>>>
>>>>>>> error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory
>>>>>>>
>>>>>>> The basic problem is that LD_LIBRARY_PATH needs to be set to the
>>>>>>> directory that contains libstdc++.so.6 for the Cell. I set the
>>>>>>> following line in .profile:
>>>>>>>
>>>>>>> export LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
>>>>>>>
>>>>>>> which is the path to the PPC libraries for Cell.
>>>>>>>
>>>>>>> Now if I log directly into the Cell machine and run the program
>>>>>>> directly from the command line, I don't get the above error. But
>>>>>>> mpirun still fails, even after setting LD_LIBRARY_PATH in .profile.
>>>>>>>
>>>>>>> As a sanity check, I ran the following command from the x86
>>>>>>> machine:
>>>>>>>
>>>>>>> mpirun -np 1 --host cab0 env
>>>>>>>
>>>>>>> which, among other things, shows me the following value:
>>>>>>>
>>>>>>> LD_LIBRARY_PATH=/tools/openmpi-1.2.5/lib:
>>>>>>>
>>>>>>> If I log into the Cell machine and run env directly from the command
>>>>>>> line, I get the following value:
>>>>>>>
>>>>>>> LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
>>>>>>>
>>>>>>> So it appears that .profile gets sourced when I log in but not when
>>>>>>> mpirun runs.
>>>>>>>
>>>>>>> However, according to the Open MPI FAQ
>>>>>>> (http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path),
>>>>>>> mpirun is supposed to directly call .profile since Bourne shell
>>>>>>> doesn't automatically call it for non-interactive shells.
>>>>>>>
>>>>>>> Does anyone have any insight as to why my environment isn't being
>>>>>>> set properly? Thanks!
>>>>>>>
>>>>>>> Hahn
>>>>>>>
>>>>>>> --
>>>>>>> Hahn Kim, hgk_at_[hidden]
>>>>>>> MIT Lincoln Laboratory
>>>>>>> 244 Wood St., Lexington, MA 02420
>>>>>>> Tel: 781-981-0940, Fax: 781-981-5255
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> users mailing list
>>>>>>> users_at_[hidden]
>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> * Dr. Aurélien Bouteiller
>>>>>> * Sr. Research Associate at Innovative Computing Laboratory
>>>>>> * University of Tennessee
>>>>>> * 1122 Volunteer Boulevard, suite 350
>>>>>> * Knoxville, TN 37996
>>>>>> * 865 974 6321
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Hahn Kim
>>>>> MIT Lincoln Laboratory Phone: (781) 981-0940
>>>>> 244 Wood Street, S2-252 Fax: (781) 981-5255
>>>>> Lexington, MA 02420 E-mail: hgk_at_[hidden]
>>>
>>>
>>> --
>>> Jeff Squyres
>>> Cisco Systems
>>
>> --
>> Hahn Kim, hgk_at_[hidden]
>> MIT Lincoln Laboratory
>> 244 Wood St., Lexington, MA 02420
>> Tel: 781-981-0940, Fax: 781-981-5255
>
>
> --
> Jeff Squyres
> Cisco Systems

--
Hahn Kim, hgk_at_[hidden]
MIT Lincoln Laboratory
244 Wood St., Lexington, MA 02420
Tel: 781-981-0940, Fax: 781-981-5255