Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
From: Craig Tierney (Craig.Tierney_at_[hidden])
Date: 2008-10-14 17:39:51


Reuti wrote:
>
> On 14.10.2008 at 23:18, Craig Tierney wrote:
>
>> Ralph Castain wrote:
>>> I -think- there is...at least here, it does seem to behave that way
>>> on our systems. Not sure if there is something done locally to make
>>> it work.
>>> Also, though, I have noted that LD_LIBRARY_PATH does seem to be
>>> getting forwarded on the 1.3 branch in some environments. OMPI isn't
>>> doing it directly to the best of my knowledge, but I think the base
>>> environment might be. Specifically, I noticed it on slurm earlier
>>> today. I'll check the others as far as I can.
>>> Craig: what environment are you using? ssh?
>>> Ralph
>>
>> We are using ssh (we do not use tight integration in SGE).
>
> Hi Craig, may I ask why? Did you compile Open MPI without SGE support?
> As far as I know, in 1.2.7 it is built in by default. - Reuti
>
>

Only because we don't have it turned on. When we first started using
SGE around 2002, we did not use tight integration. Enabling it is on our
list of things to do, but it is not trivial to turn on and validate. We
compiled all versions of OpenMPI we have used (1.2.4, 1.2.6, and 1.2.7)
with --without-gridengine.

Craig

>>
>> Craig
>>
>>
>>
>>
>>> On Oct 14, 2008, at 1:18 PM, George Bosilca wrote:
>>>> I use modules too, but they only work locally. Or is there a feature
>>>> in "module" to automatically load the list of currently loaded local
>>>> modules remotely?
>>>>
>>>> george.
>>>>
>>>> On Oct 14, 2008, at 3:03 PM, Ralph Castain wrote:
>>>>
>>>>> You might consider using something like "module" - we use that
>>>>> system for exactly this reason. Works quite well and solves the
>>>>> multiple compiler issue.
>>>>>
>>>>> Ralph
>>>>>
>>>>> On Oct 14, 2008, at 12:56 PM, Craig Tierney wrote:
>>>>>
>>>>>> George Bosilca wrote:
>>>>>>> The option to expand the remote LD_LIBRARY_PATH, in such a way
>>>>>>> that Open MPI related applications have their dependencies
>>>>>>> satisfied, is in the trunk. The fact that the compiler requires
>>>>>>> some LD_LIBRARY_PATH is out of the scope of an MPI
>>>>>>> implementation, and I don't think we should take care of it.
>>>>>>> Passing the local LD_LIBRARY_PATH to the remote nodes doesn't
>>>>>>> make much sense. There are plenty of environments where the head
>>>>>>> node has a different configuration than the compute nodes.
>>>>>>> Again, in this case my original solution seems not that bad. If
>>>>>>> you copy (or make a link if you prefer) in the Open MPI lib
>>>>>>> directory to the compiler shared libraries, this will work.
>>>>>>> george.
>>>>>>
>>>>>> This does work. It just increases maintenance for each new version
>>>>>> of OpenMPI. How often does a head node have a different
>>>>>> configuration than the compute nodes? It would seem that a
>>>>>> heterogeneous configuration like the one you described is an even
>>>>>> stronger argument for passing LD_LIBRARY_PATH to the OpenMPI tools.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Craig
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Oct 14, 2008, at 12:11 PM, Craig Tierney wrote:
>>>>>>>> George Bosilca wrote:
>>>>>>>>> Craig,
>>>>>>>>> This is a problem with the Intel libraries and not the Open MPI
>>>>>>>>> ones. You have to somehow make these libraries available on the
>>>>>>>>> compute nodes.
>>>>>>>>> What I usually do (but it's not the best way to solve this
>>>>>>>>> problem) is to copy these libraries somewhere in my home area
>>>>>>>>> and add that directory to my LD_LIBRARY_PATH.
>>>>>>>>> george.
>>>>>>>>
>>>>>>>> This is ok when you only ever use one compiler, but it isn't
>>>>>>>> very flexible. I want to keep it as simple as possible for my
>>>>>>>> users, while having a maintainable system.
>>>>>>>>
>>>>>>>> The libraries are on the compute nodes; the problem is
>>>>>>>> supporting multiple versions of compilers. I can't just list
>>>>>>>> all of the lib paths in ld.so.conf, because then the user will
>>>>>>>> never get the correct one. I can't specify a static
>>>>>>>> LD_LIBRARY_PATH for the same reason. I would prefer not to
>>>>>>>> build my system libraries statically.
>>>>>>>>
>>>>>>>> To the OpenMPI developers, what is your opinion on changing
>>>>>>>> orterun/mpirun to pass LD_LIBRARY_PATH to the remote hosts when
>>>>>>>> starting OpenMPI processes?
>>>>>>>> By hand, all that would be done is:
>>>>>>>>
>>>>>>>> env LD_LIBRARY_PATH=$LD_LIBRARY_PATH $OPMIPATH/orted <args>
>>>>>>>>
>>>>>>>> This would ensure that orted is launched correctly.
>>>>>>>>
>>>>>>>> Or is it better to just build the OpenMPI tools statically? We
>>>>>>>> also use other compilers (PGI, Lahey) so I need a solution that
>>>>>>>> works for all of them.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Craig
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Oct 10, 2008, at 6:17 PM, Craig Tierney wrote:
>>>>>>>>>> I am having problems launching openmpi jobs on my system. I
>>>>>>>>>> support multiple versions of MPI and compilers using GNU
>>>>>>>>>> Modules. For the default compiler, everything is fine. For
>>>>>>>>>> non-default, I am having problems.
>>>>>>>>>>
>>>>>>>>>> I built Openmpi-1.2.6 (and 1.2.7) with the following configure
>>>>>>>>>> options:
>>>>>>>>>>
>>>>>>>>>> # module load intel/10.1
>>>>>>>>>> # ./configure CC=icc CXX=icpc F77=ifort FC=ifort F90=ifort \
>>>>>>>>>>     --prefix=/opt/openmpi/1.2.7-intel-10.1 --without-gridengine \
>>>>>>>>>>     --enable-io-romio \
>>>>>>>>>>     --with-io-romio-flags=--with-file-sys=nfs+ufs \
>>>>>>>>>>     --with-openib=/opt/hjet/ofed/1.3.1
>>>>>>>>>>
>>>>>>>>>> When I launch a job, I run the module command for the right
>>>>>>>>>> compiler/MPI version to set the paths correctly. Mpirun passes
>>>>>>>>>> LD_LIBRARY_PATH to the executable I am launching, but not to
>>>>>>>>>> orted.
>>>>>>>>>>
>>>>>>>>>> When orted is launched on the remote system, LD_LIBRARY_PATH
>>>>>>>>>> is not carried along, and the Intel 10.1 libraries can't be found.
>>>>>>>>>>
>>>>>>>>>> /opt/openmpi/1.2.7-intel-10.1/bin/orted: error while loading
>>>>>>>>>> shared libraries: libintlc.so.5: cannot open shared object
>>>>>>>>>> file: No such file or directory
>>>>>>>>>>
>>>>>>>>>> How do others solve this problem?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Craig
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Craig Tierney (craig.tierney_at_[hidden])
>>>>>>>>>> _______________________________________________
>>>>>>>>>> users mailing list
>>>>>>>>>> users_at_[hidden]
>>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Craig Tierney (craig.tierney_at_[hidden])
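
The by-hand launch quoted in the thread (env LD_LIBRARY_PATH=$LD_LIBRARY_PATH ... orted <args>) works because env places the variable in the environment of the one command it starts. The following is only a minimal sketch of that mechanism, not what mpirun does internally. It makes several assumptions: the library directory is a hypothetical path, a child shell stands in for orted, and "env -i" only roughly approximates the bare environment a non-interactive ssh login provides.

```shell
#!/bin/sh
# Hypothetical compiler library directory, used only for illustration.
compiler_lib="/opt/intel/example/lib"

# Stand-in for orted: a child shell that reports the LD_LIBRARY_PATH it inherited.
child='printf "%s\n" "${LD_LIBRARY_PATH:-unset}"'

# Without forwarding: the child starts from an empty environment
# (assumption: this approximates a remote launch over ssh).
without=$(env -i /bin/sh -c "$child")

# With forwarding: env sets the variable for this one child only,
# which is the same idea as the by-hand orted launch in the thread.
with=$(env -i LD_LIBRARY_PATH="$compiler_lib" /bin/sh -c "$child")

echo "without forwarding: $without"
echo "with forwarding:    $with"
```

The first child reports "unset" and the second reports the forwarded directory, so a dynamically linked daemon started the second way would be able to resolve compiler runtime libraries from that path.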