
Subject: Re: [OMPI users] shared libraries issue compiling 1.3.1/intel 10.1.022
From: Francesco Pietra (chiendarret_at_[hidden])
Date: 2009-04-14 14:53:26


mpirun -x LD_LIBRARY_PATH -host tya64 connectivity_c

complained that libimf.so was not found, just as it does without "-x
LD_LIBRARY_PATH" (I also tried giving the full path in PATH, with the
same error)

while

# dpkg --search libimf.so
/opt/intel/fce/10.1.022/lib/libimf.so
/opt/intel/fce/10.1.022/lib/libimf.so

All of the above is on a Tyan S2895 with Opterons (Debian amd64 lenny).
On the same motherboard and OS, a gcc/g++/ifort cross compilation
passed the connectivity (and hello) tests.
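One way to narrow this down is to ask the runtime linker directly which dependency it cannot resolve. A minimal sketch, using /bin/ls purely as a stand-in for the real binary, and meant to be run on the node that reports the error (e.g. over ssh to tya64):

```shell
# Ask the runtime linker which shared-library dependencies resolve.
# /bin/ls is only a stand-in: substitute the real binary (connectivity_c,
# or orted itself) and run this on the failing node, e.g. via ssh.
ldd /bin/ls | grep 'not found' || echo "all shared libraries resolved"
```

Any "not found" line names the exact library the loader misses. Keep in mind that a remotely launched orted inherits the environment of the non-interactive remote shell, which need not match an interactive login.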
-------------

On a Supermicro 4-socket Opteron (same OS) even the cross compilation
failed. In contrast, a gcc/g++/gfortran compilation was successful on
the connectivity (and hello) tests; however, gfortran cannot compile
the faster code of the suite I am interested in (Amber10).
--------------
I came across the following:

"dynamic linkage is also a headache in that the mechanisms
used to find shared libraries during dynamic loading are not all that robust
on Linux systems running MPICH or other MPI packages
.................... for the compilers that use compiler shared
libraries (ifort, pathscale), we use LD_LIBRARY_PATH during
configuration to set an -rpath
linkage option, which is reliably available in the executable."

Does that mean adding a flag such as

-rpath=LD_LIBRARY_PATH

when compiling both Open MPI and Amber? I can't find examples of the
correct syntax.

thanks
francesco

On Fri, Apr 10, 2009 at 6:27 PM, Mostyn Lewis <Mostyn.Lewis_at_[hidden]> wrote:
> If you want to find libimf.so, which is a shared INTEL library,
> pass the library path with a -x on mpirun
>
> mpirun .... -x LD_LIBRARY_PATH ....
>
> DM
>
>
> On Fri, 10 Apr 2009, Francesco Pietra wrote:
>
>> Hi Gus:
>>
>> If you feel that the observations below are not relevant to openmpi,
>> please disregard the message. You have already kindly devoted so much
>> time to my problems.
>>
>> The "limits.h" issue is solved with 10.1.022 intel compilers: as I
>> felt, the problem was with the pre-10.1.021 version of the intel C++
>> and ifort compilers, a subtle bug observed also by gentoo people (web
>> intel). There remains an orted issue.
>>
>> The openmpi 1.3.1 installation was able to compile connectivity_c.c
>> and hello_c.c, though, running mpirun (output below between ===):
>>
>> =================
>> /usr/local/bin/mpirun -host -n 4 connectivity_c 2>&1 | tee
>> connectivity.out
>> /usr/local/bin/orted: error while loading shared libraries: libimf.so:
>> cannot open shared object file: No such file or directory
>> --------------------------------------------------------------------------
>> A daemon (pid 8472) died unexpectedly with status 127 while attempting
>> to launch so we are aborting.
>>
>> There may be more information reported by the environment (see above).
>>
>> This may be because the daemon was unable to find all the needed shared
>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>> location of the shared libraries on the remote nodes and this will
>> automatically be forwarded to the remote nodes.
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> mpirun noticed that the job aborted, but has no info as to the process
>> that caused that situation.
>> --------------------------------------------------------------------------
>> mpirun: clean termination accomplished
>> =============
>>
>> At this point, Amber10 serial compiled nicely (all intel, like
>> openmpi), but parallel compilation, as expected, returned the same
>> problem above:
>>
>> =================
>> export TESTsander=/usr/local/amber10/exe/sander.MPI; make
>> test.sander.BASIC
>> make[1]: Entering directory `/usr/local/amber10/test'
>> cd cytosine && ./Run.cytosine
>> orted: error while loading shared libraries: libimf.so: cannot open
>> shared object file: No such file or directory
>> --------------------------------------------------------------------------
>> A daemon (pid 8371) died unexpectedly with status 127 while attempting
>> to launch so we are aborting.
>>
>> There may be more information reported by the environment (see above).
>>
>> This may be because the daemon was unable to find all the needed shared
>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>> location of the shared libraries on the remote nodes and this will
>> automatically be forwarded to the remote nodes.
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> mpirun noticed that the job aborted, but has no info as to the process
>> that caused that situation.
>> --------------------------------------------------------------------------
>> mpirun: clean termination accomplished
>>
>>  ./Run.cytosine:  Program error
>> make[1]: *** [test.sander.BASIC] Error 1
>> make[1]: Leaving directory `/usr/local/amber10/test'
>> make: *** [test.sander.BASIC.MPI] Error 2
>> =====================
>>
>> Relevant info:
>>
>> The daemon was not ssh (so my hypothesis that a firewall on the
>> router was killing ssh does not hold). During these procedures,
>> only deb64 and deb32 were on the local network. On the monoprocessor
>> deb32 (i386) there is nothing of openmpi or amber, only ssh. Thus, my
>> .bashrc on deb32 can't match that of deb64 as far as
>> libraries are concerned.
>>
>> echo $LD_LIBRARY_PATH
>>
>> /opt/intel/mkl/10.1.2.024/lib/em64t:/opt/intel/cce/10.1.022/lib:/opt/intel/fce/10.1.022/lib:/usr/local/lib
>>
>> # dpkg --search libimf.so
>> intel-iforte101022: /opt/intel/fce/10.1.022/lib/libimf.so
>> intel-icce101022: /opt/intel/cce/10.1.022/lib/libimf.so
>>
>> i.e., libimf.so is on the library path, yet still not found by mpirun.
>>
>> Before compiling I tried to carefully check all env variables and
>> paths. In particular, as to MPI:
>>
>> mpif90 -show
>> /opt/intel/fce/10.1.022//bin/ifort -I/usr/local/include
>> -pthread -I/usr/local/lib -L/usr/local/lib -lmpi_f90 -lmpi_f77 -lmpi
>> -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil
>>
>> thanks
>> francesco
>>
>>
>>
>> On Thu, Apr 9, 2009 at 9:29 PM, Gus Correa <gus_at_[hidden]> wrote:
>>>
>>> Hi Francesco
>>>
>>> Francesco Pietra wrote:
>>>>
>>>> Hi:
>>>> Since the failure to find "limits.h" in my attempted compilations of
>>>> Amber over the past few days (amd64 lenny, openmpi 1.3.1, intel
>>>> compilers 10.1.015) is probably (or so I hope) a bug in that version
>>>> of the intel compilers (with debian I made the same observations
>>>> reported for gentoo,
>>>> http://software.intel.com/en-us/forums/intel-c-compiler/topic/59886/),
>>>>
>>>> I made a deb package of 10.1.022, icc and ifort.
>>>>
>>>> ./configure CC icc, CXX icp,
>>>
>>> The Intel C++ compiler is called icpc, not icp.
>>> Is this a typo on your message, or on the actual configure options?
>>>
>>> F77 and FC ifort --with-libnuma=/usr (not
>>>>
>>>> /usr/lib so that the numa.h issue is not raised), "make clean",
>>>
>>> If you really did "make clean" you may have removed useful things.
>>> What did you do, "make" or "make clean"?
>>>
>>> and
>>>>
>>>> "make install" gave no error signals. However, the compilation tests in
>>>> the examples did not pass and I really don't understand why.
>>>>
>>>
>>> Which compilation tests are you talking about?
>>> From Amber or from the OpenMPI example programs (connectivity_c and
>>> hello_c), or both?
>>>
>>>> The error, with both connectivity_c and hello_c (I was operating on
>>>> the parallel computer deb64 directly and have access to everything
>>>> there) was failure to find a shared library, libimf.so
>>>>
>>>
>>> To get the right Intel environment,
>>> you need to put these commands inside your login files
>>> (.bashrc or .cshrc), to setup the Intel environment variables correctly:
>>>
>>> source /path/to/your/intel/cce/bin/iccvars.sh
>>> source /path/to/your/intel/cce/bin/ifortvars.sh
>>>
>>> and perhaps a similar one for mkl.
>>> (I don't use MKL, I don't know much about it).
>>>
>>> If your home directory is NFS mounted to all the computers you
>>> use to run parallel programs,
>>> then the same .bashrc/.cshrc will work on all computers.
>>> However, if you have a separate home directory on each computer,
>>> then you need to do this on each of them.
>>> I.e., you have to include the "source" commands above
>>> in the .bashrc/.cshrc files on your home directory in EACH computer.
>>>
>>> Also I presume you use bash/sh not tcsh/csh, right?
>>> Otherwise you need to source iccvars.csh instead of iccvars.sh.
>>>
>>>
>>>> # dpkg --search libimf.so
>>>>   /opt/intel/fce/10.1.022/lib/libimf.so  (the same for cce)
>>>>
>>>> which path seems to be correctly sourced by iccvars.sh and
>>>> ifortvars.sh (incidentally, both files are -rw-r--r-- root root;
>>>> correct that non executable?)
>>>>
>>>
>>> The permissions here are not a problem.
>>> You are supposed to *source* the files, not to execute them.
>>> If you execute them instead of sourcing the files,
>>> your Intel environment will *NOT* be setup.
>>>
>>> BTW, the easy way to check your environment is to type "env" on the
>>> shell command prompt.
>>>
>>>> echo $LD_LIBRARY_PATH
>>>> returned, inter alia,
>>>>
>>>>
>>>> /opt/intel/mkl/10.1.2.024/lib/em64t:/opt/intel/mkl/10.1.2.024/lib/em64t:/opt/intel/cce/10.1.022/lib:/opt/intel/fce/10.1.022/lib
>>>> (why twice the mkl?)
>>>>
>>>
>>> Hard to tell in which computer you were when you did this,
>>> and hence what it should affect.
>>>
>>> You may have sourced the MKL environment script twice, which would
>>> write its library path to LD_LIBRARY_PATH more than once.
>>>
>>> When the environment variables get this much confused,
>>> with duplicate paths and so on, you may want to logout
>>> and login again, to start fresh.
>>>
>>> Do you need MKL for Amber?
>>> If you don't use it, keep things simple, and don't bother about it.
>>>
>>>
>>>> I surely miss to understand something fundamental. Hope other eyes see
>>>> better
>>>>
>>>
>>> Jody helped you run the hello_c program successfully.
>>> Try to repeat carefully the same steps.
>>> You should get the same result,
>>> the OpenMPI test programs should run.
>>>
>>>> A kind person elsewhere suggested to me in passing: "The use of -rpath
>>>> during linking is highly recommended as opposed to setting
>>>> LD_LIBRARY_PATH at run time, not the least because it hardcodes the
>>>> paths to the "right" library files in the executables themselves"
>>>> Should this be relevant to the present issue, where to learn about
>>>> -rpath linking?
>>>>
>>>
>>> If you are talking about Amber,
>>> you would have to tweak the Makefiles to set the linker -rpath.
>>> And we don't know much about Amber's Makefiles,
>>> so this may be a very tricky approach.
>>>
>>> If you are talking about the OpenMPI test programs,
>>> I think it is just a matter of setting the Intel environment variables
>>> right, sourcing the ifortvars.sh iccvars.sh properly,
>>> to get the right runtime LD_LIBRARY_PATH.
>>>
>>>> thanks
>>>> francesco pietra
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>> I hope this helps.
>>> Gus Correa
>>>
>>> ---------------------------------------------------------------------
>>> Gustavo Correa
>>> Lamont-Doherty Earth Observatory - Columbia University
>>> Palisades, NY, 10964-8000 - USA
>>> ---------------------------------------------------------------------
>>>
>>
>>
>
>