Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] simple test problem hangs on mpi_finalize and consumes all system resources
From: Fischer, Greg A. (fischega_at_[hidden])
Date: 2014-01-24 11:41:19


Hmm... It looks like CMake was somehow finding openmpi-1.6.5 instead of openmpi-1.4.3, despite the environment variables being set otherwise. This is likely the explanation. I'll try to chase that down.
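
One way to chase this down is to look at which Open MPI install directories the build's CMakeCache.txt actually refers to. The following is only a rough sketch, not a definitive diagnostic: it assumes the install directories are named like "openmpi-1.4.3" (as in the listing quoted below), relies on the standard KEY:TYPE=VALUE layout of cache entries, and the function name openmpi_versions_in_cache is made up for illustration.

```python
import re

# Assumption: install directories are named "openmpi-<version>", as in the
# directory listing quoted in this thread.
OPENMPI_DIR = re.compile(r"openmpi-(\d+(?:\.\d+)+)")


def openmpi_versions_in_cache(cache_text):
    """Map each Open MPI version named in CMakeCache.txt text to the
    cache keys whose values mention it."""
    found = {}
    for line in cache_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines ("//" help text, "#" headers).
        if not line or line.startswith(("//", "#")):
            continue
        # Cache entries have the form KEY:TYPE=VALUE.
        key, sep, value = line.partition("=")
        if not sep:
            continue
        name = key.split(":", 1)[0]
        for version in OPENMPI_DIR.findall(value):
            found.setdefault(version, []).append(name)
    return found
```

Calling it on the contents of the build directory's CMakeCache.txt, for example openmpi_versions_in_cache(open("CMakeCache.txt").read()), should show whether any entry points at the 1.6.5 tree when only 1.4.3 was intended. If more than one version shows up, the cache is mixing installations.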

>-----Original Message-----
>From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Jeff
>Squyres (jsquyres)
>Sent: Friday, January 24, 2014 11:39 AM
>To: Open MPI Users
>Subject: Re: [OMPI users] simple test problem hangs on mpi_finalize and
>consumes all system resources
>
>Ok. I only mention this because the "mca_paffinity_linux.so: undefined
>symbol: mca_base_param_reg_int" type of message is almost always an
>indicator of two different versions being installed into the same tree.
>
>
>On Jan 24, 2014, at 11:26 AM, "Fischer, Greg A."
><fischega_at_[hidden]> wrote:
>
>> Version 1.4.3 and 1.6.5 were and are installed in separate trees:
>>
>> 1003 fischega_at_lxlogin2[~]> ls /tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.*
>> /tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.4.3:
>> bin etc include lib share
>>
>> /tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.6.5:
>> bin etc include lib share
>>
>> I'm fairly sure I was careful to check that the LD_LIBRARY_PATH was set
>> correctly, but I'll check again.
>>
>>> -----Original Message-----
>>> From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Jeff
>>> Squyres (jsquyres)
>>> Sent: Friday, January 24, 2014 11:07 AM
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] simple test problem hangs on mpi_finalize
>>> and consumes all system resources
>>>
>>> On Jan 22, 2014, at 10:21 AM, "Fischer, Greg A."
>>> <fischega_at_[hidden]> wrote:
>>>
>>>> The reason for deleting the openmpi-1.6.5 installation was that I
>>>> went back and installed openmpi-1.4.3 and the problem (mostly) went
>>>> away. Openmpi-1.4.3 can run the simple tests without issue, but on
>>>> my "real" program, I'm getting symbol lookup errors:
>>>>
>>>> mca_paffinity_linux.so: undefined symbol: mca_base_param_reg_int
>>>
>>> This sounds like you are mixing 1.6.x and 1.4.x in the same installation
>>> tree. This can definitely lead to sadness.
>>>
>>> More specifically: installing 1.6 over an existing 1.4 installation
>>> (and vice versa) is definitely NOT supported. The sets of plugins that
>>> the two versions install are different, and mixing them can lead to all
>>> manner of weird/undefined behavior.
>>>
>>> FWIW: I typically install Open MPI into a tree by itself. And if I
>>> later want to remove that installation, I just "rm -rf" that tree.
>>> Then I can install a different version of OMPI into that same tree
>>> (because the prior tree is completely gone).
>>>
>>> However, if you can't install OMPI into a tree by itself, you can
>>> "make uninstall" from the source tree, and that should surgically
>>> and completely remove OMPI from the installation tree. Then it is
>>> safe to install a different version of OMPI into that same tree.
>>>
>>> Can you verify that you had installed OMPI into completely clean
>>> trees? If you didn't, I can imagine that causing the kinds of errors
>>> that you described.
>>>
>>> --
>>> Jeff Squyres
>>> jsquyres_at_[hidden]
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>