
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] simple test problem hangs on mpi_finalize and consumes all system resources
From: Gus Correa (gus_at_[hidden])
Date: 2014-01-24 13:26:50


On 01/24/2014 12:50 PM, Fischer, Greg A. wrote:
> Yep. That was the problem. It works beautifully now.
>
> Thanks for prodding me to take another look.
>
> With regards to openmpi-1.6.5, the system that I'm compiling and running on,
> SLES10, contains some pretty dated software (e.g., Linux 2.6.x, Python 2.4,
> gcc 4.1.2). Is it possible there's simply an incompatibility lurking in
> there somewhere that would trip openmpi-1.6.5 but not openmpi-1.4.3?
>
> Greg
>

Hi Greg

FWIW, we have OpenMPI 1.6.5 installed
(and we used OMPI 1.4.5, 1.4.4, 1.4.3, ..., 1.2.8 before that)
on our older cluster, which runs CentOS 5.2, Linux kernel 2.6.18,
gcc 4.1.2, Python 2.4.3, etc.
Parallel programs compile and run with OMPI 1.6.5 without problems.

I hope this helps,
Gus Correa

>> -----Original Message-----
>> From: Fischer, Greg A.
>> Sent: Friday, January 24, 2014 11:41 AM
>> To: 'Open MPI Users'
>> Cc: Fischer, Greg A.
>> Subject: RE: [OMPI users] simple test problem hangs on mpi_finalize and
>> consumes all system resources
>>
>> Hmm... It looks like CMake was somehow finding openmpi-1.6.5 instead of
>> openmpi-1.4.3, despite the environment variables being set otherwise. This
>> is likely the explanation. I'll try to chase that down.
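[CMake's FindMPI module consults cache variables such as MPI_C_COMPILER before searching the PATH, so one way to keep it pinned to a particular tree is a sketch like the following. The /opt prefix below is illustrative, not the actual install path from this thread:]

```cmake
# Point FindMPI at one specific installation's compiler wrappers.
# Substitute the real openmpi-1.4.3 prefix for the illustrative path.
set(MPI_C_COMPILER   /opt/openmpi-1.4.3/bin/mpicc  CACHE FILEPATH "MPI C wrapper")
set(MPI_CXX_COMPILER /opt/openmpi-1.4.3/bin/mpicxx CACHE FILEPATH "MPI C++ wrapper")
find_package(MPI REQUIRED)
```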
>>
>>> -----Original Message-----
>>> From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Jeff
>>> Squyres (jsquyres)
>>> Sent: Friday, January 24, 2014 11:39 AM
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] simple test problem hangs on mpi_finalize and
>>> consumes all system resources
>>>
>>> Ok. I only mention this because the "mca_paffinity_linux.so: undefined
>>> symbol: mca_base_param_reg_int" type of message is almost always an
>>> indicator of two different versions being installed into the same tree.
>>>
>>>
>>> On Jan 24, 2014, at 11:26 AM, "Fischer, Greg A."
>>> <fischega_at_[hidden]> wrote:
>>>
>>>> Version 1.4.3 and 1.6.5 were and are installed in separate trees:
>>>>
>>>> 1003 fischega_at_lxlogin2[~]> ls
>>>> /tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.*
>>>> /tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.4.3:
>>>> bin etc include lib share
>>>>
>>>> /tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.6.5:
>>>> bin etc include lib share
>>>>
>>>> I'm fairly sure I was careful to check that the LD_LIBRARY_PATH was
>>>> set correctly, but I'll check again.
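[One quick way to see which tree actually satisfies the library lookup is to scan LD_LIBRARY_PATH in order, since the dynamic linker takes the first match. The helper below is a made-up sketch, not an OMPI tool; `ompi_info` and `ldd your_app` are the usual real-world checks.]

```shell
# Print the first LD_LIBRARY_PATH entry containing a libmpi, i.e. the
# directory the dynamic linker would resolve Open MPI from.
first_libmpi_dir() {
  local IFS=':'
  for d in $LD_LIBRARY_PATH; do
    # A match here means this directory wins the lookup.
    if ls "$d"/libmpi.so* >/dev/null 2>&1; then
      printf '%s\n' "$d"
      return 0
    fi
  done
  echo 'no libmpi on LD_LIBRARY_PATH' >&2
  return 1
}
```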
>>>>
>>>>> -----Original Message-----
>>>>> From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Jeff
>>>>> Squyres (jsquyres)
>>>>> Sent: Friday, January 24, 2014 11:07 AM
>>>>> To: Open MPI Users
>>>>> Subject: Re: [OMPI users] simple test problem hangs on mpi_finalize
>>>>> and consumes all system resources
>>>>>
>>>>> On Jan 22, 2014, at 10:21 AM, "Fischer, Greg A."
>>>>> <fischega_at_[hidden]> wrote:
>>>>>
>>>>>> The reason for deleting the openmpi-1.6.5 installation was that I
>>>>>> went back and installed openmpi-1.4.3 and the problem (mostly) went
>>>>>> away. Openmpi-1.4.3 can run the simple tests without issue, but on
>>>>>> my "real" program, I'm getting symbol lookup errors:
>>>>>>
>>>>>> mca_paffinity_linux.so: undefined symbol: mca_base_param_reg_int
>>>>>
>>>>> This sounds like you are mixing 1.6.x and 1.4.x in the same
>>>>> installation tree. This can definitely lead to sadness.
>>>>>
>>>>> More specifically: installing 1.6 over an existing 1.4 installation
>>>>> (and vice versa) is definitely NOT supported. The sets of plugins
>>>>> that the two install are different, and mixing them can lead to all
>>>>> manner of weird/undefined behavior.
>>>>>
>>>>> FWIW: I typically install Open MPI into a tree by itself. And if I
>>>>> later want to remove that installation, I just "rm -rf" that tree.
>>>>> Then I can install a different version of OMPI into that same tree
>>>>> (because the prior tree is completely gone).
>>>>>
>>>>> However, if you can't install OMPI into a tree by itself, you can run
>>>>> "make uninstall" from the source tree, and that should surgically
>>>>> remove OMPI from the installation tree. Then it is safe to install a
>>>>> different version of OMPI into that same tree.
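[The "one prefix per version" layout described above can be sketched with throwaway directories; `mktemp` stands in for a real install prefix, and the plugin filenames are only illustrative:]

```shell
# Simulate two Open MPI versions installed under separate prefixes.
# Real installs would come from `./configure --prefix=... && make install`.
base=$(mktemp -d)
mkdir -p "$base/openmpi-1.4.3/lib/openmpi" "$base/openmpi-1.6.5/lib/openmpi"
touch "$base/openmpi-1.4.3/lib/openmpi/mca_paffinity_linux.so"
touch "$base/openmpi-1.6.5/lib/openmpi/mca_btl_tcp.so"
# Removing one version's whole tree cannot leave stale plugins behind in
# the other, which is what makes `rm -rf <prefix>` safe here:
rm -rf "$base/openmpi-1.6.5"
ls "$base"
```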
>>>>>
>>>>> Can you verify that you had installed OMPI into completely clean
>>>>> trees? If you didn't, I can imagine that causing the kinds of errors
>>>>> that you described.
>>>>>
>>>>> --
>>>>> Jeff Squyres
>>>>> jsquyres_at_[hidden]
>>>>> For corporate legal information go to:
>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>
>