Well, this is a little strange. The hanging behavior is gone, but I'm getting a segfault now. The outputs of "hello_c.c" and "ring_c.c" are attached.
I'm getting a segfault with the Fortran test, also. I'm afraid I may have polluted the experiment by removing the target openmpi-1.6.5 installation directory yesterday. To produce the attached outputs, I just went back and did "make install" in the openmpi-1.6.5 build directory. I've reset the environment variables as they were a few days ago by sourcing the same bash script. Perhaps I forgot something, or something on the system changed? Regardless, LD_LIBRARY_PATH and PATH are set correctly, and the aberrant behavior persists.
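For anyone wanting to double-check the LD_LIBRARY_PATH part, here is a minimal sketch in C of one way to see which directories on a colon-separated search path contain an MPI shared library. The function names (dir_has_libmpi, count_libmpi_dirs) are hypothetical, and the sketch assumes the library file name starts with "libmpi.so"; it only lists candidates and does not reproduce the dynamic linker's actual resolution rules.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `dir` contains an entry whose name
 * starts with "libmpi.so" (an assumed naming pattern), 0 otherwise.
 * Directories that cannot be opened are treated as "not found". */
int dir_has_libmpi(const char *dir)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return 0;

    int found = 0;
    struct dirent *entry;
    while ((entry = readdir(d)) != NULL) {
        if (strncmp(entry->d_name, "libmpi.so", 9) == 0) {
            found = 1;
            break;
        }
    }
    closedir(d);
    return found;
}

/* Hypothetical helper: walks a colon-separated path list (for example the
 * value of LD_LIBRARY_PATH), prints every directory that contains a
 * libmpi.so* entry, and returns how many such directories were found.
 * Empty path components are skipped by strtok. */
int count_libmpi_dirs(const char *path_list)
{
    if (path_list == NULL)
        return 0;

    /* strtok modifies its input, so work on a private copy. */
    char *copy = malloc(strlen(path_list) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, path_list);

    int count = 0;
    for (char *tok = strtok(copy, ":"); tok != NULL; tok = strtok(NULL, ":")) {
        if (dir_has_libmpi(tok)) {
            printf("libmpi found in: %s\n", tok);
            count++;
        }
    }
    free(copy);
    return count;
}
```

Calling count_libmpi_dirs(getenv("LD_LIBRARY_PATH")) from a small main() would list the matching directories in search order. A result greater than 1 would suggest that more than one MPI installation is visible on the path.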
The reason for deleting the openmpi-1.6.5 installation was that I went back and installed openmpi-1.4.3 and the problem (mostly) went away. Openmpi-1.4.3 can run the simple tests without issue, but on my "real" program, I'm getting symbol lookup errors:
mca_paffinity_linux.so: undefined symbol: mca_base_param_reg_int
Perhaps that belongs in a separate thread.
>From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Jeff
>Sent: Tuesday, January 21, 2014 3:57 PM
>To: Open MPI Users
>Subject: Re: [OMPI users] simple test problem hangs on mpi_finalize and
>consumes all system resources
>Just for giggles, can you repeat the same test but with hello_c.c and ring_c.c?
>I.e., let's get the Fortran out of the way and use just the base C bindings, and
>see what happens.
>On Jan 19, 2014, at 6:18 PM, "Fischer, Greg A." <fischega_at_[hidden]>
>> I just tried running "hello_f90.f90" and see the same behavior: 100% CPU
>usage, gradually increasing memory consumption, and failure to get past
>mpi_finalize. LD_LIBRARY_PATH is set as:
>> The installation target for this version of OpenMPI is:
>> which mpirun
>> Perhaps something strange is happening with GCC? I've tried simple hello
>world C and Fortran programs, and they work normally.
>> From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Ralph
>> Sent: Sunday, January 19, 2014 11:36 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] simple test problem hangs on mpi_finalize
>> and consumes all system resources
>> The OFED warning about registration is something OMPI added at one point
>when we isolated the cause of jobs occasionally hanging, so you won't see
>that warning from other MPIs or earlier versions of OMPI (I forget exactly
>when we added it).
>> The problem you describe doesn't sound like an OMPI issue - it sounds like
>you've got a memory corruption problem in the code. Have you tried running
>the examples in our example directory to confirm that the installation is
>> Also, check to ensure that your LD_LIBRARY_PATH is correctly set to pick up
>the OMPI libs you installed - most Linux distros come with an older version,
>and that can cause problems if you inadvertently pick them up.
>> On Jan 19, 2014, at 5:51 AM, Fischer, Greg A. <fischega_at_[hidden]>
>> I have a simple, 1-process test case that gets stuck on the mpi_finalize call.
>The test case is a dead-simple calculation of pi - 50 lines of Fortran. The
>process gradually consumes more and more memory until the system
>becomes unresponsive and needs to be rebooted, unless the job is killed
>> In the output, attached, I see the warning message about OpenFabrics
>being configured to only allow registering part of physical memory. I've tried
>to chase this down with my administrator to no avail yet. (I am aware of the
>relevant FAQ entry.) A different installation of MPI on the same system,
>made with a different compiler, does not produce the OpenFabrics memory
>registration warning - which seems strange because I thought it was a system
>configuration issue independent of MPI. Also curious in the output is that LSF
>seems to think there are 7 processes and 11 threads associated with this job.
>> The particulars of my configuration are attached and detailed below. Does
>anyone see anything potentially problematic?
>> OpenMPI Version: 1.6.5
>> Compiler: GCC 4.6.1
>> OS: SuSE Linux Enterprise Server 10, Patchlevel 2
>> uname -a : Linux lxlogin2 2.6.16.60-0.21-smp #1 SMP Tue May 6 12:41:02
>> UTC 2008 x86_64 x86_64 x86_64 GNU/Linux
>> Execution command: (executed via LSF - effectively "mpirun -np 1
>> users mailing list
- application/octet-stream attachment: hello.out
- application/octet-stream attachment: ring.out