Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OMPI not calling finalize error
From: David Zhang (solarbikedz_at_[hidden])
Date: 2011-04-02 11:19:06


From the error message, there is a segfault in the program, which crashes
one of the processes. MPI notices that one of the processes has died and
terminates the other processes as well. Because these processes were not
terminated by calling MPI_Finalize, you get the error message at the bottom.
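
As a rough illustration of how to read that output, here is a small Python
sketch that pulls the signal, the failing rank, and the named stack frames
out of a few lines copied from the log. The summarize() helper and its
regular expressions are my own and hypothetical, not part of Open MPI, and
the excerpt joins the wrapped frame lines. On this excerpt the named frames
include MPI_Init, which suggests the segfault happened during MPI startup
rather than near the end of the run.

```python
import re

# Excerpt of the posted log (quote markers stripped, wrapped lines joined).
LOG = """\
[n333:129522] Signal: Segmentation fault (11)
[n333:129522] Failing at address: 0x40
[n333:129522] [ 2] /opt/openmpi-1.3.4-gnu/lib/libopen-pal.so.0(opal_progress+0x75) [0x52e5165]
[n333:129522] [ 5] /opt/openmpi-1.3.4-gnu/lib/libmpi.so.0(MPI_Init+0x120)
[n333:129522] [ 6] /lustre/jxding/netplan49/nsga2b [0x4497f6]
mpirun has exited due to process rank 24 with PID 129522 on
"""


def summarize(log):
    """Hypothetical helper: extract signal, rank, PID and named frames."""
    signal = re.search(r"Signal: (.+?) \((\d+)\)", log)
    rank = re.search(r"process rank (\d+) with PID (\d+)", log)
    # Only frames printed as "library(symbol+0xoffset)" carry a symbol name.
    symbols = re.findall(r"\[\s*\d+\]\s+\S+\((\w+)\+0x[0-9a-f]+\)", log)
    return {
        "signal": signal.group(1) if signal else None,
        "signal_number": int(signal.group(2)) if signal else None,
        "rank": int(rank.group(1)) if rank else None,
        "pid": int(rank.group(2)) if rank else None,
        "symbols": symbols,
    }


if __name__ == "__main__":
    info = summarize(LOG)
    print(info)
    if "MPI_Init" in info["symbols"]:
        print("The crash appears to be inside MPI_Init.")
```

Frames without a symbol name (such as the one in nsga2b itself) are skipped,
so this only narrows down where to look; a debugger or a core file would be
needed to find the actual cause.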

On Sat, Apr 2, 2011 at 8:05 AM, Jack Bryan <dtustudy68_at_[hidden]> wrote:

> Hi,
>
> When I run a parallel program, I get an error:
> ------------------------------------------------------------------
> [n333:129522] *** Process received signal ***
> [n333:129522] Signal: Segmentation fault (11)
> [n333:129522] Signal code: Address not mapped (1)
> [n333:129522] Failing at address: 0x40
> [n333:129522] [ 0] /lib64/libpthread.so.0 [0x3c50e0e4c0]
> [n333:129522] [ 1] /opt/openmpi-1.3.4-gnu/lib/libmpi.so.0 [0x4cd19b1]
> [n333:129522] [ 2]
> /opt/openmpi-1.3.4-gnu/lib/libopen-pal.so.0(opal_progress+0x75) [0x52e5165]
> [n333:129522] [ 3] /opt/openmpi-1.3.4-gnu/lib/libopen-rte.so.0 [0x508565c]
> [n333:129522] [ 4] /opt/openmpi-1.3.4-gnu/lib/libmpi.so.0 [0x4c653eb]
> [n333:129522] [ 5] /opt/openmpi-1.3.4-gnu/lib/libmpi.so.0(MPI_Init+0x120)
> [0x4c84b90]
> [n333:129522] [ 6] /lustre/jxding/netplan49/nsga2b [0x4497f6]
> [n333:129522] [ 7] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3c5021d974]
> [n333:129522] [ 8]
> /lustre/jxding/netplan49/nsga2b(__gxx_personality_v0+0x499) [0x4436e9]
> [n333:129522] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 24 with PID 129522 on
> node n333 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
>
> ---------------------------------------------------------------------------------------
> But the program ran for no more than a few minutes. It should
> take hours to finish.
>
> How can it reach "finalize" so fast?
>
> Any help is appreciated.
>
> Jack
>

-- 
David Zhang
University of California, San Diego