Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Bugs in MPI_Abort() -- MPI_Finalize()?
From: Yves Caniou (yves.caniou_at_[hidden])
Date: 2010-06-02 12:23:25


On Wednesday 02 June 2010 15:55:37, you wrote:
> On Jun 2, 2010, at 9:50 AM, Yves Caniou wrote:
> > I copy the output of my last mail at the end of this one, to avoid
> > searching. Here is the line that I used to configure OMPI:
> >
> > $>./configure --prefix=/home/p10015/openmpi --with-threads=posix
> > --enable-mpi-threads --enable-progress-threads
> > --enable-mpirun-prefix-by-default --enable-sparse-groups
>
> My bad -- I missed that.

In fact, it's my fault. This is the first time I gave that configure line, and I
forgot to copy/paste the output (now added in this mail).

> --enable-progress-threads is likely the culprit here. That option is VERY
> poorly tested and likely does not work. Can you try without that?

Now it seems to work pretty well with both versions (MPI_Finalize() and
MPI_Abort())! Many thanks!
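For anyone following the thread: the two "versions" being compared are a normal shutdown through MPI_Finalize() versus a forced teardown through MPI_Abort(). A minimal sketch of the two paths (hypothetical test program, not the magic-square code from the thread) might look like this:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Normal path: every process reaches MPI_Finalize() and the job
       exits cleanly. */
    if (argc < 2) {
        printf("rank %d finalizing normally\n", rank);
        MPI_Finalize();
        return 0;
    }

    /* Error path: one process calls MPI_Abort(), which does not return;
       the runtime then terminates the remaining ranks. */
    if (rank == 0)
        MPI_Abort(MPI_COMm_WORLD, 1);

    MPI_Finalize();
    return 0;
}
```

(Compile with mpicc and launch with mpiexec; the crash reported earlier in this thread occurred before either path was reached, during MPI_Init itself.)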

.Yves.

> ###########################################
> $>mpiexec -mca btl sm -n 4 magic-square -b 1 -a 5000 -r 200 -C 1000 -z 20
> 100
> --------------------------------------------------------------------------
> At least one pair of MPI processes are unable to reach each other for
> MPI communications. This means that no Open MPI device has indicated
> that it can be used to communicate between these processes. This is
> an error; Open MPI requires that all MPI processes be able to reach
> each other. This error can sometimes be the result of forgetting to
> specify the "self" BTL.
>
> Process 1 ([[34810,1],3]) is on host: ha8000-1
> Process 2 ([[34810,1],3]) is on host: ha8000-1
> BTLs attempted: sm
>
> Your MPI job is now going to abort; sorry.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> PML add procs failed
> --> Returned "Unreachable" (-12) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** The MPI_Init() function was called before MPI_INIT was invoked.
> *** This is disallowed by the MPI standard.
> *** Your MPI job will now abort.
> [ha8000-1:461] Abort before MPI_INIT completed successfully; not able to
> guarantee that all other processes were killed!
> *** The MPI_Init() function was called before MPI_INIT was invoked.
> *** This is disallowed by the MPI standard.
> *** The MPI_Init() function was called before MPI_INIT was invoked.
> *** This is disallowed by the MPI standard.
> *** Your MPI job will now abort.
> [ha8000-1:464] Abort before MPI_INIT completed successfully; not able to
> guarantee that all other processes were killed!
> *** Your MPI job will now abort.
> [ha8000-1:462] Abort before MPI_INIT completed successfully; not able to
> guarantee that all other processes were killed!
> *** The MPI_Init() function was called before MPI_INIT was invoked.
> *** This is disallowed by the MPI standard.
> *** Your MPI job will now abort.
> [ha8000-1:463] Abort before MPI_INIT completed successfully; not able to
> guarantee that all other processes were killed!
> --------------------------------------------------------------------------
> mpiexec has exited due to process rank 0 with PID 461 on
> node ha8000-1 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpiexec (as reported here).
> --------------------------------------------------------------------------
>
> Killed
> ################################################
>
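A side note on the log above: independent of the --enable-progress-threads build problem, the "forgetting to specify the 'self' BTL" warning points at the mpiexec command line itself. When the BTL list is restricted with -mca btl, Open MPI expects the self (loopback) component to be listed explicitly so each process can reach itself, e.g. (same command as above, with self added):

```shell
mpiexec -mca btl sm,self -n 4 magic-square -b 1 -a 5000 -r 200 -C 1000 -z 20 100
```

This only addresses the reachability warning; whether it alone would have avoided the MPI_INIT failure on a progress-threads build is not established in the thread.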

-- 
Yves Caniou
Associate Professor at Université Lyon 1,
Member of the team project INRIA GRAAL in the LIP ENS-Lyon,
Délégation CNRS in Japan French Laboratory of Informatics (JFLI),
  * in Information Technology Center, The University of Tokyo,
    2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-8658, Japan
    tel: +81-3-5841-0540
  * in National Institute of Informatics
    2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
    tel: +81-3-4212-2412 
http://graal.ens-lyon.fr/~ycaniou/