Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OMPI error in MPI_Cart_create (in code that works with MPICH2)
From: Greg Fischer (greg.a.fischer_at_[hidden])
Date: 2009-09-02 10:27:11


Thanks, Jeff.

OK, I've found the offending code and gotten rid of the fork() warning. I'm
still left with this:

[bl302:26556] *** An error occurred in MPI_Cart_create
[bl302:26556] *** on communicator MPI_COMM_WORLD
[bl302:26556] *** MPI_ERR_ARG: invalid argument of some other kind
[bl302:26556] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--------------------------------------------------------------------------
mpirun has exited due to process rank 4 with PID 13693 on
node bl316 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[bl316:13691] 7 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[bl316:13691] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

I'm going to try recompiling Open MPI itself with the Intel compilers.
Any other ideas?
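
One argument constraint that the MPI standard does spell out for MPI_Cart_create is that the requested grid must not be larger than the communicator's group. Nothing in this thread establishes that this is what triggered MPI_ERR_ARG here, so the following is only a minimal pre-flight sketch. The helper name `grid_fits` and the sample sizes are hypothetical, and the check is written in plain Python rather than against any MPI binding.

```python
from math import prod

def grid_fits(comm_size, dims):
    """Return True if a Cartesian grid with extents `dims` needs no more
    processes than the communicator provides (hypothetical helper)."""
    return prod(dims) <= comm_size

# Illustrative values only; they are not taken from the failing job.
print(grid_fits(8, [2, 2, 2]))  # 2*2*2 = 8 processes on 8 ranks
print(grid_fits(8, [3, 3]))     # 3*3 = 9 processes on 8 ranks
```

Logging the dims, periods and reorder values on every rank just before the call is a similarly cheap way to see what the library actually receives.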

On Wed, Sep 2, 2009 at 12:01 AM, Jeff Squyres <jsquyres_at_[hidden]> wrote:

> *Something* in your code is calling fork() -- it may be an indirect call
> such as system() or popen() or somesuch. This particular error message is
> only printed during a "fork pre-hook" that Open MPI installs during MPI_INIT
> (registered via pthread_atfork()).
>
> Grep through your code for calls to system and popen -- see if any of these
> are used.
>
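
The "fork pre-hook" described above can be illustrated with a small sketch. Open MPI registers its hook in C through pthread_atfork(); the example below instead uses Python's os.register_at_fork, which is an analogous facility and not the mechanism Open MPI uses. The names `warn_before_fork`, `spawn_child` and `fork_events`, and the warning text, are made up for illustration.

```python
import os

# Record of which process triggered the hook (hypothetical bookkeeping).
fork_events = []

def warn_before_fork():
    """Called in the parent immediately before each os.fork()."""
    fork_events.append(os.getpid())
    print(f"warning: process {os.getpid()} is about to fork", flush=True)

# Register the hook once, as a library might do during its own start-up.
os.register_at_fork(before=warn_before_fork)

def spawn_child():
    """Fork a child that exits at once and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit without running the parent's cleanup
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

exit_code = spawn_child()
print("child exit code:", exit_code)
```

The point of the pattern is that the code calling fork does not need to know about the hook: whichever library registered it gets a chance to warn before the child is created, which is why the message can appear even when the application source has no explicit fork() call.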
> There is no functional difference between "include 'mpif.h'" and "use mpi"
> in terms of MPI functionality at run time -- the only difference you get is
> a "better" level of compile-time protection from the Fortran compiler.
> Specifically, "use mpi" will introduce strict type checking for many (but
> not all) of the MPI functions at compile time. Hence, the compiler will
> complain if you forget an IERR parameter to an MPI function, for example.
>
> "use mpi" is not perfect, though -- there are many well-documented problems
> because of the design of the MPI-2 Fortran 90 interface (which are currently
> being addressed in MPI-3, if you care :-) ). More generally: "use mpi" will
> catch *many* compile errors, but not *all* of them.
>
> But to answer your question succinctly: this problem won't be affected by
> using "use mpi" or "include 'mpif.h'".
>
>
>
>
> On Sep 1, 2009, at 9:02 PM, Greg Fischer wrote:
>
>> I'm receiving the error posted at the bottom of this message with a code
>> compiled with Intel Fortran/C Version 11.1 against OpenMPI version 1.3.2.
>>
>> The same code works correctly when compiled against MPICH2. (We have
>> re-compiled with OpenMPI to take advantage of newly-installed Infiniband
>> hardware. The "ring" test problem appears to work correctly over
>> Infiniband.)
>>
>> There are no "fork()" calls in our code, so I can only guess that
>> something weird is going on with MPI_COMM_WORLD. The code in question is a
>> Fortran 90 code. Right now, it is being compiled with "include 'mpif.h'"
>> statements at the beginning of each MPI subroutine, instead of making use
>> of the "mpi" modules. Could this be causing the problem? How else should I
>> go about diagnosing the problem?
>>
>> Thanks,
>> Greg
>>
>> --------------------------------------------------------------------------
>> An MPI process has executed an operation involving a call to the
>> "fork()" system call to create a child process. Open MPI is currently
>> operating in a condition that could result in memory corruption or
>> other system errors; your MPI job may hang, crash, or produce silent
>> data corruption. The use of fork() (or system() or other calls that
>> create child processes) is strongly discouraged.
>>
>> The process that invoked fork was:
>>
>> Local host: bl316 (PID 26806)
>> MPI_COMM_WORLD rank: 0
>>
>> If you are *absolutely sure* that your application will successfully
>> and correctly survive a call to fork(), you may disable this warning
>> by setting the mpi_warn_on_fork MCA parameter to 0.
>> --------------------------------------------------------------------------
>> [bl205:5014] *** An error occurred in MPI_Cart_create
>> [bl205:5014] *** on communicator MPI_COMM_WORLD
>> [bl205:5014] *** MPI_ERR_ARG: invalid argument of some other kind
>> [bl205:5014] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>>
>> --------------------------------------------------------------------------
>> mpirun has exited due to process rank 4 with PID 5010 on
>> node bl205 exiting without calling "finalize". This may
>> have caused other processes in the application to be
>> terminated by signals sent by mpirun (as reported here).
>> --------------------------------------------------------------------------
>> [bl205:05008] 7 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
>> [bl205:05008] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]