Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OMPI error in MPI_Cart_create (in code that works with MPICH2)
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-09-02 00:01:03


*Something* in your code is calling fork() -- it may be an indirect
call such as system() or popen() or somesuch. This particular error
message is only printed during a "fork pre-hook" that Open MPI
installs during MPI_INIT (registered via pthread_atfork()).
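
To illustrate the mechanism, here is a minimal sketch in C of a
pthread_atfork() pre-hook. It is not Open MPI's actual code: the names
warn_before_fork, install_fork_hook, fork_once_and_count, and
prefork_calls are made up for this example, and the message text is
only a placeholder. It calls fork() directly, which is the case where
the "prepare" handler is guaranteed to run.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical counter: how many times the pre-fork hook has run. */
static int prefork_calls = 0;

/* "Prepare" handler: runs in the parent just before fork() creates
 * the child. A library could print its warning from here. */
static void warn_before_fork(void)
{
    prefork_calls++;
    fprintf(stderr, "warning: fork() called by PID %ld\n", (long)getpid());
}

/* Register the hook (assumed to be called once, e.g. at init time).
 * Returns 0 on success, like pthread_atfork() itself. */
int install_fork_hook(void)
{
    return pthread_atfork(warn_before_fork, NULL, NULL);
}

/* Fork a child that exits immediately, reap it, and return how many
 * times the hook has fired so far. Returns -1 if fork() fails. */
int fork_once_and_count(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);              /* child: nothing to do */
    waitpid(pid, NULL, 0);     /* parent: reap the child */
    return prefork_calls;
}
```

Calling install_fork_hook() once and then fork_once_and_count() should
print one warning per fork. The point is that the hook fires no matter
which piece of code ends up calling fork(), which is why the warning
can show up even when your own source never mentions fork().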

Grep through your code for calls to system and popen -- see if any of
these are used.

There is no functional difference between "include 'mpif.h'" and "use
mpi" in terms of MPI functionality at run time -- the only difference
you get is a "better" level of compile-time protection from the
Fortran compiler. Specifically, "use mpi" will introduce strict type
checking for many (but not all) of the MPI functions at compile time.
Hence, the compiler will complain if you forget an IERR parameter to
an MPI function, for example.

"use mpi" is not perfect, though -- there are many well-documented
problems because of the design of the MPI-2 Fortran 90 interface
(which are currently being addressed in MPI-3, if you care :-) ).
More generally: "use mpi" will catch *many* compile errors, but not
*all* of them.

But to answer your question succinctly: this problem won't be affected
by using "use mpi" or "include 'mpif.h'".

On Sep 1, 2009, at 9:02 PM, Greg Fischer wrote:

> I'm receiving the error posted at the bottom of this message with a
> code compiled with Intel Fortran/C Version 11.1 against OpenMPI
> version 1.3.2.
>
> The same code works correctly when compiled against MPICH2. (We
> have re-compiled with OpenMPI to take advantage of newly-installed
> Infiniband hardware. The "ring" test problem appears to work
> correctly over Infiniband.)
>
> There are no "fork()" calls in our code, so I can only guess that
> something weird is going on with MPI_COMM_WORLD. The code in
> question is a Fortran 90 code. Right now, it is being compiled with
> "include 'mpif.h'" statements at the beginning of each MPI
> subroutine, instead of making use of the "mpi" modules. Could this
> be causing the problem? How else should I go about diagnosing the
> problem?
>
> Thanks,
> Greg
>
> --------------------------------------------------------------------------
> An MPI process has executed an operation involving a call to the
> "fork()" system call to create a child process. Open MPI is currently
> operating in a condition that could result in memory corruption or
> other system errors; your MPI job may hang, crash, or produce silent
> data corruption. The use of fork() (or system() or other calls that
> create child processes) is strongly discouraged.
>
> The process that invoked fork was:
>
> Local host: bl316 (PID 26806)
> MPI_COMM_WORLD rank: 0
>
> If you are *absolutely sure* that your application will successfully
> and correctly survive a call to fork(), you may disable this warning
> by setting the mpi_warn_on_fork MCA parameter to 0.
> --------------------------------------------------------------------------
> [bl205:5014] *** An error occurred in MPI_Cart_create
> [bl205:5014] *** on communicator MPI_COMM_WORLD
> [bl205:5014] *** MPI_ERR_ARG: invalid argument of some other kind
> [bl205:5014] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 4 with PID 5010 on
> node bl205 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
> [bl205:05008] 7 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
> [bl205:05008] Set MCA parameter "orte_base_help_aggregate" to 0 to
> see all help / error messages

-- 
Jeff Squyres
jsquyres_at_[hidden]