Open MPI User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-10-31 12:27:48


I think you should use the MPI_PROC_NULL constant itself, not a
hard-coded value of -1.

Specifically: the MPI standard does not fix the value of MPI_PROC_NULL,
so implementations are free to choose whatever value they want. In
Open MPI, MPI_PROC_NULL is -2, so -1 is an invalid destination rank,
which is why you get the MPI_ERR_RANK error that you described.
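
A minimal sketch of the fix, written in Python only for brevity; the
same mapping applies in the Fortran model code, where MPI_PROC_NULL
comes from the MPI headers. It assumes the neighbor file keeps using -1
as its "no neighbor" marker, and that each entry is translated right
after it is read. The names FILE_NO_NEIGHBOR, to_mpi_rank and
translate_neighbors are hypothetical, not part of any MPI library, and
the -2 in the demo is just the Open MPI value mentioned above; real code
should pass the implementation's own constant rather than a literal.

```python
# Sentinel used by the neighbor file for "no neighbor" (assumption: the
# file format keeps the old MPICH-style -1 regardless of the MPI in use).
FILE_NO_NEIGHBOR = -1


def to_mpi_rank(file_value, proc_null, sentinel=FILE_NO_NEIGHBOR):
    """Map one neighbor entry from the file to a rank safe to pass to MPI.

    proc_null must be the MPI implementation's own MPI_PROC_NULL value,
    never a hard-coded number.
    """
    return proc_null if file_value == sentinel else file_value


def translate_neighbors(neighbors, proc_null):
    """Translate a whole list of neighbor entries read from the file."""
    return [to_mpi_rank(n, proc_null) for n in neighbors]


if __name__ == "__main__":
    # Demo only: -2 stands in for Open MPI's MPI_PROC_NULL.
    print(translate_neighbors([3, -1, 7, -1], proc_null=-2))
    # prints [3, -2, 7, -2]
```

With mpi4py, for example, the value to pass would be MPI.PROC_NULL. Sends
and receives addressed to MPI_PROC_NULL complete immediately and transfer
no data, so the communication code itself needs no special case for
missing neighbors.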

On Oct 31, 2007, at 9:00 AM, Karsten Bolding wrote:

> Hello
>
> I've just introduced the possibility to use OpenMPI instead of MPICH in
> an ocean model. The code is quite well tested and has been run in
> various parallel setups by various groups.
>
> I've compiled the program using mpif90 (instead of ifort). When I run it
> I get the error shown at the end of this mail.
>
> As you can see all 13 jobs are started - but then ...
>
> One load-balancing problem with ocean models that use domain
> decomposition is that the computational burden of the equal-sized
> domains is not the same (the different domains have different
> land-fractions). To overcome this, a Matlab tool has been developed that
> allows assigning more sub-domains to one processor/core based on the
> sum of water-points in the sub-domains. Attached is a figure showing the
> actual setup in this case. The neighbor relation is read from a file
> produced by said Matlab tool. Non-existing neighbors are set to -1,
> which is MPI_PROC_NULL in MPICH.
>
> The setup is run on a quad-core machine for testing purposes only.
>
> Any ideas what goes wrong?
>
>
> ==== error ======
> kb_at_gate:~/DK/setups/north_sea_fine$ mpirun -np 13 bin/getm_prod_IFORT.96x96
> Process 0 of 13 is alive on gate
> [gate:18564] *** An error occurred in MPI_Isend
> [gate:18564] *** on communicator MPI_COMM_WORLD
> [gate:18564] *** MPI_ERR_RANK: invalid rank
> [gate:18564] *** MPI_ERRORS_ARE_FATAL (goodbye)
> Process 1 of 13 is alive on gate
> [gate:18565] *** An error occurred in MPI_Isend
> [gate:18565] *** on communicator MPI_COMM_WORLD
> [gate:18565] *** MPI_ERR_RANK: invalid rank
> [gate:18565] *** MPI_ERRORS_ARE_FATAL (goodbye)
> Process 2 of 13 is alive on gate
> Process 3 of 13 is alive on gate
> [gate:18567] *** An error occurred in MPI_Isend
> [gate:18567] *** on communicator MPI_COMM_WORLD
> [gate:18567] *** MPI_ERR_RANK: invalid rank
> [gate:18567] *** MPI_ERRORS_ARE_FATAL (goodbye)
> Process 4 of 13 is alive on gate
> [gate:18568] *** An error occurred in MPI_Isend
> [gate:18568] *** on communicator MPI_COMM_WORLD
> [gate:18568] *** MPI_ERR_RANK: invalid rank
> [gate:18568] *** MPI_ERRORS_ARE_FATAL (goodbye)
> Process 5 of 13 is alive on gate
> [gate:18569] *** An error occurred in MPI_Isend
> [gate:18569] *** on communicator MPI_COMM_WORLD
> [gate:18569] *** MPI_ERR_RANK: invalid rank
> [gate:18569] *** MPI_ERRORS_ARE_FATAL (goodbye)
> Process 7 of 13 is alive on gate
> [gate:18571] *** An error occurred in MPI_Isend
> [gate:18571] *** on communicator MPI_COMM_WORLD
> [gate:18571] *** MPI_ERR_RANK: invalid rank
> [gate:18571] *** MPI_ERRORS_ARE_FATAL (goodbye)
> Process 8 of 13 is alive on gate
> Process 9 of 13 is alive on gate
> [gate:18573] *** An error occurred in MPI_Isend
> [gate:18573] *** on communicator MPI_COMM_WORLD
> [gate:18573] *** MPI_ERR_RANK: invalid rank
> [gate:18573] *** MPI_ERRORS_ARE_FATAL (goodbye)
> Process 10 of 13 is alive on gate
> [gate:18574] *** An error occurred in MPI_Isend
> [gate:18574] *** on communicator MPI_COMM_WORLD
> [gate:18574] *** MPI_ERR_RANK: invalid rank
> [gate:18574] *** MPI_ERRORS_ARE_FATAL (goodbye)
> Process 11 of 13 is alive on gate
> Process 12 of 13 is alive on gate
> [gate:18576] *** An error occurred in MPI_Isend
> [gate:18576] *** on communicator MPI_COMM_WORLD
> [gate:18576] *** MPI_ERR_RANK: invalid rank
> [gate:18576] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [gate:18566] *** An error occurred in MPI_Isend
> [gate:18566] *** on communicator MPI_COMM_WORLD
> [gate:18566] *** MPI_ERR_RANK: invalid rank
> [gate:18566] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [gate:18572] *** An error occurred in MPI_Isend
> [gate:18572] *** on communicator MPI_COMM_WORLD
> [gate:18572] *** MPI_ERR_RANK: invalid rank
> [gate:18572] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [gate:18575] *** An error occurred in MPI_Isend
> [gate:18575] *** on communicator MPI_COMM_WORLD
> [gate:18575] *** MPI_ERR_RANK: invalid rank
> [gate:18575] *** MPI_ERRORS_ARE_FATAL (goodbye)
> Process 6 of 13 is alive on gate
> [gate:18570] *** An error occurred in MPI_Isend
> [gate:18570] *** on communicator MPI_COMM_WORLD
> [gate:18570] *** MPI_ERR_RANK: invalid rank
> [gate:18570] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [gate:18561] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
> [gate:18561] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1166
>
>
>
> --
> ----------------------------------------------------------------------
> Karsten Bolding Bolding & Burchard Hydrodynamics
> Strandgyden 25 Phone: +45 64422058
> DK-5466 Asperup Fax: +45 64422068
> Denmark Email: karsten_at_[hidden]
>
> http://www.findvej.dk/Strandgyden25,5466,11,3
> ----------------------------------------------------------------------
> <mask.fine.size0096x0096_offset-0078x-0022_nodes004.distribution_on_nodes.png>
> <mime-attachment.txt>

-- 
Jeff Squyres
Cisco Systems