Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Windows: MPI_Allreduce() crashes when using MPI_DOUBLE_PRECISION
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-05-10 09:54:31


On May 10, 2011, at 2:30 AM, hi wrote:

>> You didn't answer my prior questions. :-)
> I am observing this crash using MPI_ALLREDUCE() in a test program
> which does not have any memory corruption issue. ;)

Can you send the info listed on the help page?

>> I ran your test program with -np 2 and -np 4 and it seemed to work ok.
> Can you please let me know what environment (including OS and
> compilers) you are using?

RHEL 5.4, gcc 4.5.

This could be a Windows-specific thing, but I find that unlikely (then again, I don't know much about Windows...).

> I am able to reproduce the crash using the attached simplified test
> program with a 5-element array.
> Please note that I am running these experiments on Windows 7 using the
> msys/mingw console; see the attached makefile for more information.
>
> When running this program as "C:\>mpirun mar_f_dp2.exe" it works fine,
> but when running it as "C:\>mpirun -np 2 mar_f_dp2.exe" it generates the
> following error on the console...
>
> C:\>mpirun -np 2 mar_f_dp2.exe
> 0
> 0
> 0
> size= 2 , rank= 0
> start --
> 0
> 0
> 0
> size= 2 , rank= 1
> start --
> forrtl: severe (157): Program Exception - access violation
> Image PC Routine Line Source
> [vibgyor:09168] [[28311,0],0]-[[28311,1],0] mca_oob_tcp_msg_recv:
> readv failed: Unknown error (108)

You forgot ierr in the call to MPI_Finalize. You also paired DOUBLE PRECISION data with MPI_INTEGER in the call to allreduce. And you swapped sndbuf and rcvbuf in the call to allreduce, meaning that when you print rcvbuf afterwards, it'll always still be 0.
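
To illustrate, the offending calls presumably looked something like this (a reconstruction from the symptoms above, not your actual attachment, so the details are guesses):

            ! Buggy (reconstructed): the datatype says MPI_INTEGER but
            ! the buffers are DOUBLE PRECISION, the send/recv buffers
            ! are swapped, and ierr is missing from MPI_Finalize.
            CALL MPI_ALLREDUCE(rcvbuf, sndbuf, n,
     & MPI_INTEGER, MPI_SUM, MPI_COMM_WORLD, ierr)
            CALL MPI_Finalize()

            ! Fixed: the datatype matches the buffers, the send buffer
            ! comes first, and ierr is supplied.
            CALL MPI_ALLREDUCE(sndbuf, rcvbuf, n,
     & MPI_DOUBLE_PRECISION, MPI_SUM, MPI_COMM_WORLD, ierr)
            CALL MPI_Finalize(ierr)

Note that calling MPI_Finalize without ierr can itself cause exactly this kind of access violation, because the Fortran bindings expect that extra output argument.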

I pared your sample program down to the following:

        program Test_MPI
            use mpi
            implicit none

            DOUBLE PRECISION rcvbuf(5), sndbuf(5)
            INTEGER nproc, rank, ierr, n, i, ret

            n = 5
            do i = 1, n
                sndbuf(i) = 2.0
                rcvbuf(i) = 0.0
            end do

            call MPI_INIT(ierr)
            call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
            call MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
            write(*,*) "size=", nproc, ", rank=", rank
            write(*,*) "start --, rcvbuf=", rcvbuf
            CALL MPI_ALLREDUCE(sndbuf, rcvbuf, n,
     & MPI_DOUBLE_PRECISION, MPI_SUM, MPI_COMM_WORLD, ierr)
            write(*,*) "end --, rcvbuf=", rcvbuf

            CALL MPI_Finalize(ierr)
        end

(you could use "include 'mpif.h'", too -- I tried both)
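
For completeness, the only change for the mpif.h variant is at the top of the program; a minimal sketch:

        program Test_MPI
            implicit none
            ! Older-style Fortran interface: include the header instead
            ! of "use mpi"
            include 'mpif.h'
            ! ... the rest is identical to the program above ...
        end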

This program works fine for me.
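
For reference, a typical build-and-run with the Open MPI wrapper compilers looks like this (the file name here is made up):

        mpif90 test_mpi.f -o test_mpi
        mpirun -np 2 ./test_mpi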

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/