Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] mpi_barrier
From: Huangwei (hz283_at_[hidden])
Date: 2013-09-29 07:11:51


Dear George,

Please see below.

On 29 September 2013 01:03, George Bosilca <bosilca_at_[hidden]> wrote:

>
> On Sep 29, 2013, at 01:19 , Huangwei <hz283_at_[hidden]> wrote:
>
> Dear All,
>
> In my code I use mpi_send/mpi_recv to transfer a three-dimensional real
> array, and the process is as follows:
>
> all other processors send their arrays to rank 0, and rank 0 receives them
> and assembles them into a complete array. Then mpi_bcast is called
> to send the complete array from rank 0 to all others.
>
>
> This pattern of communication reminds me of an MPI_Allgather (or the more
> flexible version MPI_Allgatherv).
>
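To make the comparison concrete, here is a toy model in plain Python (not MPI code) of why the gather-to-root-then-broadcast pattern leaves every rank with the same data as an allgather with per-rank counts and displacements. The function names and the sample chunks are made up for illustration; only the final data layout is modelled, not the communication cost.

```python
# Toy model, no MPI involved: each "rank" owns one chunk of the full array.

def gather_then_bcast(chunks):
    """Mimic send/recv to rank 0 followed by a broadcast."""
    complete = []
    for rank_chunk in chunks:      # rank 0 appends the chunks in rank order
        complete.extend(rank_chunk)
    # broadcast: every rank ends up with its own copy of the complete array
    return [list(complete) for _ in chunks]


def allgatherv(chunks):
    """Mimic the result of an allgather with variable counts and displacements."""
    counts = [len(c) for c in chunks]
    displs = [sum(counts[:i]) for i in range(len(counts))]
    total = sum(counts)
    result = []
    for _ in chunks:               # one receive buffer per rank
        buf = [None] * total
        for chunk, d in zip(chunks, displs):
            buf[d:d + len(chunk)] = chunk
        result.append(buf)
    return result


# Hypothetical chunks of unequal length, one per rank.
chunks = [[0.0, 0.1], [1.0], [2.0, 2.1, 2.2]]
print(gather_then_bcast(chunks) == allgatherv(chunks))
```

Both functions leave each rank holding the chunks concatenated in rank order, which is why the two patterns are interchangeable as far as the result is concerned.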
I tried MPI_Allgatherv in my case and found that it is a little slower
than the mpi_send and mpi_recv pairs. The array that needs to be transferred
is not small. In your experience, which option is generally more
efficient (i.e. needs less wall time) for transferring large data?
Thanks.

> This is very basic usage of mpi_send and mpi_recv. In my Fortran code
> I found that if I add call mpi_barrier(...) before the mpi_send and
> mpi_recv statements, the wall time for this sending and receiving (2s)
> is much lower than when mpi_barrier is not called (60s). I used
> mpi_wtime to measure the time.
>
>
> In a parallel application each process is out of sync with the others. I
> have no idea how you measure your time in the original version, but I guess
> that in the MPI_Barrier case you start your timer after the barrier. As the
> barrier synchronizes all processes, you only measure the real time to
> exchange the data, which might seem shorter.
>
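The effect of timer placement can be sketched with a toy model in plain Python (not MPI code). It assumes, as a simplification, that the exchange finishes for everyone a fixed `transfer` time after the last rank arrives. The `ready` values are invented for illustration and are not measurements from the code discussed here.

```python
def measured_times(ready, transfer, use_barrier):
    """Return the elapsed time each rank would report around the exchange.

    ready[i]    : moment rank i reaches the communication phase (seconds)
    transfer    : assumed duration of the exchange once every rank takes part
    use_barrier : if True, each timer starts after a barrier, i.e. when the
                  last rank has arrived
    """
    last = max(ready)
    finish = last + transfer           # the exchange cannot end before the last rank joins
    if use_barrier:
        starts = [last] * len(ready)   # the barrier releases all ranks together
    else:
        starts = list(ready)           # each rank starts timing as soon as it arrives
    return [finish - s for s in starts]


ready = [0.0, 3.0, 58.0]               # hypothetical arrival times
print(measured_times(ready, 2.0, use_barrier=False))  # [60.0, 57.0, 2.0]
print(measured_times(ready, 2.0, use_barrier=True))   # [2.0, 2.0, 2.0]
```

In this model the exchange itself always costs 2 s. Without the barrier, the early ranks also count the time they spend waiting for the slowest rank; with the barrier, that waiting happens before the timer starts.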
> I think mpi_send and mpi_recv are blocking subroutines and thus no
> additional mpi_barrier is needed. Can anybody tell me what the reason
> for this phenomenon is? Thank you very much.
>
>
> Yes, these operations are indeed blocking, which is why you see the
> slowdown. If a single process is late to send its contribution, the
> entire operation is penalized (as the root, a.k.a. process zero, is
> waiting for contributions in order). So you should either try to use the
> collective pattern I highlighted before, switch to using non-blocking
> point-to-point instead of blocking, or look into the potential benefit of
> using a non-blocking collective.
>
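The penalty of receiving in strict rank order can be illustrated with another toy model in plain Python (not MPI code). It assumes the root handles one message at a time and that each receive takes a fixed `transfer` time. The arrival times are invented. The "arrival order" case only approximates what non-blocking receives completed as they arrive would allow.

```python
def root_finish_time(arrival, transfer, in_rank_order):
    """Time at which the root has received every contribution.

    arrival[i]    : moment sender i's message becomes available (seconds)
    transfer      : assumed time for the root to receive one message
    in_rank_order : True  -> receive sender 0, then 1, then 2, ... (blocking, ordered)
                    False -> receive whichever message is available first
    """
    senders = range(len(arrival))
    order = senders if in_rank_order else sorted(senders, key=lambda i: arrival[i])
    clock = 0.0
    for i in order:
        # the root waits for the message if it is not there yet, then receives it
        clock = max(clock, arrival[i]) + transfer
    return clock


arrival = [5.0, 0.0, 0.0, 0.0]   # hypothetical: the first sender is 5 s late
print(root_finish_time(arrival, 1.0, in_rank_order=True))   # 9.0
print(root_finish_time(arrival, 1.0, in_rank_order=False))  # 6.0
```

With ordered blocking receives, the three messages that were ready at time 0 sit idle until the late one has been handled. Taking them as they arrive overlaps their transfer with the wait for the late sender.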
> George.
>
>
> best regards,
> Huangwei
>
>
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users