Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with MPI_BARRIER
From: Eugene Loh (eugene.loh_at_[hidden])
Date: 2011-09-09 12:00:33


On 9/8/2011 11:47 AM, Ghislain Lartigue wrote:
> I guess you're perfectly right!
> I will try to test it tomorrow by putting a call system("wait(X)") before the barrier!
What does "wait(X)" mean?

Anyhow, here is how I see your computation:

A) The first barrier simply synchronizes the processes.
B) Then you start a bunch of non-blocking, point-to-point messages.
C) Then another barrier.
D) Finally, the point-to-point messages are completed.

Your mental model might be that A, B, and C should be fast and that D
should take a long time. The reality may be that the completion of all
those messages is actually taking place during C.

How about the following?

Barrier
t0 = MPI_Wtime()
start all non-blocking messages
t1 = MPI_Wtime()
Barrier
t2 = MPI_Wtime()
complete all messages
t3 = MPI_Wtime()
Barrier
t4 = MPI_Wtime()

Then, look at the data from all the processes graphically. Compare the
picture to the same experiment, but with the middle Barrier removed.
Presumably, the full iteration will take roughly as long in both cases.
The difference, I would expect, is that with the middle barrier present,
the barrier absorbs most of the time (t2-t1 is large) and the message
completion is fast (t3-t2 is small). Without the middle barrier, the
message completion is slow. So, message completion takes a long time
either way, and the only difference is whether it takes place during
your MPI_Test loop or during what you thought was only a barrier.
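
If you do dump the timings yourself, something like the following could reduce them to per-phase durations before you plot anything. This is only a minimal sketch in Python: the function names, the phase labels, and the input layout (one list of t0..t4 per rank) are my own assumptions, and the sample numbers are made up for illustration, not measurements.

```python
# Hypothetical helper; names and sample numbers are illustrative only.
PHASES = ("post", "mid_barrier", "complete", "end_barrier")


def phase_durations(stamps):
    """Turn one rank's [t0, t1, t2, t3, t4] into per-phase durations."""
    if len(stamps) != 5:
        raise ValueError("expected five timestamps t0..t4")
    return {name: stamps[i + 1] - stamps[i] for i, name in enumerate(PHASES)}


def summarize(per_rank):
    """Maximum over ranks of each phase duration, plus the total iteration time."""
    durations = [phase_durations(s) for s in per_rank.values()]
    summary = {name: max(d[name] for d in durations) for name in PHASES}
    summary["total"] = max(s[4] - s[0] for s in per_rank.values())
    return summary


if __name__ == "__main__":
    # Made-up timestamps (seconds) for two ranks, NOT real measurements.
    sample = {
        0: [0.0, 0.25, 2.25, 2.5, 2.75],
        1: [0.0, 0.5, 2.25, 2.5, 2.75],
    }
    for name, value in summarize(sample).items():
        print(f"{name:12s} {value:.3f} s")
```

Run it once on the timings with the middle barrier and once without. If the picture I describe is right, "total" should stay about the same while the time moves between "mid_barrier" and "complete".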

A simple way of doing all this is to run with a time-line profiler...
some MPI performance analysis tool. You won't have to instrument the
code, dump timings, or figure out graphics. Just look at pretty
pictures! There is some description of tool candidates in the OMPI FAQ
at http://www.open-mpi.org/faq/?category=perftools
> PS:
> if anyone has more information about the implementation of the MPI_IRECV() procedure, I would be glad to learn more about it!
I don't know how much detail you want here, but I suspect not much
detail is warranted. There is a lot of complexity here, but I think a
few key ideas will help.

First, I'm pretty sure you're sending "long" messages. OMPI usually
sends such messages by queueing up a request. These requests can, in
the general case, be "progressed" whenever an MPI call is made. So,
whenever you make an MPI call, get away from the thought that you're
doing one specific thing, as specified by the call and its arguments.
Think instead that you will also be looking around to see whatever other
MPI work can be progressed.
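
To make that idea concrete, here is a toy model in Python. It is not how OMPI is actually implemented. The class, its methods, and the "work units" are all hypothetical, waitall stands in for your MPI_Test loop, and the model assumes that any blocking call lasts long enough to finish all queued message work. It only shows where the message work gets accounted.

```python
class ToyProgress:
    """Toy model: every blocking 'MPI call' also progresses queued requests.

    Not Open MPI's real design; it only illustrates the accounting.
    """

    def __init__(self):
        self.pending = []  # remaining work units per outstanding request
        self.work = {}     # call name -> message work done inside that call

    def _progress_all(self, call):
        # Assumption: a blocking call lasts long enough to finish all pending work.
        self.work[call] = self.work.get(call, 0) + sum(self.pending)
        self.pending = []

    def isend(self, units):
        # Non-blocking start: the request is only queued here.
        self.pending.append(units)

    def barrier(self):
        self._progress_all("barrier")

    def waitall(self):
        self._progress_all("waitall")


def run(middle_barrier):
    """Replay the A-D sequence, optionally with the middle barrier."""
    engine = ToyProgress()
    engine.barrier()
    for _ in range(4):
        engine.isend(10)
    if middle_barrier:
        engine.barrier()
    engine.waitall()
    engine.barrier()
    return engine.work


if __name__ == "__main__":
    print("with middle barrier:   ", run(True))
    print("without middle barrier:", run(False))
```

In this model the total message work is the same in both runs. With the middle barrier it is all charged to "barrier", and without it, it is all charged to "waitall".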