Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Test bug?
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-02-13 17:16:33


Sorry for the delay in replying; I was fully occupied by MPI Forum
activities over the past week or so.

It is quite possible that the multiple tests are needed because of OMPI's
lazy wireup scheme, in which a connection is not set up until two
processes first communicate. Making an OpenFabrics connection likely
requires multiple passes through OMPI's progression engine (there is some
back-and-forth information exchange to establish the OpenFabrics
connection before MPI traffic will flow).

If you do some warmup sends before your test, the connection should be
fully established and eager messages should then flow as you expect;
i.e., if you do a short send, an MPI_Test right after it should mark
its completion, etc.

But just to be clear -- the specific behavior here is very
MPI-implementation specific. You should not code your application to
rely on MPI_Test completing the first time for "short" messages,
because all kinds of things can change in an MPI implementation's
progression engine.

On Feb 5, 2009, at 2:37 AM, Gabriele Fatigati wrote:

> Dear Open MPI developers,
> I have found some very strange behaviour in MPI_Test. I'm using
> Open MPI 1.2 over an InfiniBand interconnect.
>
> I've tried to implement a network check with a series of MPI_Irecv and
> MPI_Send calls between processes, testing with MPI_Wait for the end of
> the Irecv. For some reason, I've noticed that when I launch the test on
> one node, it works well. If I launch 2 or more procs across different
> nodes, MPI_Test fails many times before reporting that the Irecv has
> finished.
>
> I've found that it still fails after one minute, even with a very small
> buffer (less than the eager limit). It's impossible that the
> communication is still pending after one minute with only 10 integers
> sent. To work around this, I need to loop over MPI_Test, and only after
> 3 or 4 MPI_Test calls does it report that the Irecv finished
> successfully. Is it possible that MPI_Test needs to be called many
> times even if the communication has already finished?
>
> My simple C test program is attached.
>
> Thanks in advance.
>
> --
> Ing. Gabriele Fatigati
>
> Parallel programmer
>
> CINECA Systems & Tecnologies Department
>
> Supercomputing Group
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.it Tel: +39 051 6171722
>
> g.fatigati [AT] cineca.it
> <mpi_test5.c>

-- 
Jeff Squyres
Cisco Systems