Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] iprobe and opal_progress
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-06-18 09:10:41


Perhaps we did that as a latency optimization...?

George / Brian / Galen -- do you guys know/remember why this was done?

On the surface, it looks like it would be ok to call progress and
check again to see if it found the match. Can anyone think of a
deeper reason not to?
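
To make the question concrete, here is a minimal toy model in C of the two behaviors being compared. It is only a sketch, not the actual ob1 code: every name in it (toy_pml, toy_progress, toy_match, toy_iprobe_current, toy_iprobe_recheck) is hypothetical, and the "transport" and "unexpected queue" are reduced to plain counters. toy_progress stands in for opal_progress under the assumption that progress is what moves an already-sent message into the unexpected queue.

```c
#include <stdbool.h>

/* Toy stand-in for the PML state; NOT the real ob1 data structures. */
struct toy_pml {
    int in_transport; /* messages sent but not yet pulled in by progress */
    int unexpected;   /* messages sitting in the unexpected queue */
};

/* Hypothetical stand-in for opal_progress(): drain the transport
 * into the unexpected queue. */
static void toy_progress(struct toy_pml *p)
{
    p->unexpected += p->in_transport;
    p->in_transport = 0;
}

/* Hypothetical stand-in for searching the unexpected queue. */
static bool toy_match(const struct toy_pml *p)
{
    return p->unexpected > 0;
}

/* Behavior as described in the report: check the queue, and if nothing
 * matches, call progress and return "no match" without looking again. */
static bool toy_iprobe_current(struct toy_pml *p)
{
    bool flag = toy_match(p);
    if (!flag) {
        toy_progress(p);
    }
    return flag;
}

/* Proposed behavior: after progress, check the queue one more time. */
static bool toy_iprobe_recheck(struct toy_pml *p)
{
    if (toy_match(p)) {
        return true;
    }
    toy_progress(p);
    return toy_match(p);
}
```

In this model, a message that is still in the transport is missed by the first call to toy_iprobe_current and found by the second, which matches the reported symptom; toy_iprobe_recheck finds it on the first call, at the cost of a second queue search on every miss.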

On Jun 17, 2008, at 11:43 AM, Terry Dontje wrote:

> I've run into an issue while running hpl where a message has been
> sent (over shared memory in this case) and the receiver calls iprobe
> but doesn't see the message on the first call (even though it is
> there), only on the second. Looking at the mca_pml_ob1_iprobe
> function and the calls it makes, it looks like it checks the
> unexpected queue for matches; if it doesn't find one, it sets the
> flag to 0 (no match), then calls opal_progress and returns. This
> seems wrong to me, since I would expect the call to opal_progress
> to pull in the message that the iprobe is waiting for.
>
> Am I correct in my reading of the code? It seems that maybe some
> sort of check needs to be done after the call to opal_progress in
> mca_pml_ob1_iprobe.
>
> Attached is a simple program that shows the issue I am running into:
>
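> #include <mpi.h>
> #include <stdio.h>   /* printf, fflush */
> #include <unistd.h>  /* sleep */
>
> int main(void) {
>     int rank, src[2] = {0, 0}, dst[2], flag = 0;
>     int nxfers;
>     MPI_Status status;
>
>     MPI_Init(NULL, NULL);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>
>     if (0 == rank) {
>         /* Sender: post five small messages to rank 1. */
>         for (nxfers = 0; nxfers < 5; nxfers++)
>             MPI_Send(src, 2, MPI_INT, 1, 0, MPI_COMM_WORLD);
>     } else if (1 == rank) {
>         for (nxfers = 0; nxfers < 5; nxfers++) {
>             /* Sleep so the message has already been sent before we probe. */
>             sleep(5);
>             flag = 0;
>             /* One "iprobe..." is printed per MPI_Iprobe call. */
>             while (!flag) {
>                 printf("iprobe...");
>                 fflush(stdout);
>                 MPI_Iprobe(0, 0, MPI_COMM_WORLD, &flag, &status);
>             }
>             printf("\n");
>             MPI_Recv(dst, 2, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
>         }
>     }
>     MPI_Finalize();
>     return 0;
> }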
>
> --td

-- 
Jeff Squyres
Cisco Systems