Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] iprobe and opal_progress
From: Terry Dontje (Terry.Dontje_at_[hidden])
Date: 2008-06-18 09:52:38


Jeff Squyres wrote:
> Perhaps we did that as a latency optimization...?
>
> George / Brian / Galen -- do you guys know/remember why this was done?
>
> On the surface, it looks like it would be ok to call progress and
> check again to see if it found the match. Can anyone think of a
> deeper reason not to?
>
If it is ok to check again, my next question is how? After looking at
the code some more, I found that iprobe requests are not actually queued.
So can I just do another MCA_PML_OB1_RECV_REQUEST_START on the init'd
IPROBE_REQUEST after the call to opal_progress to force a search of the
unexpected queue, or do I need to FINI the request and regenerate it?
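
To make the question concrete, here is a minimal toy model of the two
behaviours being discussed. It is only a sketch, not ob1 code: every
toy_* name is hypothetical, toy_progress() merely stands in for
opal_progress(), and the "unexpected queue" is a plain array of tags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in Open MPI. */
#define TOY_QUEUE_CAP 8

typedef struct {
    int in_flight[TOY_QUEUE_CAP];   /* delivered by the transport, not yet progressed */
    size_t n_in_flight;
    int unexpected[TOY_QUEUE_CAP];  /* the queue that a probe searches */
    size_t n_unexpected;
} toy_pml_t;

static void toy_init(toy_pml_t *p)
{
    p->n_in_flight = 0;
    p->n_unexpected = 0;
}

/* Sender side: a message with this tag is now sitting in the transport. */
static void toy_deliver(toy_pml_t *p, int tag)
{
    assert(p->n_in_flight < TOY_QUEUE_CAP);
    p->in_flight[p->n_in_flight++] = tag;
}

/* Stand-in for opal_progress(): move transport messages to the unexpected queue. */
static void toy_progress(toy_pml_t *p)
{
    for (size_t i = 0; i < p->n_in_flight; i++) {
        assert(p->n_unexpected < TOY_QUEUE_CAP);
        p->unexpected[p->n_unexpected++] = p->in_flight[i];
    }
    p->n_in_flight = 0;
}

/* Search the unexpected queue for a matching tag. */
static bool toy_match(const toy_pml_t *p, int tag)
{
    for (size_t i = 0; i < p->n_unexpected; i++) {
        if (p->unexpected[i] == tag) {
            return true;
        }
    }
    return false;
}

/* Behaviour as described in this thread: search, progress, return the old result. */
static bool toy_iprobe_no_recheck(toy_pml_t *p, int tag)
{
    bool flag = toy_match(p, tag);
    if (!flag) {
        toy_progress(p);
    }
    return flag;
}

/* Proposed behaviour: search again after progress has run. */
static bool toy_iprobe_recheck(toy_pml_t *p, int tag)
{
    bool flag = toy_match(p, tag);
    if (!flag) {
        toy_progress(p);
        flag = toy_match(p, tag);
    }
    return flag;
}
```

In this model, after toy_deliver(&p, 0) the no-recheck probe reports no
match on its first call and a match on its second, which is the symptom
seen with hpl. The recheck variant reports the match on its first call.
Whether the real request can simply be restarted for the second search is
exactly the open question above.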

--td
>
> On Jun 17, 2008, at 11:43 AM, Terry Dontje wrote:
>
>> I've run into an issue while running hpl where a message has been
>> sent (in shared memory in this case) and the receiver calls iprobe
>> but doesn't see the message on the first call to iprobe (even though
>> it is there) but does see it on the second call. Looking at the
>> mca_pml_ob1_iprobe function and the calls it makes, it looks like it
>> checks the unexpected queue for matches and, if it doesn't find one,
>> sets the flag to 0 (no matches), then calls opal_progress and
>> returns. This seems wrong to me, since I would expect that the call to
>> opal_progress would probably pull in the message that the iprobe is
>> waiting for.
>>
>> Am I correct in my reading of the code? It seems that maybe some
>> sort of check needs to be done after the call to opal_progress in
>> mca_pml_ob1_iprobe.
>>
>> Attached is a simple program that shows the issue I am running into:
>>
>> #include <mpi.h>
>> #include <stdio.h>   /* printf */
>> #include <unistd.h>  /* sleep */
>>
>> int main(void) {
>>     int rank, src[2] = {0, 0}, dst[2], flag = 0;
>>     int nxfers;
>>     MPI_Status status;
>>
>>     MPI_Init(NULL, NULL);
>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>
>>     if (0 == rank) {
>>         /* Sender: post five small messages to rank 1. */
>>         for (nxfers = 0; nxfers < 5; nxfers++)
>>             MPI_Send(src, 2, MPI_INT, 1, 0, MPI_COMM_WORLD);
>>     } else if (1 == rank) {
>>         for (nxfers = 0; nxfers < 5; nxfers++) {
>>             /* Give the message time to arrive before probing. */
>>             sleep(5);
>>             flag = 0;
>>             /* One "iprobe..." is printed per MPI_Iprobe call. */
>>             while (!flag) {
>>                 printf("iprobe...");
>>                 MPI_Iprobe(0, 0, MPI_COMM_WORLD, &flag, &status);
>>             }
>>             printf("\n");
>>             MPI_Recv(dst, 2, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
>>         }
>>     }
>>     MPI_Finalize();
>>     return 0;
>> }
>>
>> --td
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>