Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang
From: George Bosilca (bosilca_at_[hidden])
Date: 2010-11-23 17:19:19


No message is eager if there is congestion. A 64 KB message is eager over TCP only if the kernel buffer has enough room to hold all 64 KB. Over SM, a message is eager only if there are buffers ready to take it. Eager delivery is an internal optimization of the MPI library; users should not need to be aware of it, and applications should not depend on this behavior.

The MPI 2.2 standard has a specific paragraph that advises users not to do this.
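
The reasoning above can be sketched with a small toy model. This is only an illustration, not how Open MPI decides anything internally: the function names, the `free_buffer` parameter, and the numeric limits are assumptions chosen to mirror the cases discussed in this thread. The model treats a blocking send as eager only when the message fits under the eager limit and the transport currently has room to buffer it, and it treats a two-rank exchange where both ranks send first as completing only when those sends are eager.

```python
def send_is_eager(nbytes, eager_limit, free_buffer):
    """Toy rule (an assumption, not Open MPI's real logic): a send is eager
    only if it is within the eager limit AND buffer space is free right now."""
    return nbytes <= eager_limit and nbytes <= free_buffer


def exchange_completes(nbytes, eager_limit, free_buffer, send_first_on_both=True):
    """Two ranks swap one message each using blocking send and receive.

    If both ranks send before receiving, each send has to return without a
    matching receive being posted, which in this model happens only when the
    send is eager. If one rank receives first, every send finds a matching
    receive and the exchange completes regardless of message size.
    """
    if not send_first_on_both:
        return True
    return send_is_eager(nbytes, eager_limit, free_buffer)


if __name__ == "__main__":
    # Illustrative values only; real limits depend on the transport and its settings.
    cases = [
        ("small limit, 4096 bytes", 4096, 4000, 4000),
        ("small limit, 1 byte", 1, 4000, 4000),
        ("large limit, 4096 bytes", 4096, 65536, 65536),
        ("large limit, congested", 4096, 65536, 1024),
    ]
    for label, nbytes, limit, free in cases:
        ok = exchange_completes(nbytes, limit, free)
        print(label, "completes" if ok else "hangs")
```

The last case is the important one: the message is well under the limit, yet the exchange still hangs in the model because no buffer space is free. Calling `exchange_completes(..., send_first_on_both=False)` returns `True` for every size, which corresponds to ordering the sends and receives so that the program does not rely on buffering at all.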

  george.

On Nov 23, 2010, at 16:07 , Eugene Loh wrote:

> Sébastien Boisvert wrote:
>
>> Now I can describe the cases.
>>
> The test cases can all be explained by the test requiring eager messages (something that test4096.cpp does not require).
>
>> Case 1: 30 MPI ranks, message size is 4096 bytes
>>
>> File: mpirun-np-30-Program-4096.txt
>> Outcome: It hangs -- I killed the poor thing after 30 seconds or so.
>>
> 4096 is rendezvous. For eager, try 4000 or lower.
>
>> Case 2: 30 MPI ranks, message size is 1 byte
>>
>> File: mpirun-np-30-Program-1.txt.gz
>> Outcome: It runs just fine.
>>
> 1 byte is eager.
>
>> Case 3: 2 MPI ranks, message size is 4096 bytes
>>
>> File: mpirun-np-2-Program-4096.txt
>> Outcome: It hangs -- I killed the poor thing after 30 seconds or so.
>>
> Same as Case 1.
>
>> Case 4: 30 MPI ranks, message size is 4096 bytes, shared memory is
>> disabled
>>
>> File: mpirun-mca-btl-^sm-np-30-Program-4096.txt.gz
>> Outcome: It runs just fine.
>>
> Eager limit for TCP is 65536 (perhaps less some overhead). So, these messages are eager.