
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang
From: Eugene Loh (eugene.loh_at_[hidden])
Date: 2010-11-23 16:07:56


Sébastien Boisvert wrote:

>Now I can describe the cases.
>
>
The test cases can all be explained by the test requiring eager messages
(something that test4096.cpp does not require).

>Case 1: 30 MPI ranks, message size is 4096 bytes
>
>File: mpirun-np-30-Program-4096.txt
>Outcome: It hangs -- I killed the poor thing after 30 seconds or so.
>
>
With shared memory, a 4096-byte message is sent by rendezvous. To get
eager messages, try 4000 bytes or fewer.

>Case 2: 30 MPI ranks, message size is 1 byte
>
>File: mpirun-np-30-Program-1.txt.gz
>Outcome: It runs just fine.
>
>
1 byte is eager.

>Case 3: 2 MPI ranks, message size is 4096 bytes
>
>File: mpirun-np-2-Program-4096.txt
>Outcome: It hangs -- I killed the poor thing after 30 seconds or so.
>
>
Same as Case 1.

>Case 4: 30 MPI ranks, message size is 4096 bytes, shared memory is
>disabled
>
>File: mpirun-mca-btl-^sm-np-30-Program-4096.txt.gz
>Outcome: It runs just fine.
>
>
The eager limit for TCP is 65536 bytes (perhaps less some overhead), so
these 4096-byte messages are eager.
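
The four cases can be summarized with a small model of the eager/rendezvous
decision. This is only a sketch, not Open MPI code: the function name, the
64-byte header overhead, and the 4096-byte shared-memory limit are assumptions
chosen to match the behavior described above (4096 bytes goes by rendezvous,
4000 bytes or fewer is eager). The real limits are run-time parameters and the
real overhead may differ.

```cpp
#include <cstddef>

// Hypothetical model of how a transport picks a protocol from message size.
enum class Protocol { Eager, Rendezvous };

// Assumed values, for illustration only.
constexpr std::size_t kAssumedSmEagerLimit  = 4096;   // shared memory (assumption)
constexpr std::size_t kTcpEagerLimit        = 65536;  // TCP, as stated above
constexpr std::size_t kAssumedHeaderBytes   = 64;     // per-message overhead (assumption)

// A message is eager only if payload plus header fits within the eager limit.
Protocol classify(std::size_t payload_bytes,
                  std::size_t eager_limit,
                  std::size_t header_bytes = kAssumedHeaderBytes) {
    return (payload_bytes + header_bytes <= eager_limit) ? Protocol::Eager
                                                         : Protocol::Rendezvous;
}
```

Under these assumptions, Cases 1 and 3 (4096 bytes over shared memory) classify
as rendezvous, Case 2 (1 byte) as eager, and Case 4 (4096 bytes over TCP) as
eager, which matches which runs hang and which complete.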