Open MPI Development Mailing List Archives

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Click here to be taken to the new web archives of this list; it includes all the mails that are in this frozen archive plus all new mails that have been sent to the list since it was migrated to the new archives.

From: Terry D. Dontje (Terry.Dontje_at_[hidden])
Date: 2007-08-29 09:58:46


Trunk.

--td
Gleb Natapov wrote:

>Is this trunk or 1.2?
>
>On Wed, Aug 29, 2007 at 09:40:30AM -0400, Terry D. Dontje wrote:
>
>
>>I have a program that does a simple bucket brigade of sends and receives
>>where rank 0 is the start and repeatedly sends to rank 1 until a certain
>>amount of time has passed, and then it sends an "all done" packet.
>>
>>Running this under np=2 always works. However, when I run with np greater
>>than 2 using only the SM BTL, the program usually hangs, and one of the
>>processes has a long stack that repeats the following 3 calls many times:
>>
>> [25] opal_progress(), line 187 in "opal_progress.c"
>> [26] mca_btl_sm_component_progress(), line 397 in "btl_sm_component.c"
>> [27] mca_bml_r2_progress(), line 110 in "bml_r2.c"
>>
>>When stepping through the ompi_fifo_write_to_head routine it looks like
>>the fifo has overflowed.
>>
>>I am wondering if what is happening is that rank 0 has sent a bunch of
>>messages that have exhausted the resources, such that one of the middle
>>ranks, which is in the process of sending, cannot send and therefore
>>never gets to the point of trying to receive the messages from rank 0?
>>
>>Is the above a possible scenario or are messages periodically bled off
>>the SM BTL's fifos?
>>
>>Note, I have seen np=3 pass sometimes and I can get it to pass reliably
>>if I raise the shared memory space used by the BTL. This is using the
>>trunk.
>>
>>
>>--td
>>
>>
>>_______________________________________________
>>devel mailing list
>>devel_at_[hidden]
>>http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>
>--
> Gleb.
>
>
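The scenario asked about in the original message (rank 0 exhausting shared resources so that a middle rank blocks in its send and never reaches its receive) can be illustrated with a toy model. The sketch below is not Open MPI code and makes no claim about how the SM BTL actually manages its fifos or fragments. The single shared fragment pool, the fixed round-robin schedule, and the names sim_t, simulate, and drain_while_blocked are all assumptions made for illustration.

```c
#include <assert.h>

/* Toy model (hypothetical, not Open MPI internals): three ranks in a
 * chain 0 -> 1 -> 2 draw message fragments from one shared pool. */
typedef struct {
    int pool_free;  /* fragments currently free in the shared pool      */
    int q1;         /* fragments queued for rank 1 (sent by rank 0)     */
    int q2;         /* fragments queued for rank 2 (sent by rank 1)     */
    int to_send;    /* messages rank 0 still has to send                */
    int pending1;   /* messages rank 1 has received but not forwarded   */
    int delivered;  /* messages rank 2 has received                     */
} sim_t;

/* Runs the model until no rank can make progress and returns how many
 * messages reached rank 2. If drain_while_blocked is nonzero, rank 1
 * keeps pulling incoming fragments into private memory while its own
 * send is waiting for a fragment (an assumed "progress inside send"). */
static int simulate(int pool, int messages, int drain_while_blocked)
{
    assert(pool > 0 && messages >= 0);
    sim_t s = { pool, 0, 0, messages, 0, 0 };

    for (;;) {
        int progress = 0;

        /* Rank 0: eager sender, grabs every free fragment it can. */
        while (s.to_send > 0 && s.pool_free > 0) {
            s.pool_free--; s.q1++; s.to_send--;
            progress = 1;
        }

        /* Rank 1: alternates receive and forward. */
        if (s.pending1 > 0) {
            /* In its send. Optionally drain one incoming fragment,
             * which returns that fragment to the pool. */
            if (s.pool_free == 0 && drain_while_blocked && s.q1 > 0) {
                s.q1--; s.pool_free++; s.pending1++;
            }
            if (s.pool_free > 0) {
                s.pool_free--; s.q2++; s.pending1--;
                progress = 1;
            }
            /* Otherwise rank 1 stays blocked in the send. */
        } else if (s.q1 > 0) {
            /* Receive one message; its fragment goes back to the pool. */
            s.q1--; s.pool_free++; s.pending1++;
            progress = 1;
        }

        /* Rank 2: plain receiver. */
        if (s.q2 > 0) {
            s.q2--; s.pool_free++; s.delivered++;
            progress = 1;
        }

        if (!progress)
            break;  /* finished, or every rank is stuck */
    }
    return s.delivered;
}
```

Under these assumptions, simulate(4, 10, 0) stalls with nothing delivered: rank 0 refills the pool's only free fragment before rank 1 can forward, and rank 1 never goes back to receive. simulate(16, 10, 0) delivers all 10 messages because the pool is never exhausted, which loosely mirrors the observation that raising the shared memory space makes the test pass. simulate(4, 10, 1) also delivers all 10, since draining while blocked frees a fragment for the pending send.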