Open MPI User's Mailing List Archives

From: Troy Telford (ttelford_at_[hidden])
Date: 2005-11-14 11:43:04


On Sun, 13 Nov 2005 17:53:40 -0700, Jeff Squyres <jsquyres_at_[hidden]>
wrote:

> I can't believe I missed that, sorry. :-(
>
> None of the btl's are capable of doing loopback communication except
> "self." Hence, you really can't run "--mca btl foo" if your app ever
> sends to itself -- you really need to run "--mca btl foo,self" at a
> minimum.
>
> This is not so much an optimization as it is a software engineering
> decision; we didn't have to include the special send-to-self case in
> any of the other btl components this way (i.e., less code, less complex
> maintenance).
>
>
> On Nov 13, 2005, at 7:12 PM, Brian Barrett wrote:
>
>> One other thing I noticed... You specify -mca btl openib. Try
>> specifying -mca btl openib,self. The self component is used for
>> "send to self" operations. This could be the cause of your failures.
>> Brian
>>
>> On Nov 13, 2005, at 3:02 PM, Jeff Squyres wrote:
>>
>>> Troy --
>>>
>>> Were you perchance using multiple processes per node? If so, we
>>> literally just fixed some sm btl bugs that could have been affecting
>>> you (they could have caused hangs). They're fixed in the nightly
>>> snapshots from today (both trunk and v1.0): r8140. If you were using
>>> the sm btl and multiple processes per node, could you try again?

As a matter of fact, yes; one process per CPU, each node having 2 CPUs.

If I change my machinefile to only use one process per node (leaving one
CPU idle), the problem disappears. However, if I use two CPUs per node
(but the same number of overall processes -- meaning half the number of
nodes), I receive the same error:
***
[0,1,0][btl_openib_endpoint.c:136:mca_btl_openib_endpoint_post_send] error
posting send request errno says Resource temporarily unavailable
[0,1,0][btl_openib_component.c:655:mca_btl_openib_component_progress]
error in posting pending send
***

This happens on both RC5 and RC6, with either '-mca btl openib' or
'-mca btl openib,self'.
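
For anyone scripting these runs, here is a minimal sketch of the rule
quoted above (always list "self" alongside whatever other btl you pick).
The helper name btl_args is hypothetical and not part of Open MPI; it only
builds the argument list that would be passed to mpirun.

```python
def btl_args(btls):
    """Build the '--mca btl ...' arguments, always including 'self'.

    'btl_args' is a hypothetical helper, not an Open MPI API. It only
    assembles strings; it does not launch anything.
    """
    # Drop empty entries and stray whitespace around component names.
    names = [b.strip() for b in btls if b.strip()]
    # Per the quoted advice, only "self" can do loopback (send-to-self),
    # so append it if the caller left it out.
    if "self" not in names:
        names.append("self")
    return ["--mca", "btl", ",".join(names)]


if __name__ == "__main__":
    # Show the command line that would result for an openib-only request.
    print(" ".join(["mpirun"] + btl_args(["openib"])))
```

Passing ["openib"] or ["openib", "self"] yields the same btl list, so a
script built this way cannot accidentally drop the loopback component.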

On a positive note, I've now been able to complete the 'com' Presta
benchmark with GM (which I had previously been unable to do).

For reference: I was using MX version 1.0.3. I have just installed
1.1.0, and I'll be checking that out presently.
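
P.S. In case it helps anyone reproduce the two layouts described above
(same total process count, one versus two processes per node), here is a
rough sketch that generates both machinefiles. The node names are made up,
and I am assuming the "hostname slots=N" hostfile form; the actual
machinefile used in these runs is not shown in this thread.

```python
def machinefile(nodes, procs_per_node):
    """Return hostfile text with one 'host slots=N' line per node.

    Assumes the 'slots=' hostfile keyword; node names are hypothetical.
    """
    return "\n".join(f"{node} slots={procs_per_node}" for node in nodes)


def layouts(all_nodes, total_procs):
    """Return (one_per_node, two_per_node) hostfiles for total_procs ranks.

    total_procs must be even so the two-per-node layout uses exactly
    half as many nodes as the one-per-node layout.
    """
    if total_procs % 2 != 0:
        raise ValueError("total_procs must be even")
    if total_procs > len(all_nodes):
        raise ValueError("not enough nodes for one process per node")
    one_per_node = machinefile(all_nodes[:total_procs], 1)
    two_per_node = machinefile(all_nodes[:total_procs // 2], 2)
    return one_per_node, two_per_node


if __name__ == "__main__":
    nodes = ["node01", "node02", "node03", "node04"]  # hypothetical names
    one, two = layouts(nodes, 4)
    print(one)
    print("---")
    print(two)
```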