On Sun, 13 Nov 2005 17:53:40 -0700, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> I can't believe I missed that, sorry. :-(
> None of the btl's are capable of doing loopback communication except
> "self." Hence, you really can't run "--mca btl foo" if your app ever
> sends to itself -- you really need to run "--mca btl foo,self" at a
> This is not so much an optimization as it is a software engineering
> decision; we didn't have to include the special send-to-self case in
> any of the other btl components this way (i.e., less code, less complex
> On Nov 13, 2005, at 7:12 PM, Brian Barrett wrote:
>> One other thing I noticed... You specify -mca btl openib. Try
>> specifying -mca btl openib,self. The self component is used for
>> "send to self" operations. This could be the cause of your failures.
>> On Nov 13, 2005, at 3:02 PM, Jeff Squyres wrote:
>>> Troy --
>>> Were you perchance using multiple processes per node? If so, we
>>> literally just fixed some sm btl bugs that could have been affecting
>>> you (they could have caused hangs). They're fixed in the nightly
>>> snapshots from today (both trunk and v1.0): r8140. If you were using
>>> the sm btl and multiple processes per node, could you try again?
As a matter of fact, yes; one process per CPU, each node having 2 CPUs.
If I change my machinefile to only use one process per node (leaving one
CPU idle), the problem disappears. However, if I use two CPUs per node
(but the same number of overall processes -- meaning half the number of
nodes), I receive the same error:
posting send request errno says Resource temporarily unavailable
error in posting pending send
This happens on both RC5 and RC6, with '-mca btl openib' or '-mca btl
On a positive note, I've now been able to complete the 'com' Presta
benchmark with GM (which I had previously been unable to do).
And informationally: I was using MX version 1.0.3. I have just installed
1.1.0, and I'll be checking that out presently.
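
For reference, here is a minimal sketch of the "always include self" advice from the quoted replies above. The helper name btl_with_self, the process count, and the ./a.out binary are made up for illustration. Only the "--mca btl <list>" option and the "self" component name come from the thread. The command line is printed rather than executed, so the sketch can be tried on a machine without Open MPI installed.

```shell
# Hypothetical helper (not part of Open MPI): append "self" to a
# comma-separated BTL list unless it is already present.
btl_with_self() {
  case ",$1," in
    *,self,*) printf '%s\n' "$1" ;;       # "self" already listed
    *)        printf '%s,self\n' "$1" ;;  # add the loopback component
  esac
}

# Build the command line only; "-np 4" and "./a.out" are placeholders.
echo "mpirun --mca btl $(btl_with_self openib) -np 4 ./a.out"
```

A list that already contains "self" passes through unchanged, so the helper should be safe to apply to whatever BTL list a job script already uses.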