Open MPI User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2005-10-31 16:52:14


FWIW, you can grab from Subversion (see http://www.open-mpi.org/svn/)
or grab a nightly snapshot tarball (Tim's changes went into the trunk
-- they have not yet been ported over to the 1.0 release branch; he
wants to verify before porting: http://www.open-mpi.org/nightly/trunk/).
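
For example, here is a minimal sketch of building from a nightly trunk
tarball (the snapshot filename and install prefix below are placeholders
-- check the nightly page for the current file):

    # substitute the real snapshot filename from the nightly page
    wget http://www.open-mpi.org/nightly/trunk/openmpi-<snapshot>.tar.gz
    tar xzf openmpi-<snapshot>.tar.gz
    cd openmpi-<snapshot>
    ./configure --prefix=$HOME/ompi-trunk
    make all install

Building from a Subversion checkout works too, but you generally need
to run ./autogen.sh before configure.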

On Oct 31, 2005, at 2:07 PM, Tim S. Woodall wrote:

> Mike,
>
> Let me confirm this was the issue and look at the TCP problem as well.
> Will let you know.
>
> Thanks,
> Tim
>
>
> Mike Houston wrote:
>> What's the ETA, or should I try grabbing from cvs?
>>
>> -Mike
>>
>> Tim S. Woodall wrote:
>>
>>
>>> Mike,
>>>
>>> I believe this was probably corrected today and should be in the
>>> next release candidate.
>>>
>>> Thanks,
>>> Tim
>>>
>>> Mike Houston wrote:
>>>>
>>>> Whoops, spoke too soon. The performance quoted was not actually
>>>> going between nodes. Actually using the network with the pinned
>>>> option gives:
>>>>
>>>> [0,1,0][btl_mvapi_component.c:631:mca_btl_mvapi_component_progress] Got error : VAPI_WR_FLUSH_ERR, Vendor code : 0 Frag : 0xb74a1c18
>>>> [0,1,1][btl_mvapi_component.c:631:mca_btl_mvapi_component_progress] Got error : VAPI_WR_FLUSH_ERR, Vendor code : 0 Frag : 0xb73e1720
>>>>
>>>> repeated many times.
>>>>
>>>> -Mike
>>>>
>>>> Mike Houston wrote:
>>>>>
>>>>> That seems to work with the pinning option enabled. THANKS!
>>>>>
>>>>> Now I'll go back to testing my real code. I'm getting 700MB/s for
>>>>> messages >=128KB. This is a little bit lower than MVAPICH (10-20%),
>>>>> but still pretty darn good. My guess is that I can play with the
>>>>> settings more to tweak up performance. Now if I can get the TCP
>>>>> layer working, I'm pretty much good to go.
>>>>>
>>>>> Any word on an SDP layer? I can probably modify the TCP layer
>>>>> quickly to do SDP, but I thought I would ask.
>>>>>
>>>>> -Mike
>>>>>
>>>>> Tim S. Woodall wrote:
>>>>>>
>>>>>> Hello Mike,
>>>>>>
>>>>>> Mike Houston wrote:
>>>>>>>
>>>>>>> When only sending a few messages, we get reasonably good IB
>>>>>>> performance, ~500MB/s (MVAPICH is 850MB/s). However, if I crank
>>>>>>> the number of messages up, we drop to 3MB/s(!!!). This is with
>>>>>>> the OSU NBCL mpi_bandwidth test. We are running Mellanox IB Gold
>>>>>>> 1.8 with 3.3.3 firmware on PCI-X (Cougar) boards. Everything
>>>>>>> works with MVAPICH, but we really need the thread support in
>>>>>>> Open MPI.
>>>>>>>
>>>>>>> Ideas? I noticed there are a plethora of runtime options
>>>>>>> configurable for mvapi. Do I need to tweak these to get
>>>>>>> performance up?
>>>>>>
>>>>>> You might try running with:
>>>>>>
>>>>>> mpirun -mca mpi_leave_pinned 1
>>>>>>
>>>>>> which will cause the mvapi BTL to maintain an MRU cache of
>>>>>> registrations, rather than dynamically pinning/unpinning memory.
>>>>>>
>>>>>> If this does not resolve the BW problems, try increasing the
>>>>>> resources allocated to each connection:
>>>>>>
>>>>>> -mca btl_mvapi_rd_min 128
>>>>>> -mca btl_mvapi_rd_max 256
>>>>>>
>>>>>> Also, can you forward me a copy of the test code or a reference
>>>>>> to it?
>>>>>>
>>>>>> Thanks,
>>>>>> Tim
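
Putting Tim's suggestions together, a rough sketch of the resulting
command line for re-running the bandwidth test (the hostfile and test
binary names are placeholders; adjust them to your setup):

    # "myhosts" and ./mpi_bandwidth stand in for your actual hostfile
    # and the OSU bandwidth binary
    mpirun -np 2 --hostfile myhosts \
        -mca mpi_leave_pinned 1 \
        -mca btl_mvapi_rd_min 128 \
        -mca btl_mvapi_rd_max 256 \
        ./mpi_bandwidth

Leaving memory pinned avoids repeatedly registering and unregistering
the same buffers, which is usually what dominates a streaming bandwidth
test that sends many messages back to back.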

-- 
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/