Open MPI User's Mailing List Archives

From: Tim S. Woodall (twoodall_at_[hidden])
Date: 2005-11-10 11:10:56


Mike,

I believe this issue has been corrected on the trunk, and should
be in the next release candidate, probably by the end of the week.

Thanks,
Tim

Mike Houston wrote:
> mpirun -mca btl_mvapi_rd_min 128 -mca btl_mvapi_rd_max 256 -np 2
> -hostfile /u/mhouston/mpihosts mpi_bandwidth 21 131072
>
> 131072 519.922184 (MillionBytes/sec) 495.836433(MegaBytes/sec)
>
> mpirun -mca btl_mvapi_rd_min 128 -mca btl_mvapi_rd_max 256 -np 2
> -hostfile /u/mhouston/mpihosts mpi_bandwidth 22 131072
>
> 131072 3.360296 (MillionBytes/sec) 3.204628(MegaBytes/sec)
>
> Moving from 21 messages to 22 causes a HUGE drop in performance. The
> app posts all of the sends non-blocking at once (a minimal sketch of
> that pattern appears just below)... Setting
> -mca mpi_leave_pinned 1 causes:
>
> [0,1,1][btl_mvapi_component.c:631:mca_btl_mvapi_component_progress] Got
> error : VAPI_WR_FLUSH_ERR, Vendor code : 0 Frag : 0xb73412fc
>
> repeated until it eventually hangs.
>
> -Mike
>
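The pattern described above is a bandwidth test that posts a whole window of
non-blocking sends before waiting on any of them. The following is a minimal
sketch of that pattern, not the actual OSU mpi_bandwidth source; the window
size, message length, request-array size, and single reused buffer are
illustrative assumptions.

    /* Sketch of the "post everything at once" pattern: rank 0 issues a
     * window of non-blocking sends and only then waits, while rank 1
     * pre-posts the matching receives. */
    #include <mpi.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        int rank, i;
        int window = 22;        /* 21 was fast, 22 slow in the report above */
        int len = 131072;       /* message size in bytes */
        MPI_Request req[64];    /* large enough for the windows used here */
        char *buf;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        buf = malloc(len);
        memset(buf, 0, len);    /* contents are irrelevant for bandwidth */

        if (rank == 0) {
            for (i = 0; i < window; i++)    /* post every send up front */
                MPI_Isend(buf, len, MPI_BYTE, 1, 0, MPI_COMM_WORLD, &req[i]);
            MPI_Waitall(window, req, MPI_STATUSES_IGNORE);
        } else if (rank == 1) {
            for (i = 0; i < window; i++)    /* pre-post matching receives */
                MPI_Irecv(buf, len, MPI_BYTE, 0, 0, MPI_COMM_WORLD, &req[i]);
            MPI_Waitall(window, req, MPI_STATUSES_IGNORE);
        }

        free(buf);
        MPI_Finalize();
        return 0;
    }

The cliff between 21 and 22 outstanding 128KB sends presumably reflects some
per-connection resource limit in the mvapi BTL; the btl_mvapi_rd_min and
btl_mvapi_rd_max parameters Tim suggests later in the thread control one such
set of per-connection resources.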
> Mike Houston wrote:
>
>
>>Whoops, spoke too soon. The performance quoted was not actually going
>>between nodes. Actually using the network with the pinned option gives:
>>
>>[0,1,0][btl_mvapi_component.c:631:mca_btl_mvapi_component_progress]
>>[0,1,1][btl_mvapi_component.c:631:mca_btl_mvapi_component_progress] Got
>>error : VAPI_WR_FLUSH_ERR, Vendor code : 0 Frag : 0xb74a1c18Got error :
>>VAPI_WR_FLUSH_ERR, Vendor code : 0 Frag : 0xb73e1720
>>
>>repeated many times.
>>
>>-Mike
>>
>>Mike Houston wrote:
>>
>>>That seems to work with the pinning option enabled. THANKS!
>>>
>>>Now I'll go back to testing my real code. I'm getting 700MB/s for
>>>messages >=128KB. This is 10-20% lower than MVAPICH, but still pretty
>>>darn good. My guess is that I can play with the settings more to tweak
>>>performance. Now if I can get the tcp layer working,
>>>I'm pretty much good to go.
>>>
>>>Any word on an SDP layer? I can probably modify the tcp layer quickly
>>>to do SDP, but I thought I would ask.
>>>
>>>-Mike
>>>
>>>Tim S. Woodall wrote:
>>>
>>>>Hello Mike,
>>>>
>>>>Mike Houston wrote:
>>>>
>>>>>When only sending a few messages, we get reasonably good IB performance,
>>>>>~500MB/s (MVAPICH is 850MB/s). However, if I crank the number of
>>>>>messages up, we drop to 3MB/s(!!!). This is with the OSU NBCL
>>>>>mpi_bandwidth test. We are running Mellanox IB Gold 1.8 with 3.3.3
>>>>>firmware on PCI-X (Cougar) boards. Everything works with MVAPICH, but
>>>>>we really need the thread support in OpenMPI.
>>>>>
>>>>>Ideas? I noticed there are a plethora of runtime options configurable
>>>>>for mvapi. Do I need to tweak these to get performance up?
>>>>>
>>>>
>>>>You might try running with:
>>>>
>>>>mpirun -mca mpi_leave_pinned 1
>>>>
>>>>This will cause the mvapi BTL to maintain an MRU cache of memory
>>>>registrations, rather than dynamically pinning/unpinning memory on each
>>>>transfer (see the buffer-reuse sketch after this message).
>>>>
>>>>If this does not resolve the BW problems, try increasing the
>>>>resources allocated to each connection:
>>>>
>>>>-mca btl_mvapi_rd_min 128
>>>>-mca btl_mvapi_rd_max 256
>>>>
>>>>Also can you forward me a copy of the test code or a reference to it?
>>>>
>>>>Thanks,
>>>>Tim
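For context on the mpi_leave_pinned suggestion: a registration
(leave_pinned) cache pays off when the application reuses the same buffers
across many transfers, because the memory stays pinned and the cached
registration can be hit instead of the buffer being registered and
deregistered around every send or receive. Below is a minimal sketch of that
buffer-reuse access pattern; the iteration count, message length, and single
buffer per rank are illustrative assumptions, and the code says nothing about
Open MPI's mvapi BTL internals.

    /* Buffer-reuse pattern that a registration (leave_pinned) cache
     * rewards: one buffer per rank, reused on every iteration, so after
     * the first transfer its registration can be found in the cache
     * rather than the memory being pinned/unpinned each time. */
    #include <mpi.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        int rank, i;
        int iters = 100;        /* number of back-to-back transfers */
        int len = 131072;       /* message size in bytes */
        char *buf;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        buf = malloc(len);      /* same buffer used for every iteration */
        memset(buf, 0, len);

        for (i = 0; i < iters; i++) {
            if (rank == 0)
                MPI_Send(buf, len, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            else if (rank == 1)
                MPI_Recv(buf, len, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
        }

        free(buf);
        MPI_Finalize();
        return 0;
    }

Combining the suggestions in this message on one command line would look
something like
mpirun -mca mpi_leave_pinned 1 -mca btl_mvapi_rd_min 128 -mca btl_mvapi_rd_max 256 ...
which matches the options Mike experiments with earlier in the thread.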
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>