
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] memcpy MCA framework
From: Terry Dontje (Terry.Dontje_at_[hidden])
Date: 2008-08-16 11:51:02


George Bosilca wrote:
> The intent of the memcpy framework is to allow a selection between
> several memcpy implementations at runtime. Of course, there will be a
> preselection at compile time, but all versions that compile on a given
> architecture will be benchmarked at runtime and the best one will be
> selected. There is a file with several versions of memcpy for x86 (32
> and 64 bit) floating around (I should have a copy if anyone is
> interested) that can be used as a starting point.
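For what it is worth, a minimal sketch of that runtime selection in plain C, assuming the compile-time checks have filled in a table of candidates (the table, the timing helper, and the 1 MB / 100-iteration benchmark are illustrative, not the actual framework code):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>   /* clock_gettime; may need -lrt on older glibc */

    /* One entry per memcpy variant that survived the compile-time checks. */
    typedef void *(*memcpy_fn_t)(void *dst, const void *src, size_t len);

    struct candidate {
        const char *name;
        memcpy_fn_t fn;
    };

    /* Placeholder: a real component would add SSE/MMX/assembly variants here. */
    static struct candidate candidates[] = {
        { "libc", memcpy },
    };

    /* Time one candidate over a fixed number of copies; return seconds. */
    static double bench(memcpy_fn_t fn, void *dst, void *src,
                        size_t len, int iters)
    {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < iters; i++) {
            fn(dst, src, len);
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        return (t1.tv_sec - t0.tv_sec) + 1e-9 * (t1.tv_nsec - t0.tv_nsec);
    }

    int main(void)
    {
        const size_t len = 1 << 20;              /* 1 MB test buffer */
        void *src = malloc(len), *dst = malloc(len);
        memset(src, 0x5a, len);

        size_t n = sizeof(candidates) / sizeof(candidates[0]);
        size_t best = 0;
        double best_t = 1e30;
        for (size_t i = 0; i < n; i++) {
            double t = bench(candidates[i].fn, dst, src, len, 100);
            if (t < best_t) { best_t = t; best = i; }
        }
        printf("selected %s (%.3f s for 100 copies)\n",
               candidates[best].name, best_t);
        free(src); free(dst);
        return 0;
    }

The real framework would presumably run something like this once during startup and cache the winning function pointer for the rest of the job.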
>
Ok, I guess I need to look at this code. I wonder if there may be cases
on Sun's machines where this benchmark could end up picking the wrong
memcpy?
> The only thing we need is a volunteer to build the m4 magic. Figuring
> out what we can compile is kind of tricky, as some of the functions
> are in assembly, some others in C, and some others a mixture (the MMX
> headers).
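For example, the sort of probe such a check could hand to AC_COMPILE_IFELSE for an SSE2 variant (the macro it would define and the GCC-style alignment attribute are assumptions here):

    /* If this compiles, the configure test can define something like
     * OPAL_HAVE_SSE2 (name illustrative) and build the SSE2 memcpy variant;
     * if it fails, that variant is simply left out of the selection table. */
    #include <emmintrin.h>   /* SSE2 intrinsics */

    int main(void)
    {
        char buf[16] __attribute__((aligned(16)));   /* GCC-style alignment */
        __m128i v = _mm_setzero_si128();
        _mm_store_si128((__m128i *)buf, v);          /* 16-byte aligned store */
        return (int)buf[0];
    }

Variants written purely in assembly would presumably need a link test rather than a compile test, which is part of what makes the m4 tricky.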
>
Isn't the atomic code very similar? If I get to this point before
anyone else I probably will volunteer.

--td
> george.
>
> On Aug 16, 2008, at 3:19 PM, Terry Dontje wrote:
>
>> Hi Tim,
>> Thanks for bringing the below up and asking for a redirection to the
>> devel list. I think looking/using the MCA memcpy framework would be
>> a good thing to do and maybe we can work on this together once I get
>> out from under some commitments. However, some of the challenges
>> that originally scared me away from looking at the memcpy MCA is
>> whether we really want all the OMPI memcpy's to be replaced or just
>> specific ones. Also, I was concerned on trying to figure out which
>> version of memcpy I should be using. I believe currently things are
>> done such that you get one version based on which system you compile
>> on. For Sun there may be several different SPARC platforms that
>> would need to use different memcpy code but we would like to just
>> ship one set of bits.
>> Not saying the above not doable under the memcpy MCA framework just
>> that it somewhat scared me away from thinking about it at first glance.
>>
>> --td
>>> Date: Fri, 15 Aug 2008 12:08:18 -0400
>>> From: "Tim Mattox" <timattox_at_[hidden]>
>>> Subject: Re: [OMPI users] SM btl slows down bandwidth?
>>> To: "Open MPI Users" <users_at_[hidden]>
>>> Message-ID: <ea86ce220808150908t62818a21k32c49b9b6f07dca_at_[hidden]>
>>> Content-Type: text/plain; charset=ISO-8859-1
>>>
>>> Hi Terry (and others),
>>> I have previously explored this some on Linux/X86-64 and concluded
>>> that Open MPI needs to supply its own memcpy routine to get good sm
>>> performance, since the memcpy supplied by glibc is not even close to
>>> optimal. We have an unused MCA framework already set up to supply an
>>> opal_memcpy. AFAIK, George and Brian did the original work to set up
>>> that framework. It has been on my to-do list for a while to start
>>> implementing opal_memcpy components for the architectures I have
>>> access to, and to modify OMPI to actually use opal_memcpy where it
>>> makes sense. Terry, I presume what you suggest could be dealt with
>>> similarly when we are running/building on SPARC. Any followup
>>> discussion on this should probably happen on the developer mailing
>>> list.
>>>
>>> On Thu, Aug 14, 2008 at 12:19 PM, Terry Dontje
>>> <Terry.Dontje_at_[hidden]> wrote:
>>>> Interestingly enough, on the SPARC platform the Solaris memcpys
>>>> actually use non-temporal stores for copies >= 64KB. By default some
>>>> of the MCA parameters to the sm BTL stop at 32KB. I've experimented
>>>> with bumping the sm segment sizes to above 64K and seen incredible
>>>> speedup on our M9000 platforms. I am looking for some nice way to
>>>> integrate a memcpy that lowers this boundary to 32KB or lower into
>>>> Open MPI.
>>>> I have not looked into whether the Solaris x86/x64 memcpys use
>>>> non-temporal stores or not.
>>>>
>>>> --td
>>>>
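The kind of copy routine being described might look like the following on x86 with SSE2 (a sketch only; the 32KB cutoff, the function name, and the alignment handling are illustrative, and a SPARC version would rely on its own block-store instructions instead):

    #include <emmintrin.h>   /* SSE2: _mm_stream_si128 (non-temporal store) */
    #include <stdint.h>
    #include <string.h>

    #define NT_THRESHOLD (32 * 1024)   /* switch to non-temporal stores here */

    /* Copy that bypasses the cache for large transfers so the receiver's
     * working set is not evicted by data it may never touch. Falls back to
     * the normal memcpy for small copies and unaligned buffers. */
    static void *memcpy_nt(void *dst, const void *src, size_t len)
    {
        if (len < NT_THRESHOLD ||
            ((uintptr_t)dst & 15) || ((uintptr_t)src & 15)) {
            return memcpy(dst, src, len);
        }

        char *d = dst;
        const char *s = src;
        size_t chunks = len / 16;

        for (size_t i = 0; i < chunks; i++) {
            __m128i v = _mm_load_si128((const __m128i *)(s + 16 * i));
            _mm_stream_si128((__m128i *)(d + 16 * i), v);  /* non-temporal */
        }
        _mm_sfence();                               /* order the NT stores */

        size_t tail = len & 15;
        if (tail) {
            memcpy(d + 16 * chunks, s + 16 * chunks, tail);
        }
        return dst;
    }

    int main(void)
    {
        static char src[64 * 1024], dst[64 * 1024];
        memcpy_nt(dst, src, sizeof(src));  /* large copy (NT path if aligned) */
        memcpy_nt(dst, src, 1024);         /* small copy, libc fallback */
        return dst[0];
    }

Bumping the sm fragment sizes above 64KB presumably just lets the stock Solaris memcpy reach its own non-temporal threshold; a routine like this sketch would instead bring that behavior down to 32KB transfers without touching the BTL defaults.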
>>>>>
>>>>> Message: 1
>>>>> Date: Thu, 14 Aug 2008 09:28:59 -0400
>>>>> From: Jeff Squyres <jsquyres_at_[hidden]>
>>>>> Subject: Re: [OMPI users] SM btl slows down bandwidth?
>>>>> To: rbbrigh_at_[hidden], Open MPI Users <users_at_[hidden]>
>>>>> Message-ID: <562557EB-857C-4CA8-97AD-F294C7FEDC77_at_[hidden]>
>>>>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>>>>
>>>>> At this time, we are not using non-temporal stores for shared memory
>>>>> operations.
>>>>>
>>>>> On Aug 13, 2008, at 11:46 AM, Ron Brightwell wrote:
>>>>>
>>>>>>> [...]
>>>>>>>
>>>>>>> MPICH2 manages to get about 5GB/s in shared memory performance on
>>>>>>> the Xeon 5420 system.
>>>>>>
>>>>>>
>>>>>> Does the sm btl use a memcpy with non-temporal stores like MPICH2?
>>>>>> This can be a big win for bandwidth benchmarks that don't actually
>>>>>> touch their receive buffers at all...
>>>>>>
>>>>>> -Ron
>>>>>
>>>>> --
>>>>> Jeff Squyres
>>>>> Cisco Systems
>>>
>>> --
>>> Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
>>> tmattox_at_[hidden] || timattox_at_[hidden]
>>> I'm a bright... http://www.the-brights.net/
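Coming back to Tim's point about modifying OMPI to actually use opal_memcpy where it makes sense, the call-site change could be as small as routing copies through whatever function the framework selected. A rough sketch, assuming a published function pointer (opal_memcpy_sel, OPAL_MEMCPY, and pack_fragment are illustrative names, not the actual OPAL interface):

    #include <stddef.h>
    #include <string.h>

    /* Hypothetical: the memcpy framework publishes the winner of its runtime
     * benchmark as a function pointer, defaulting to the libc memcpy. */
    typedef void *(*opal_memcpy_fn_t)(void *dst, const void *src, size_t len);
    static opal_memcpy_fn_t opal_memcpy_sel = memcpy;

    /* Call sites use a wrapper instead of calling memcpy directly, so the
     * selected implementation is picked up in the copy-heavy paths (e.g. the
     * sm BTL) without touching every caller when a new component is added. */
    #define OPAL_MEMCPY(dst, src, len) opal_memcpy_sel((dst), (src), (len))

    static void pack_fragment(void *frag, const void *user_buf, size_t len)
    {
        OPAL_MEMCPY(frag, user_buf, len);   /* previously: memcpy(...) */
    }

    int main(void)
    {
        char user_buf[64] = "payload", frag[64];
        pack_fragment(frag, user_buf, sizeof(user_buf));
        return frag[0] == 'p' ? 0 : 1;
    }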