Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Indirect calls to wait* and test*
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-12-03 13:16:19


Right ok, I remember now.

Thanks!

On Dec 3, 2007, at 12:38 PM, Aurelien Bouteiller wrote:

> You asked the exact same question in Paris, so I bet you don't
> remember the discussions :)
>
> A callback on request completion alone is not enough (there is
> already one, in fact: req_free is called any time a request completes
> and can be used for that purpose). We need to know whether the request
> was completed in a wait, waitany or waitsome, as we have something
> different to do depending on the context. Beyond that, we also need
> to know how many and which requests finished in a particular
> waitsome/any/test, and be able to replay it. I looked for several
> workarounds to avoid the proposed patch, and none I could think of
> would do the job (the thread condition between bottom and top layer
> prevents holding requests from the bottom layer deterministically).
> However, now that I have the implementation and can see both the
> performance and the assembly diff, I am pleased by how little
> difference it makes. I even feel confident that this approach is
> less harmful to performance than an extra callback.
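
The requirement described above (knowing how many and which requests a given waitsome returned, and being able to replay that outcome) can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the patch: every name here (demo_req_t, waitsome_fn_t, logging_waitsome, replay_waitsome, LOG_MAX_INDICES) is hypothetical, and the stand-in default does not block the way a real waitsome would.

```c
#include <assert.h>
#include <stddef.h>

#define LOG_MAX_INDICES 8   /* hypothetical cap on indices recorded per event */

/* Hypothetical stand-in for a request; not the real Open MPI type. */
typedef struct demo_req { int complete; } demo_req_t;

typedef int (*waitsome_fn_t)(size_t count, demo_req_t *reqs,
                             int *outcount, int *indices);

/* One recorded waitsome outcome: how many requests finished, and which. */
typedef struct waitsome_event {
    int outcount;
    int indices[LOG_MAX_INDICES];
} waitsome_event_t;

/* Stand-in default: reports every request already marked complete.
 * (A real waitsome would block until at least one request completes.) */
static int default_waitsome(size_t count, demo_req_t *reqs,
                            int *outcount, int *indices)
{
    int n = 0;
    for (size_t i = 0; i < count; i++) {
        if (reqs[i].complete) {
            indices[n++] = (int)i;
        }
    }
    *outcount = n;
    return 0;
}

static waitsome_fn_t previous_waitsome = default_waitsome;
static waitsome_event_t last_event;

/* Wrapper: call the previous implementation, then record its outcome. */
int logging_waitsome(size_t count, demo_req_t *reqs,
                     int *outcount, int *indices)
{
    int rc = previous_waitsome(count, reqs, outcount, indices);
    assert(*outcount <= LOG_MAX_INDICES);
    last_event.outcount = *outcount;
    for (int i = 0; i < *outcount; i++) {
        last_event.indices[i] = indices[i];
    }
    return rc;
}

/* Replay: hand back the recorded outcome instead of polling again. */
int replay_waitsome(int *outcount, int *indices)
{
    *outcount = last_event.outcount;
    for (int i = 0; i < last_event.outcount; i++) {
        indices[i] = last_event.indices[i];
    }
    return 0;
}
```

A per-request completion callback would see each completion individually, but not which waitsome call reported it or alongside which other requests; wrapping the call itself captures that grouping.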
>
> Aurelien
>
> On Dec 3, 2007, at 09:02, Jeff Squyres wrote:
>
>> Aurelien --
>>
>> I confess to forgetting some of the Paris discussion. :-\
>>
>> Could the same effect also be achieved by having a completion
>> callback function pointer on the request? Or do you need more than
>> that?
>>
>>
>> On Nov 29, 2007, at 6:37 PM, Aurelien Bouteiller wrote:
>>
>>> This patch introduces customisable wait/test for requests as
>>> discussed at the face-to-face ompi meeting in Paris.
>>>
>>> A new global structure (ompi_request_functions) holding all the
>>> pointers to the wait/test functions has been added.
>>> ompi_request_wait* and ompi_request_test* have been #defined to be
>>> replaced by the matching member of ompi_request_functions (e.g.
>>> ompi_request_functions.req_wait). The default implementations of
>>> the wait/test functions have been renamed from ompi_request_% to
>>> ompi_request_default_%. Those functions are the static initializers
>>> of the ompi_request_functions structure.
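
As a rough illustration of the layout described above, here is a minimal sketch of a statically initialized table of function pointers with macros redirecting the public names to it. It is not the patch itself: all sketch_* names and the two-member table are assumptions made for brevity, and the default implementations are trivial stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real request type. */
typedef struct sketch_request { int complete; } sketch_request_t;

typedef int (*sketch_wait_fn_t)(sketch_request_t *req);
typedef int (*sketch_test_fn_t)(sketch_request_t *req, int *completed);

/* Table of wait/test entry points (only two members in this sketch). */
typedef struct sketch_request_fns {
    sketch_wait_fn_t req_wait;
    sketch_test_fn_t req_test;
} sketch_request_fns_t;

/* Default implementations; stand-ins that only flip or read a flag. */
static int sketch_request_default_wait(sketch_request_t *req)
{
    assert(req != NULL);
    req->complete = 1;          /* pretend the request has finished */
    return 0;
}

static int sketch_request_default_test(sketch_request_t *req, int *completed)
{
    assert(req != NULL && completed != NULL);
    *completed = req->complete;
    return 0;
}

/* Global table, initialized at compile time with the defaults. */
sketch_request_fns_t sketch_request_functions = {
    sketch_request_default_wait,
    sketch_request_default_test
};

/* Existing call sites keep their spelling; the macros turn each call
 * into an indirect call through the table. */
#define sketch_request_wait (sketch_request_functions.req_wait)
#define sketch_request_test (sketch_request_functions.req_test)
```

With this arrangement a call written as sketch_request_wait(&r) compiles to one load of the table address followed by an indirect call.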
>>>
>>> To modify the defaults, a component 1) copies the
>>> ompi_request_functions structure (the type ompi_request_fns_t can be
>>> used to declare a suitable variable), and 2) changes some of the
>>> functions according to its needs. This is best done at MPI_Init time,
>>> when there are no threads. Should this component be unloaded, it has
>>> to restore the defaults. The ompi_request_default_* functions should
>>> never be called directly anywhere in the code. If a component needs
>>> to access the previously defined implementation of wait, it should
>>> call the function in its local copy. Component implementors should
>>> keep in mind that another component might have already changed the
>>> defaults and needs to be called.
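
The copy, replace, chain and restore steps above could look roughly like the following. This is a sketch under assumptions, not the component code from the patch: the demo_* and logger_* names are hypothetical, the table has a single member, and "restore" here means putting back the copy saved at load time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real Open MPI types. */
typedef struct demo_request { int complete; } demo_request_t;
typedef int (*demo_wait_fn_t)(demo_request_t *req);
typedef struct demo_request_fns { demo_wait_fn_t req_wait; } demo_request_fns_t;

static int demo_default_wait(demo_request_t *req)
{
    req->complete = 1;          /* pretend the request has finished */
    return 0;
}

/* Global table, statically initialized with the default. */
demo_request_fns_t demo_request_functions = { demo_default_wait };

/* Component state: local copy of the table as found at load time. */
static demo_request_fns_t logger_saved_fns;
static int logger_wait_count = 0;

/* Replacement wait: do the component's work, then chain to whatever
 * was installed before (the default, or another component's function). */
static int logger_wait(demo_request_t *req)
{
    assert(logger_saved_fns.req_wait != NULL);
    logger_wait_count++;
    return logger_saved_fns.req_wait(req);
}

/* Intended to run during initialization, before any threads exist. */
void logger_component_open(void)
{
    logger_saved_fns = demo_request_functions;       /* 1) copy    */
    demo_request_functions.req_wait = logger_wait;   /* 2) replace */
}

/* On unload, put back what was there when the component loaded. */
void logger_component_close(void)
{
    demo_request_functions = logger_saved_fns;
}

int logger_waits_seen(void)
{
    return logger_wait_count;
}
```

Chaining through the saved copy rather than calling the default directly is what keeps a previously installed component in the call path.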
>>>
>>> NetPipe -a (async recv mode) shows no measurable overhead. Here
>>> follows the "diff -y" between the original and modified ompi
>>> assembly code from ompi/mpi/c/wait.c. The only significant
>>> difference is an extra movl to load the address of the
>>> ompi_request_functions structure in eax, which explains why
>>> there is no measurable cost on latency.
>>>
>>> ORIGINAL:
>>>
>>> L2:
>>>     movl  L_ompi_request_null$non_lazy_ptr-"L00000000001$pb"(%ebx), %eax
>>>     cmpl  %eax, (%edi)
>>>     je    L18
>>>     movl  %esi, 4(%esp)
>>>     movl  %edi, (%esp)
>>>     call  L_ompi_request_wait$stub
>>>
>>> MODIFIED (">" marks the added line, "|" the changed line):
>>>
>>> L2:
>>>     movl  L_ompi_request_null$non_lazy_ptr-"L00000000001$pb"(%ebx), %eax
>>>     cmpl  %eax, (%edi)
>>>     je    L18
>>>   > movl  L_ompi_request_functions$non_lazy_ptr-"L00000000001$pb"(%ebx), %eax
>>>     movl  %esi, 4(%esp)
>>>     movl  %edi, (%esp)
>>>   | call  *16(%eax)
>>>
>>> Here is the patch for those who want to try themselves.
>>>
>>> <custom_request_wait_and_test.patch>
>>>
>>>
>>> If I receive comments outlining the need, thread-safe accessors
>>> could be added to allow components to change the functions at
>>> any time during execution, and not only during MPI_Init/Finalize.
>>> Please make noise if you find this useful.
>>> If the comments do not suggest extra work, I expect this code to be
>>> committed to trunk next week.
>>>
>>> Aurelien
>>>
>>> On Oct 8, 2007, at 06:01, Aurelien Bouteiller wrote:
>>>
>>>> For message logging purposes, we need to interface with wait_any,
>>>> wait_some, test, test_any, test_some, test_all. It is not possible
>>>> to use PMPI for this purpose. During the face-to-face meeting in
>>>> Paris (5-12 October 2007) we discussed this issue and came to the
>>>> conclusion that the best way to achieve this is to replace direct
>>>> calls to ompi_request_wait* and test* by indirect calls (the same
>>>> way as PML send, recv, etc).
>>>>
>>>> The basic idea is to declare a static structure containing the 8
>>>> pointers to all the functions. This structure is initialized at
>>>> compilation time with the current basic wait/test functions. Before
>>>> the end of MPI_Init, any component might replace the basics with
>>>> specialized functions.
>>>>
>>>> The expected cost is less than .01us latency according to
>>>> preliminary tests. The method is consistent with the way we call
>>>> pml send/recv. The mechanism could be used later for stripping out
>>>> grequest from the critical path when they are not used.
>>>>
>>>> --
>>>> Aurelien Bouteiller, PhD
>>>> Innovative Computing Laboratory - MPI group
>>>> +1 865 974 6321
>>>> 1122 Volunteer Boulevard
>>>> Claxton Education Building Suite 350
>>>> Knoxville, TN 37996
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>> --
>>> Dr. Aurelien Bouteiller, Sr. Research Associate
>>> Innovative Computing Laboratory - MPI group
>>> +1 865 974 6321
>>> 1122 Volunteer Boulevard
>>> Claxton Education Building Suite 350
>>> Knoxville, TN 37996
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems

-- 
Jeff Squyres
Cisco Systems