Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Indirect calls to wait* and test*
From: Aurelien Bouteiller (bouteill_at_[hidden])
Date: 2007-12-03 12:38:51


You asked the exact same question in Paris, so I bet you don't
remember the discussions :)

We cannot rely only on a callback on request completion (there is
already one: req_free is called any time a request completes and can
be used for that purpose). We need to know whether the request has been
completed in a wait, waitany or waitsome, as we have something
different to do depending on the context. More than that, we also need
to know how many and which requests finished in a particular waitsome/
any/test, and be able to replay it. I have looked for several
workarounds to avoid the proposed patch, and none I could think of
would do the job (thread condition between bottom and top layer
prevents holding requests from the bottom layer deterministically).
However, now that I have the implementation and can see both the
performance and the assembly diff, I am pleased by how little
difference it makes. I even feel confident that this approach is
less harmful to performance than an extra callback.
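
To illustrate the point about waitsome, here is a minimal sketch in C of a wrapper that records, per call, how many and which requests completed together, so the same outcome could later be replayed. It is not taken from the patch: the types, the toy_waitsome stand-in, the fixed-size log and every name in it are hypothetical, and requests are reduced to an array of "done" flags. A per-request completion callback would see each completion separately and could not tell that they were returned by one waitsome call.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS  64   /* hypothetical log capacity */
#define MAX_INDICES  8   /* hypothetical per-call limit */

/* Which kind of call reported the completions (hypothetical enum). */
typedef enum { CALL_WAIT, CALL_WAITANY, CALL_WAITSOME } call_kind_t;

/* One log entry: the call context plus how many and which requests finished. */
typedef struct {
    call_kind_t kind;
    int outcount;
    int indices[MAX_INDICES];
} completion_event_t;

static completion_event_t event_log[MAX_EVENTS];
static size_t event_count;

typedef int (*waitsome_fn_t)(size_t count, int *done_flags,
                             int *outcount, int *indices);

/* Toy stand-in for the underlying waitsome: reports every request whose
 * flag is set, then clears the flag. */
static int toy_waitsome(size_t count, int *done_flags,
                        int *outcount, int *indices)
{
    int n = 0;
    for (size_t i = 0; i < count; i++) {
        if (done_flags[i]) {
            indices[n++] = (int)i;
            done_flags[i] = 0;
        }
    }
    *outcount = n;
    return 0;
}

static waitsome_fn_t underlying_waitsome = toy_waitsome;

/* Wrapper installed in place of waitsome: calls the underlying function,
 * then records the whole outcome of this one call as a single event. */
int logging_waitsome(size_t count, int *done_flags,
                     int *outcount, int *indices)
{
    assert(outcount != NULL && indices != NULL);
    int rc = underlying_waitsome(count, done_flags, outcount, indices);
    if (rc == 0 && event_count < MAX_EVENTS && *outcount <= MAX_INDICES) {
        completion_event_t *ev = &event_log[event_count++];
        ev->kind = CALL_WAITSOME;
        ev->outcount = *outcount;
        for (int i = 0; i < *outcount; i++) {
            ev->indices[i] = indices[i];
        }
    }
    return rc;
}

/* Returns the i-th logged event, or NULL if there is none. */
const completion_event_t *logged_event(size_t i)
{
    return i < event_count ? &event_log[i] : NULL;
}
```

A replay pass would walk the log in order and force each call to report the recorded indices. That part is omitted here because it depends on details not discussed in this thread.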

Aurelien

On Dec 3, 2007, at 09:02, Jeff Squyres wrote:

> Aurelien --
>
> I confess to forgetting some of the Paris discussion. :-\
>
> Could the same effect as these pointers also be achieved by having a
> completion callback function pointer on the request? Or do you need
> more than that?
>
>
> On Nov 29, 2007, at 6:37 PM, Aurelien Bouteiller wrote:
>
>> This patch introduces customisable wait/test for requests as
>> discussed at the face-to-face ompi meeting in Paris.
>>
>> A new global structure (ompi_request_functions) holding all the
>> pointers to the wait/test functions has been added.
>> ompi_request_wait* and ompi_request_test* have been #defined to be
>> replaced by the corresponding ompi_request_functions member (e.g.
>> ompi_request_functions.req_wait). The names of the default
>> implementations of the wait/test functions have been changed
>> from ompi_request_% to ompi_request_default_%. Those functions are
>> the static initializers of the ompi_request_functions structure.
>>
>> To modify the defaults, a component 1) copies the
>> ompi_request_functions structure (the type ompi_request_fns_t can be
>> used to declare a suitable variable), and 2) changes some of the
>> functions according to its needs. This is best done at MPI_Init time,
>> when there are no threads. Should this component be unloaded, it has
>> to restore the defaults. The ompi_request_default_* functions should
>> never be called directly anywhere in the code. If a component needs
>> to access the previously defined implementation of wait, it should
>> call its local copy of the function. Component implementors should
>> keep in mind that another component might have already changed the
>> defaults and needs to be called.
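
The override procedure described above can be sketched in C roughly as follows. This is a minimal sketch and not the patch itself: the request type, the function signatures, the two-entry table and all names (request_functions, component_open, logging_wait, and so on) are simplified, hypothetical stand-ins for ompi_request_functions, ompi_request_fns_t and the real wait/test functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified request type and signatures. */
typedef struct request { int complete; } request_t;

typedef int (*req_wait_fn_t)(request_t *req);
typedef int (*req_test_fn_t)(request_t *req, int *completed);

/* Table of function pointers (stand-in for ompi_request_fns_t). */
typedef struct {
    req_wait_fn_t req_wait;
    req_test_fn_t req_test;
} request_fns_t;

/* Default implementations: used only as static initializers,
 * never called directly by the rest of the code. */
static int request_default_wait(request_t *req)
{
    req->complete = 1;
    return 0;
}

static int request_default_test(request_t *req, int *completed)
{
    *completed = req->complete;
    return 0;
}

/* Global table, initialized at compile time with the defaults. */
request_fns_t request_functions = {
    request_default_wait,
    request_default_test
};

/* Callers keep writing request_wait(...); the call is now indirect. */
#define request_wait (request_functions.req_wait)
#define request_test (request_functions.req_test)

/* --- A component that interposes on wait --- */

static request_fns_t saved_fns;  /* local copy of what was installed before */
static int wait_calls;

static int logging_wait(request_t *req)
{
    wait_calls++;
    /* Chain to the previously installed implementation, which may be
     * the default or another component's override. */
    assert(saved_fns.req_wait != NULL);
    return saved_fns.req_wait(req);
}

/* Call at init time, while no other thread can be using the table. */
void component_open(void)
{
    saved_fns = request_functions;              /* 1) copy the structure */
    request_functions.req_wait = logging_wait;  /* 2) change what is needed */
}

/* Call when the component is unloaded: put back what was there. */
void component_close(void)
{
    request_functions = saved_fns;
}

int component_wait_calls(void)
{
    return wait_calls;
}
```

Because the component calls through its saved copy rather than through request_default_wait, a chain of several components keeps working as long as they are closed in the reverse order of opening.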
>>
>> Performance impact on NetPipe -a (async recv mode) does not show
>> measurable overhead. Here follows the "diff -y" between original and
>> modified ompi assembly code from ompi/mpi/c/wait.c. The only
>> significant difference is an extra movl to load the address of the
>> ompi_request_functions structure in eax. This obviously explains why
>> there is no measurable cost on latency.
>>
>> ORIGINAL:
>>
>>   L2:
>>     movl  L_ompi_request_null$non_lazy_ptr-"L00000000001$pb"(%ebx), %eax
>>     cmpl  %eax, (%edi)
>>     je    L18
>>     movl  %esi, 4(%esp)
>>     movl  %edi, (%esp)
>>     call  L_ompi_request_wait$stub
>>
>> MODIFIED:
>>
>>   L2:
>>     movl  L_ompi_request_null$non_lazy_ptr-"L00000000001$pb"(%ebx), %eax
>>     cmpl  %eax, (%edi)
>>     je    L18
>>     movl  L_ompi_request_functions$non_lazy_ptr-"L00000000001$pb"(%ebx), %eax
>>     movl  %esi, 4(%esp)
>>     movl  %edi, (%esp)
>>     call  *16(%eax)
>>
>> Here is the patch for those who want to try themselves.
>>
>> <custom_request_wait_and_test.patch>
>>
>>
>> If I receive comments outlining the need, thread-safe accessors
>> could be added to allow components to change the functions at
>> any time during execution and not only during MPI_Init/Finalize.
>> Please make noise if you find this useful.
>> If the comments do not suggest extra work, I expect this code to be
>> committed to trunk next week.
>>
>> Aurelien
>>
>> On Oct 8, 2007, at 06:01, Aurelien Bouteiller wrote:
>>
>>> For message logging purposes, we need to interface with wait_any,
>>> wait_some, test, test_any, test_some, test_all. It is not possible
>>> to use PMPI for this purpose. During the face-to-face meeting in
>>> Paris (5-12 October 2007) we discussed this issue and came to the
>>> conclusion that the best way to achieve this is to replace direct
>>> calls to ompi_request_wait* and test* by indirect calls (the same
>>> way as PML send, recv, etc).
>>>
>>> The basic idea is to declare a static structure containing the 8
>>> pointers to all the functions. This structure is initialized at
>>> compilation time with the current basic wait/test functions. Before
>>> the end of MPI_Init, any component might replace the basics with
>>> specialized functions.
>>>
>>> The expected cost is less than .01us of latency according to
>>> preliminary tests. The method is consistent with the way we call
>>> pml send/recv. The mechanism could be used later for stripping
>>> grequest out of the critical path when it is not used.
>>>
>>> --
>>> Aurelien Bouteiller, PhD
>>> Innovative Computing Laboratory - MPI group
>>> +1 865 974 6321
>>> 1122 Volunteer Boulevard
>>> Claxton Education Building Suite 350
>>> Knoxville, TN 37996
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>> --
>> Dr. Aurelien Bouteiller, Sr. Research Associate
>> Innovative Computing Laboratory - MPI group
>> +1 865 974 6321
>> 1122 Volunteer Boulevard
>> Claxton Education Building Suite 350
>> Knoxville, TN 37996
>
>
> --
> Jeff Squyres
> Cisco Systems