Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] Potential ob1 bug
From: George Bosilca (bosilca_at_[hidden])
Date: 2012-05-03 09:22:33


Nathan,

You're right, when we loop trying to restart a failed request we must reset the convertor. However:
1. the position in this case is always zero, so we don't have to save the previous position in order to restore it.
2. all cases must be protected, not only the mca_pml_ob1_send_request_start_prepare.

Therefore there are 2 ways of doing this without affecting performance. We remove the initialization of the convertor in the send_init function, and force the convertor position to zero on every call to mca_pml_ob1_send_request_start_btl. Or, we reset the convertor position to zero every time we withdraw a failed request from the pending queue in order to restart it (in mca_pml_ob1_progress and in mca_pml_ob1_send_request_process_pending).

However, with the multiplication of send strategies in the OB1 (CUDA as an example) someone else should take a look in that code to apply the same logic.

Please find below a patch implementing the second method.

  george.

Index: ompi/mca/pml/ob1/pml_ob1_progress.c
===================================================================
--- ompi/mca/pml/ob1/pml_ob1_progress.c (revision 26381)
+++ ompi/mca/pml/ob1/pml_ob1_progress.c (working copy)
@@ -53,6 +53,7 @@
             completed_requests++;
             break;
         case MCA_PML_OB1_SEND_PENDING_START:
+ MCA_PML_OB1_SEND_REQUEST_RESET(sendreq);
             endpoint = sendreq->req_endpoint;
             send_succedded = false;
             for(j = 0; j < (int)mca_bml_base_btl_array_get_size(&endpoint->btl_eager); j++) {
Index: ompi/mca/pml/ob1/pml_ob1_sendreq.c
===================================================================
--- ompi/mca/pml/ob1/pml_ob1_sendreq.c (revision 26381)
+++ ompi/mca/pml/ob1/pml_ob1_sendreq.c (working copy)
@@ -70,7 +70,8 @@
                 add_request_to_send_pending(sendreq,
                         MCA_PML_OB1_SEND_PENDING_START, true);
             } else {
- rc = mca_pml_ob1_send_request_start_btl(sendreq, send_dst);
+ MCA_PML_OB1_SEND_REQUEST_RESET(sendreq);
+ rc = mca_pml_ob1_send_request_start_btl(sendreq, send_dst);
                 if (OMPI_ERR_OUT_OF_RESOURCE == rc) {
                     /* No more resources on this btl so prepend to the pending
                      * list to minimize reordering and give up for now. */
Index: ompi/mca/pml/ob1/pml_ob1_sendreq.h
===================================================================
--- ompi/mca/pml/ob1/pml_ob1_sendreq.h (revision 26381)
+++ ompi/mca/pml/ob1/pml_ob1_sendreq.h (working copy)
@@ -159,6 +159,13 @@
         (sendreq)->req_recv.pval = NULL; \
     }
 
+#define MCA_PML_OB1_SEND_REQUEST_RESET(sendreq) \
+{ \
+ size_t _position = 0; \
+ opal_convertor_set_position(&sendreq->req_send.req_base.req_convertor, \
+ &_position); \
+ assert( 0 == _position ); \
+}
 
 static inline void mca_pml_ob1_free_rdma_resources(mca_pml_ob1_send_request_t* sendreq)
 {

On May 1, 2012, at 10:55 , Hjelm, Nathan T wrote:

> Ran across a problem in a failure path of start_prepare in ob1. If prepare_src succeed but send fails the send request convertor needs to be rolled back to the correct position. Can someone with more knowledge of ob1 check if this is indeed an error. Patch is below.
>
> -Nathan
>
> diff --git a/ompi/mca/pml/ob1/pml_ob1_sendreq.c b/ompi/mca/pml/ob1/pml_ob1_sendreq.c
> index 2a8ac03..5505918 100644
> --- a/ompi/mca/pml/ob1/pml_ob1_sendreq.c
> +++ b/ompi/mca/pml/ob1/pml_ob1_sendreq.c
> @@ -570,6 +570,7 @@ int mca_pml_ob1_send_request_start_prepare( mca_pml_ob1_send_request_t* sendreq,
> mca_bml_base_btl_t* bml_btl,
> size_t size )
> {
> + size_t old_position = sendreq->req_send.req_base.req_convertor.bConverted;
> mca_btl_base_descriptor_t* des;
> mca_btl_base_segment_t* segment;
> mca_pml_ob1_hdr_t* hdr;
> @@ -614,6 +615,9 @@ int mca_pml_ob1_send_request_start_prepare( mca_pml_ob1_send_request_t* sendreq,
> return OMPI_SUCCESS;
> }
> mca_bml_base_free(bml_btl, des );
> +
> + opal_convertor_set_position(&sendreq->req_send.req_base.req_convertor,
> + &old_position);
> return rc;
> }
>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel