Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: ob1: fallback on put/send on rget failure
From: Shamis, Pavel (shamisp_at_[hidden])
Date: 2012-03-15 17:14:22


Nathan,

I did not get any patch.

Regards,

Pavel (Pasha) Shamis

---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory

On Mar 15, 2012, at 5:07 PM, Nathan Hjelm wrote:
> 
> 
> What: Update ob1 to do the following:
>        - fall back on send after rdma_put_retries_limit failures of prepare_dst
>        - fall back on put (single, non-pipelined) if the BTL returns OMPI_ERR_NOT_AVAILABLE on a get transaction.
> 
> When: Timeout in about one week (Mar 22)
> 
> Why: Two reasons:
>        - Some BTLs (ugni) need to switch to put for certain transactions. It makes sense to make this switch at the PML level.
>        - If prepare_dst repeatedly fails for a get transaction, we currently deadlock. We can avoid the deadlock (in most cases) by switching to send for the transaction.
> 
> Please take a look at the attached patch. Feedback and constructive criticism are needed!
> 
> -Nathan Hjelm
> HPC-3, LANL
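
For readers without the patch, the fallback policy described in the RFC can be sketched roughly as follows. This is only an illustration of the decision logic, not the ob1 code: the types, enum values, and the function sketch_choose_protocol are hypothetical, the outcomes of successive get attempts are passed in as an array rather than produced by a real BTL, and retries_limit merely stands in for rdma_put_retries_limit.

```c
#include <stddef.h>

/* Hypothetical names for illustration only; these are not ob1's types. */
typedef enum {
    SKETCH_PROTO_GET,   /* keep using (or retrying) the get protocol */
    SKETCH_PROTO_PUT,   /* single, non-pipelined put */
    SKETCH_PROTO_SEND   /* plain send */
} sketch_proto_t;

/* Hypothetical stand-ins for the outcome of one get attempt. */
typedef enum {
    SKETCH_OK,              /* get was started successfully */
    SKETCH_NOT_AVAILABLE,   /* models the BTL returning OMPI_ERR_NOT_AVAILABLE */
    SKETCH_PREPARE_FAILED   /* models a prepare_dst failure */
} sketch_status_t;

/*
 * Walk the outcomes of successive get attempts and decide which protocol
 * the transaction ends up using:
 *   - a successful attempt keeps get,
 *   - "not available" switches to put immediately,
 *   - retries_limit prepare failures switch to send instead of retrying
 *     forever (the deadlock case described in the RFC).
 * If the attempts run out before any rule fires, the transaction is still
 * on get.
 */
sketch_proto_t sketch_choose_protocol(const sketch_status_t *attempts,
                                      size_t n_attempts,
                                      unsigned retries_limit)
{
    unsigned failures = 0;

    for (size_t i = 0; i < n_attempts; ++i) {
        switch (attempts[i]) {
        case SKETCH_OK:
            return SKETCH_PROTO_GET;
        case SKETCH_NOT_AVAILABLE:
            return SKETCH_PROTO_PUT;
        case SKETCH_PREPARE_FAILED:
            if (++failures >= retries_limit) {
                return SKETCH_PROTO_SEND;
            }
            break;
        }
    }
    return SKETCH_PROTO_GET;
}
```

With a limit of 3, three consecutive SKETCH_PREPARE_FAILED outcomes select send, while a single SKETCH_NOT_AVAILABLE selects put regardless of the limit. Making the choice in one place mirrors the RFC's argument for handling the switch at the PML level rather than inside each BTL.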