Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [EXTERNAL] Re: RFC: ob1: fallback on put/send on rget failure
From: Barrett, Brian W (bwbarre_at_[hidden])
Date: 2012-03-19 09:44:32

I'm not sure I'm the best one to comment on OB1 these days, but I didn't
see anything obviously wrong.


On 3/19/12 9:32 AM, "Jeffrey Squyres" <jsquyres_at_[hidden]> wrote:

>George / Brian --
>Can you guys comment on this patch?
>On Mar 15, 2012, at 5:07 PM, Nathan Hjelm wrote:
>> What: Update ob1 to do the following:
>> - fall back on send after rdma_put_retries_limit failures of
>> - fall back on put (single non-pipelined) if the btl returns
>>   OMPI_ERR_NOT_AVAILABLE on a get transaction.
>> When: Timeout in about one week (Mar 22)
>> Why: Two reasons:
>> - Some btls (ugni) need to switch to put for certain
>>   transactions. It makes sense to make this switch at the pml level.
>> - If prepare_dst repeatedly fails for a get transaction we
>>   currently deadlock. We can avoid the deadlock (in most cases) by
>>   switching to send for the transaction.
>> Please take a look at the attached patch. Feedback and constructive
>> criticism are needed!
>> -Nathan Hjelm
>> HPC-3,
>Jeff Squyres

  Brian W. Barrett
  Dept. 1423: Scalable System Software
  Sandia National Laboratories
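
For readers skimming the archive, the fallback policy described in the quoted RFC can be sketched roughly as below. This is not taken from the attached patch. The type names, the status codes, and the function handle_result are hypothetical stand-ins, not Open MPI definitions. The first bullet of the RFC is truncated in the archive, so what counts as a "failure" is an assumption here: any non-success result other than the get-specific "not available" case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real Open MPI codes. */
enum status { OK = 0, ERR_NOT_AVAILABLE = -1, ERR_OUT_OF_RESOURCE = -2 };

/* Protocols a long-message transfer may use, in order of preference. */
enum protocol { PROTO_RGET, PROTO_PUT, PROTO_SEND };

struct transfer {
    enum protocol proto; /* protocol to use for the next attempt */
    int retries;         /* failed attempts with the current protocol */
};

/*
 * Update the transfer after one attempt returned rc.
 * Returns 1 when the transfer succeeded, 0 when it should be retried
 * (possibly with a different protocol).
 */
int handle_result(struct transfer *t, enum status rc, int retries_limit)
{
    assert(t != NULL);

    if (rc == OK) {
        return 1;
    }

    /* The transport cannot do this get: switch to put right away. */
    if (t->proto == PROTO_RGET && rc == ERR_NOT_AVAILABLE) {
        t->proto = PROTO_PUT;
        t->retries = 0;
        return 0;
    }

    /* Assumed rule: after retries_limit failures, give up on RDMA
     * and fall back to plain send instead of retrying forever. */
    if (t->proto != PROTO_SEND && ++t->retries >= retries_limit) {
        t->proto = PROTO_SEND;
        t->retries = 0;
    }
    return 0;
}
```

A caller would loop: attempt the transfer with t.proto, pass the result to handle_result, and stop once it returns 1. Bounding the retries is what avoids the deadlock mentioned under "Why": a transaction that keeps failing eventually moves to send rather than waiting indefinitely.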