Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: ob1: fallback on put/send on rget failure
From: Jeffrey Squyres (jsquyres_at_[hidden])
Date: 2012-03-19 09:32:28


George / Brian --

Can you guys comment on this patch?

On Mar 15, 2012, at 5:07 PM, Nathan Hjelm wrote:

> What: Update ob1 to do the following:
> - fall back on send after rdma_put_retries_limit failures of prepare_dst
> - fall back on put (single, non-pipelined) if the BTL returns OMPI_ERR_NOT_AVAILABLE on a get transaction
>
> When: Timeout in about one week (Mar 22)
>
> Why: Two reasons:
> - Some BTLs (ugni) need to switch to put for certain transactions. It makes sense to make this switch at the PML level.
> - If prepare_dst repeatedly fails for a get transaction, we currently deadlock. We can avoid the deadlock (in most cases) by switching to send for the transaction.
>
> Please take a look at the attached patch. Feedback and constructive criticism are needed!
>
> -Nathan Hjelm
> HPC-3, LANL
> <ompi_trunk_ob1_get_fallback.patch.gz>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
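
For readers without the attachment, the decision logic described in the RFC might be sketched roughly as follows. This is not the patch itself and does not use the real ob1 or BTL data structures: every name beginning with sketch_ or SKETCH_, along with the mock callbacks, is hypothetical, and the error values are placeholders rather than the real OMPI_* constants. It also retries in a tight loop for brevity, whereas a real PML would presumably retry on later progress calls.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder return codes (assumed); not the real OMPI_* values. */
enum {
    SKETCH_SUCCESS             =  0,
    SKETCH_ERR_OUT_OF_RESOURCE = -1,
    SKETCH_ERR_NOT_AVAILABLE   = -2  /* stands in for OMPI_ERR_NOT_AVAILABLE */
};

/* Protocol finally used for the transfer. */
typedef enum { SKETCH_PROTO_GET, SKETCH_PROTO_PUT, SKETCH_PROTO_SEND } sketch_proto_t;

/* Hypothetical mock of a BTL: callbacks plus opaque state. */
typedef struct {
    int (*prepare_dst)(void *ctx);  /* prepare the destination buffer */
    int (*start_get)(void *ctx);    /* post the get operation */
    void *ctx;
} sketch_btl_t;

/* Choose a protocol for a transaction that would normally use get. */
sketch_proto_t sketch_choose_protocol(const sketch_btl_t *btl, int retries_limit)
{
    int failures = 0;

    assert(btl != NULL && retries_limit > 0);

    /* Retry prepare_dst; after retries_limit failures, give up on RDMA
     * and fall back on send instead of retrying forever. */
    while (btl->prepare_dst(btl->ctx) != SKETCH_SUCCESS) {
        if (++failures >= retries_limit) {
            return SKETCH_PROTO_SEND;
        }
    }

    /* The BTL may refuse a get for this transaction; fall back on put.
     * Other error codes are not modelled in this sketch. */
    if (btl->start_get(btl->ctx) == SKETCH_ERR_NOT_AVAILABLE) {
        return SKETCH_PROTO_PUT;
    }
    return SKETCH_PROTO_GET;
}

/* Mock state: how often prepare_dst fails, and what get returns. */
typedef struct {
    int prepare_failures_left;
    int get_rc;
} sketch_mock_t;

static int sketch_mock_prepare_dst(void *ctx)
{
    sketch_mock_t *m = (sketch_mock_t *)ctx;
    if (m->prepare_failures_left > 0) {
        m->prepare_failures_left--;
        return SKETCH_ERR_OUT_OF_RESOURCE;
    }
    return SKETCH_SUCCESS;
}

static int sketch_mock_start_get(void *ctx)
{
    return ((sketch_mock_t *)ctx)->get_rc;
}

/* Convenience wrapper: run the decision against the mock BTL. */
sketch_proto_t sketch_run(int prepare_failures, int get_rc, int retries_limit)
{
    sketch_mock_t mock = { prepare_failures, get_rc };
    sketch_btl_t btl = { sketch_mock_prepare_dst, sketch_mock_start_get, &mock };
    return sketch_choose_protocol(&btl, retries_limit);
}
```

With this mock, sketch_run(5, SKETCH_SUCCESS, 5) ends in send, while sketch_run(0, SKETCH_ERR_NOT_AVAILABLE, 5) ends in put. Making the decision in one place mirrors the RFC's argument that the switch belongs at the PML level rather than inside each BTL.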

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/