
Open MPI Development Mailing List Archives


Subject: [OMPI devel] RFC: ob1: fallback on put/send on rget failure
From: Nathan Hjelm (hjelmn_at_[hidden])
Date: 2012-03-15 17:07:49


What: Update ob1 to do the following:
        - fall back to send after rdma_put_retries_limit failures of prepare_dst
        - fall back to put (single, non-pipelined) if the BTL returns OMPI_ERR_NOT_AVAILABLE on a get transaction

When: Timeout in about one week (Mar 22)

Why: Two reasons:
        - Some BTLs (ugni) need to switch to put for certain transactions. It makes sense to make this switch at the PML level.
        - If prepare_dst repeatedly fails for a get transaction, we currently deadlock. We can avoid the deadlock (in most cases) by switching to send for the transaction.
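
To make the proposed decision order concrete, here is a rough, self-contained sketch in C. It is not taken from the patch and does not use the real ob1 or BTL interfaces: fake_btl_t, fake_prepare_dst, choose_protocol, and the SKETCH_* status codes are all hypothetical stand-ins. I am also assuming that retries can be modeled as a simple loop, whereas a real PML would more likely requeue the request and retry from its progress path.

```c
#include <assert.h>

/* Hypothetical status codes for this sketch only; they are NOT the real
 * OMPI_* values. */
enum {
    SKETCH_SUCCESS = 0,
    SKETCH_ERR_OUT_OF_RESOURCE = -1,
    SKETCH_ERR_NOT_AVAILABLE = -2
};

/* Protocol finally used for one rendezvous transaction. */
typedef enum { PROTO_GET, PROTO_PUT, PROTO_SEND } proto_t;

/* Hypothetical stand-in for a BTL: it only records how it will behave. */
typedef struct {
    int get_status;       /* status its get operation would return */
    int prepare_failures; /* failures of prepare_dst before it succeeds */
} fake_btl_t;

/* Simulated prepare_dst: fails while prepare_failures is still positive. */
static int fake_prepare_dst(fake_btl_t *btl)
{
    if (btl->prepare_failures > 0) {
        btl->prepare_failures--;
        return SKETCH_ERR_OUT_OF_RESOURCE;
    }
    return SKETCH_SUCCESS;
}

/* Pick the protocol following the two fallback rules in this RFC. */
static proto_t choose_protocol(fake_btl_t *btl, int retries_limit)
{
    int failures = 0;

    assert(retries_limit > 0);

    /* Rule 1: after retries_limit failures of prepare_dst, give up on
     * RDMA for this transaction and fall back to send. */
    while (fake_prepare_dst(btl) != SKETCH_SUCCESS) {
        if (++failures >= retries_limit) {
            return PROTO_SEND;
        }
    }

    /* Rule 2: if the BTL says get is not available for this
     * transaction, fall back to a single, non-pipelined put. */
    if (btl->get_status == SKETCH_ERR_NOT_AVAILABLE) {
        return PROTO_PUT;
    }

    return PROTO_GET;
}
```

With a limit of 3 in this sketch, a BTL whose prepare_dst fails twice still ends up on get, one that fails three or more times ends up on send, and one that prepares successfully but reports "not available" on get ends up on put.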

Please take a look at the attached patch. Feedback and constructive criticism are welcome!

-Nathan Hjelm
HPC-3, LANL